Bug 12635 - Tail calls inside continuations cause StackOverflow
Summary: Tail calls inside continuations cause StackOverflow
Status: NEW
Alias: None
Product: Runtime
Classification: Mono
Component: JIT (show other bugs)
Version: unspecified
Hardware: PC Mac OS
: Normal normal
Target Milestone: ---
Assignee: Bugzilla
Depends on:
Reported: 2013-06-11 13:49 UTC by Natallie
Modified: 2016-02-17 16:26 UTC (History)
10 users (show)

See Also:
Is this bug a regression?: ---
Last known good build:


Description Natallie 2013-06-11 13:49:31 UTC
The following sample program fails with StackOverflow:

let rec f x cont =
    if x < 0 then 0
    elif x = 0 then cont x
    else f (x-1) (fun x -> f x cont)

f 250000 id

Rewriting the functions so that they all have a single parameter doesn't help here.
Comment 1 Alexander Kyte 2015-03-29 13:19:58 UTC
Using mono 3.12.1, this doesn't stack overflow for me. You might be using an older mono.
Comment 2 Natallie 2015-03-30 16:36:03 UTC
Yes, that was 2 years ago. However, it still doesn't work for me: http://cl.ly/image/0T2U213d0p04
Comment 3 Andrew Browne 2015-08-12 06:33:57 UTC
I can confirm this is still happening in current versions of mono. Although I had to add a zero:

namespace TestProgram

module Test =
    let rec f x cont =
        if x < 0 then 0
        elif x = 0 then cont x
        else f (x-1) (fun x -> f x cont)

    let main args =
        f 2500000 id |> ignore

Resulted in:
Stack overflow: IP: 0x40455ee2, fault addr: 0x7ffcdf437ff8
  at TestProgram.Test.f (int,Microsoft.FSharp.Core.FSharpFunc`2<int, int>) <0x00034>
  at TestProgram.Test.main (string[]) <0x00027>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_int_object (object,intptr,intptr,intptr) <0xffffffff>

My mono version:
mono --version
Mono JIT compiler version 4.0.2 (Stable Wed Jun 24 10:04:37 UTC 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  amd64
	Disabled:      none
	Misc:          softdebug 
	LLVM:          supported, not enabled.
	GC:            sgen
Comment 4 Andrew Browne 2015-08-12 18:31:17 UTC
After reviewing the code again I don't think this code is tail recursive. The f x cont case is not in tail position.
Comment 5 Andrew Browne 2015-08-12 18:31:45 UTC
I have also confirmed that this sample code crashes under the Microsoft CLR as well.
Comment 6 Natallie 2015-08-14 16:10:55 UTC
@Andrew: make sure you compile the code with tail calls enabled, then you can verify that the .tail instruction is there. I don't have a windows machine around to check, but it used to work with .NET framework.
Comment 7 Andrew Browne 2015-08-14 23:52:49 UTC
@Natallie. Sorry tried again on a different machine and couldn't get it to fail. So it looks like it does work on Microsoft CLR. When I'm back at work next week I'll try and replicate the failure again but maybe I did have tail calls off.

c:\temp>fsc Test.fs
Microsoft (R) F# Compiler version 12.0.30815.0
Copyright (c) Microsoft Corporation. All Rights Reserved.


Comment 8 2fdu67+4zvrjs18yy4n8 2015-12-11 00:25:04 UTC
I would say this is not a bug.

(fun x -> f x cont)

captures cont from each call frame (correctly) and each cont closure references the cont from the previous call frame, so f is indeed tail recursive, but you are passing an ever growing chain of closures each referencing the previous closure. I believe this is why it overflows (correctly), and is unrelated to tail call optimization.

For an indication of what it is doing, consider:

let rec f x cont =
    printfn "%i" x
    if x < 0 then 0
    elif x = 0 then cont x
    else f (x-1) (fun x -> f x cont)

f 5 id



the bottom 5 zeros are the chain of closures unwinding, and they had to be stored and passed to the terminal x = 0 case before the first in the chain was called.

And to replicate the failure just keep adding zeros to the 250000.

For maximum testing I rewrote this in Scheme and ran it through a few implementations (to see if there was some magic that I was unaware of):

(define (f x cont)
                ((< x 0)
                ((= x 0)
                        (cont x))
                        (f (- x 1) (lambda (x) (f x cont))))))

(f 10000000 values)

Scheme implementations that allocate the closures on the stack fail quickly, implementations that put it on the heap proceed to exhaust available memory.
Comment 9 2fdu67+4zvrjs18yy4n8 2015-12-12 19:35:00 UTC
Also, the same program in ocaml consumes all available memory if given a sufficiently high starting value.
Comment 10 Llewellyn Pritchard 2016-02-17 06:40:55 UTC
The above Scheme program runs just fine using IronScheme on MS CLR.

It does allocate a fair bit on the heap (1.1GB).

Statistics for '(f 10000000 values)':
  Real Time:  15728ms
  CPU Time:   14680ms
  User Time:  15288ms
  GC's:       587

Looking at the mono source code, it appears tail calls are horribly implemented. 

It simply attempts to re-use the caller's stack, and if the caller's stack is smaller than the callee's stack it ignores the tail call.

This is NOT how tail calls are meant to be implemented.
Comment 11 Rodrigo Kumpera 2016-02-17 16:26:06 UTC
tail calls are not mandatory and are subject to implementation restrictions.

We implement enough cases that keep F# happy.

Supporting tail calls that are not call conv equivalent is non-trivial.

Note You need to log in before you can comment on or make changes to this bug.