Bug 9732 - [SGEN] Assertion causes some programs to crash reliably on FreeBSD
Summary: [SGEN] Assertion causes some programs to crash reliably on FreeBSD
Alias: None
Product: Runtime
Classification: Mono
Component: GC ()
Version: unspecified
Hardware: PC Other
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
Depends on:
Reported: 2013-01-23 09:06 UTC by jack.pappas
Modified: 2013-02-08 12:05 UTC
CC List: 4 users

Is this bug a regression?: ---
Last known good build:

Build log containing sgen GC assertion message (4.66 KB, text/plain)
2013-01-23 09:06 UTC, jack.pappas
Mono debug log with assertion message and GC statistics (344.57 KB, application/octet-stream)
2013-02-04 12:20 UTC, jack.pappas

Description jack.pappas 2013-01-23 09:06:30 UTC
Created attachment 3267
Build log containing sgen GC assertion message

FIRST, some background:
I'm running FreeBSD 9-STABLE x86 under VirtualBox (Win7 x64 host, VM has 1 core and 1024MB RAM), and just upgraded to the latest Mono release (v3.0.3). FWIW, running `uname -a` gives me: FreeBSD jack-bsd 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r245748M: Mon Jan 21 20:56:47 EST 2013     root@jack-bsd:/usr/obj/usr/src/sys/GENERIC  i386

Until a week ago, the most current version of Mono available for FreeBSD (via the FreeBSD Ports Collection) was Mono 2.11.8, and only the old Boehm-style GC was available. I found that FreeBSD now had proper support for the __thread modifier needed to build Mono's new 'sgen' GC, so I grabbed the v3.0.3 sources, tweaked the configuration settings and was able to build Mono with the sgen GC enabled. I ran some simple programs, and everything looked good to go...

I downloaded the sources for the most-recent tag (3.0.25) of F# from GitHub, and tried to build it. The very first project to be built (FSharp.Build-proto) crashes (core dumps) almost immediately once the F# compiler is invoked. I've attached a log with the exact message, but the key part is this:
* Assertion at sgen-gc.c:4129, condition `stack_end < info->stack_end' not met

This crash is deterministic -- I've tried rebuilding several times and it crashes in the exact same way each time.

To save you a little time, I peeked at my copy of the source file in question and found that the assertion is tripped within the 'mono_gc_set_stack_end' function in mono/metadata/sgen-gc.c. Interestingly, there's a FIXME comment in the 'sgen_thread_unregister' function which describes a probable cause for the assertion.

Please have a look at this when you get a chance -- I'm really looking forward to having a current version of Mono running on FreeBSD so I'm not stuck using Linux anymore.

Comment 1 jack.pappas 2013-01-23 09:18:44 UTC
Oh, one other thing I forgot to mention -- when I was tracking down the problem, I tried setting the MONO_LOG_LEVEL environment variable to 'debug' (on FreeBSD that's `set MONO_LOG_LEVEL=debug`) and re-running the build. For whatever reason, setting this variable doesn't print any logging information to the console (it should?), but it does seem to alter the race condition -- the FSharp.Build-proto assembly builds successfully, as do a few other projects. However, another project in the solution eventually fails with the exact same assertion message, so this can't be used as a reliable workaround.
Comment 2 Mark Probst 2013-01-24 12:32:46 UTC
Can you reproduce this bug on a supported platform (e.g., Linux)?
Comment 3 jack.pappas 2013-01-25 09:43:17 UTC
Hi Mark,

I haven't had a chance to try yet, but I'll give it a shot over the weekend or next week and let you know if I have any luck.

Comment 4 jack.pappas 2013-02-04 12:20:14 UTC
Created attachment 3328
Mono debug log with assertion message and GC statistics
Comment 5 jack.pappas 2013-02-04 12:22:25 UTC
Hi All,

Over the weekend, I pulled/built/installed the latest Mono sources from GitHub onto a VirtualBox VM with the same settings as my FreeBSD VM, only running Ubuntu 12.04.1 32-bit. I was able to compile the F# compiler and libraries on Ubuntu without issue, so it appears this problem doesn't affect Linux -- though I suspect this is probably because the sgen-gc.h file makes use of some Linux-specific (i.e., non-POSIX) pthread functions.

I don't have a Mac so I can't try this on Mac OS X, though I did come across a Gist (https://gist.github.com/4458568) from Rodrigo Kumpera which demonstrates (on Mac OS X) an issue possibly related to this.

I've attached a log of the output from running "make" on the F# sources on FreeBSD, where I've set MONO_LOG_LEVEL to "debug". Note that just before the assertion is triggered and the process crashes, there's a message "GC_MAJOR: (mature allocation failure)". If you look near the top of Rodrigo's Gist, you'll see the same message.

I hope this helps narrow down the cause of the problem. If not, I'm happy to provide more logs, run test builds, etc., to try and gather useful information.

Comment 6 Mark Probst 2013-02-04 18:46:45 UTC
Rodrigo, can you comment on this?
Comment 7 Rodrigo Kumpera 2013-02-05 08:49:54 UTC
Mark, he's trying the mach backend with __thread enabled, something we have yet to support since OSX still doesn't have it.

F# builds fine with sgen on OSX, so this is really an issue that falls out of our hands. We'll merge patches for an eventual fix, but that's all.
Comment 8 Mark Probst 2013-02-05 14:57:39 UTC
Ok, so it'll be fixed eventually.
Comment 9 Rodrigo Kumpera 2013-02-05 16:28:51 UTC
The latest Clang has support for __thread on OSX, and it looks like they might make it into a stable ABI.

So, we might improve the mach backend enough to make it easier to handle FreeBSD.
Comment 10 jack.pappas 2013-02-07 09:45:39 UTC
Rodrigo -- FreeBSD uses the POSIX backend, not the Mach backend (which is just for OS X). Mac OS X does contain a large chunk of the FreeBSD and OpenBSD codebases though :)

LLVM 3.2 introduced support for thread-local storage (TLS), and Clang 3.2 is able to take advantage of that via the __thread qualifier (just like GCC).
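
As an illustration of what the __thread qualifier provides, here is a minimal sketch using plain pthreads. It is not taken from Mono; the names tls_counter, bump_counter and tls_demo are hypothetical and exist only for this example. Each thread gets its own copy of a __thread variable, so the worker's increments never touch the caller's value.

```c
#include <assert.h>
#include <pthread.h>

/* One independent copy of this counter exists per thread. */
static __thread int tls_counter = 0;

/* Worker: increments its own copy `*arg` times, then writes the result
 * back through `arg` so the caller can inspect it. */
static void *bump_counter(void *arg)
{
    int *slot = (int *)arg;
    int times = *slot;
    for (int i = 0; i < times; i++)
        tls_counter++;
    *slot = tls_counter; /* this thread's private value */
    return NULL;
}

/* Hypothetical demo: sets the caller's copy to 7, runs one worker thread,
 * stores the worker's final count in *worker_value, and returns the
 * caller's copy, which the worker cannot modify. Returns -1 on failure. */
int tls_demo(int times, int *worker_value)
{
    pthread_t tid;
    int slot = times;

    assert(worker_value != NULL);
    tls_counter = 7;

    if (pthread_create(&tid, NULL, bump_counter, &slot) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    *worker_value = slot;
    return tls_counter;
}
```

Calling tls_demo(5, &w) from any thread should leave the caller's copy at 7 while the worker counts from its own zero-initialised copy. Build with the compiler's -pthread option.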

As far as I've been able to tell, this issue isn't a compiler/platform issue -- it's just a race condition which happens to manifest itself on FreeBSD. Even then, I am able to run some programs with mono-sgen -- only some programs (the F# compiler, for one) trigger the race and crash.
Comment 11 jack.pappas 2013-02-08 09:31:17 UTC
Hi All -- I figured out what the problem was here: when sgen was compiled on BSD, it was using the code in the #else section in sgen_thread_register (in metadata/sgen-gc.c). For whatever reason, that code sometimes works but crashes under certain GC conditions.

I patched the code to check for the BSD-specific thread-stack API (it uses pthread_attr_get_np instead of Linux's pthread_getattr_np) and was able to compile F# without any problems.
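
The kind of platform check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: current_thread_stack_bounds is a hypothetical helper, and the preprocessor tests are an assumption about how the two platforms might be told apart. The one real difference it shows is that FreeBSD's pthread_attr_get_np needs an already-initialised attribute object, while Linux's pthread_getattr_np initialises it itself.

```c
/* _GNU_SOURCE must be defined before any include so glibc declares
 * pthread_getattr_np. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#if defined(__FreeBSD__)
#include <pthread_np.h> /* declares pthread_attr_get_np */
#endif

/* Hypothetical helper, not a Mono function: stores the lowest address of
 * the calling thread's stack in *low and one past the highest in *high.
 * Returns 0 on success, -1 if the platform query fails. */
int current_thread_stack_bounds(void **low, void **high)
{
    pthread_attr_t attr;
    void *addr = NULL;
    size_t size = 0;
    int rc;

    assert(low != NULL && high != NULL);

#if defined(__FreeBSD__)
    /* FreeBSD: the attribute object must be initialised by the caller. */
    if (pthread_attr_init(&attr) != 0)
        return -1;
    if (pthread_attr_get_np(pthread_self(), &attr) != 0) {
        pthread_attr_destroy(&attr);
        return -1;
    }
#elif defined(__linux__)
    /* Linux (glibc): the call initialises the attribute object itself. */
    if (pthread_getattr_np(pthread_self(), &attr) != 0)
        return -1;
#else
#error "no thread-stack query known for this platform in this sketch"
#endif

    /* Both platforms: read the stack's lowest address and its size. */
    rc = pthread_attr_getstack(&attr, &addr, &size);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    *low = addr;
    *high = (char *)addr + size;
    return 0;
}
```

A caller can check the result by confirming that the address of one of its own local variables falls inside the reported range.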

I've sent the modifications on GitHub, in pull request #551: https://github.com/mono/mono/pull/551

Comment 12 Rodrigo Kumpera 2013-02-08 12:05:36 UTC
I merged your pull request, but I need you to release that patch under the MIT/X11 license. The way to do it now is to simply state it on the pull request.