Bug 23714 - Mono 3.10: runtime segfault when running Nancy testsuite
Status: CONFIRMED
Alias: None
Product: Runtime
Classification: Mono
Component: GC
Version: unspecified
Hardware: PC Linux
Severity: normal
Assignee: Bugzilla
 
Reported: 2014-10-09 11:53 UTC by Alexander Köplinger
Modified: 2017-09-06 14:54 UTC

Description Alexander Köplinger 2014-10-09 11:53:03 UTC
I get a segmentation fault when running the Nancy (https://github.com/NancyFx/Nancy) testsuite. It's quite hard to reproduce: I need to run the testsuite in a loop for about 3 hours before the runtime crashes. GDB output follows; please let me know if you need more info.

System: Ubuntu 14.04 64bit, Mono 3.10 from Xamarin packages
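
For anyone trying to reproduce this, a loop like the one described above could be sketched roughly as follows. This is only an illustration, not the script actually used for this report: the helper name `run_until_failure` and the file name `repro_loop.py` are made up, and the test command is a placeholder that has to be supplied on the command line.

```python
import subprocess
import sys
import time


def run_until_failure(cmd, max_runs=None):
    """Run cmd repeatedly and stop at the first non-zero exit status.

    Returns (runs, returncode). On POSIX, a negative returncode means the
    process was terminated by a signal. This helper is hypothetical and
    not part of Mono or xUnit.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        runs += 1
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return runs, result.returncode
    return runs, 0


if __name__ == "__main__":
    # Placeholder usage (assumption): the real test-runner command goes here, e.g.
    #   python repro_loop.py mono <test-runner.exe> <Tests.dll>
    if len(sys.argv) > 1:
        start = time.monotonic()
        runs, code = run_until_failure(sys.argv[1:])
        elapsed = time.monotonic() - start
        print(f"stopped after {runs} run(s), {elapsed:.0f}s, exit status {code}")
    else:
        print("usage: repro_loop.py <command> [args...]", file=sys.stderr)
```

Passing `mono-boehm` instead of `mono` as the first argument would run the same loop against the other GC.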

Native stacktrace:

	mono() [0x4b3f7c]
	mono() [0x50c30f]
	mono() [0x423637]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f91976ac340]
	mono() [0x5cd537]
	mono() [0x5ce17b]
	mono() [0x5cfed6]
	mono() [0x5d0483]
	mono() [0x5d3957]
	mono(mono_gc_collect+0x28) [0x5d3fe8]
	mono() [0x59caca]
	mono() [0x631c56]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f91976a4182]
	/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f91973d0fbd]

Debug info from gdb:

[New LWP 57763]
[New LWP 57760]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f919730cf52 in do_sigsuspend (set=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:31
31	../sysdeps/unix/sysv/linux/sigsuspend.c: No such file or directory.
  Id   Target Id         Frame 
  3    Thread 0x7f9194aea700 (LWP 57760) "Finalizer" 0x00007f919730cf52 in do_sigsuspend (set=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:31
  2    Thread 0x7f918ffff700 (LWP 57763) "mono" 0x00007f91976abee9 in __libc_waitpid (pid=pid@entry=57764, stat_loc=stat_loc@entry=0x7f919815f19c, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
* 1    Thread 0x7f91981cd7c0 (LWP 57759) "mono" 0x00007f919730cf52 in do_sigsuspend (set=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:31

Thread 3 (Thread 0x7f9194aea700 (LWP 57760)):
#0  0x00007f919730cf52 in do_sigsuspend (set=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:31
#1  __GI___sigsuspend (set=set@entry=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#2  0x00000000005cac04 in suspend_thread (context=0x7f9194ae9800, info=0x7f91900008e0) at sgen-os-posix.c:113
#3  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7f9194ae9800) at sgen-os-posix.c:140
#4  <signal handler called>
#5  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:84
#6  0x000000000062cf48 in mono_sem_wait (sem=sem@entry=0x97aec0 <finalizer_sem>, alertable=alertable@entry=1) at mono-semaphore.c:101
#7  0x00000000005a3c4d in finalizer_thread (unused=<optimized out>) at gc.c:1077
#8  0x0000000000588274 in start_wrapper_internal (data=<optimized out>) at threads.c:660
#9  start_wrapper (data=<optimized out>) at threads.c:707
#10 0x0000000000631c56 in inner_start_thread (arg=0x7fff2298cfa0) at mono-threads-posix.c:84
#11 0x00007f91976a4182 in start_thread (arg=0x7f9194aea700) at pthread_create.c:312
#12 0x00007f91973d0fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7f918ffff700 (LWP 57763)):
#0  0x00007f91976abee9 in __libc_waitpid (pid=pid@entry=57764, stat_loc=stat_loc@entry=0x7f919815f19c, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1  0x00000000004b4009 in mono_handle_native_sigsegv (signal=signal@entry=11, ctx=ctx@entry=0x7f919815fac0) at mini-exceptions.c:2323
#2  0x000000000050c30f in mono_arch_handle_altstack_exception (sigctx=sigctx@entry=0x7f919815fac0, fault_addr=<optimized out>, stack_ovf=stack_ovf@entry=0) at exceptions-amd64.c:902
#3  0x0000000000423637 in mono_sigsegv_signal_handler (_dummy=11, info=0x7f919815fbf0, context=0x7f919815fac0) at mini.c:6861
#4  <signal handler called>
#5  sgen_par_object_get_size (o=0x7f918e67d848, vtable=0x0) at ../../mono/metadata/sgen-gc.h:802
#6  sgen_safe_object_get_size (obj=0x7f918e67d848) at ../../mono/metadata/sgen-gc.h:834
#7  sgen_major_is_object_alive (object=0x7f918e67d848) at sgen-gc.c:3302
#8  sgen_is_object_alive_for_current_gen (object=0x7f918e67d848 "") at sgen-gc.c:3337
#9  mark_ephemerons_in_range (ctx=...) at sgen-gc.c:3525
#10 0x00000000005ce17b in finish_gray_stack (generation=generation@entry=1, queue=0x97b7c0 <gray_queue>) at sgen-gc.c:1637
#11 0x00000000005cfed6 in major_finish_collection (reason=0x709bfc "user request", old_next_pin_slot=53, scan_mod_union=0) at sgen-gc.c:2875
#12 0x00000000005d0483 in major_do_collection (reason=<optimized out>) at sgen-gc.c:3016
#13 major_do_collection (reason=0x709bfc "user request") at sgen-gc.c:2998
#14 0x00000000005d3957 in sgen_perform_collection (requested_size=requested_size@entry=0, generation_to_collect=generation_to_collect@entry=1, reason=reason@entry=0x709bfc "user request", wait_to_finish=wait_to_finish@entry=1) at sgen-gc.c:3212
#15 0x00000000005d3fe8 in mono_gc_collect (generation=1) at sgen-gc.c:4332
#16 0x000000000059caca in unload_thread_main (arg=0x2779570) at appdomain.c:2360
#17 0x0000000000631c56 in inner_start_thread (arg=0x7fff2298cb50) at mono-threads-posix.c:84
#18 0x00007f91976a4182 in start_thread (arg=0x7f918ffff700) at pthread_create.c:312
#19 0x00007f91973d0fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f91981cd7c0 (LWP 57759)):
#0  0x00007f919730cf52 in do_sigsuspend (set=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:31
#1  __GI___sigsuspend (set=set@entry=0x97b3a0 <suspend_signal_mask>) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#2  0x00000000005cac04 in suspend_thread (context=0x7fff2298c540, info=0x26164d0) at sgen-os-posix.c:113
#3  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fff2298c540) at sgen-os-posix.c:140
#4  <signal handler called>
#5  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#6  0x000000000060bc5b in _wapi_handle_timedwait_signal_handle (handle=handle@entry=0x4c3, timeout=timeout@entry=0x0, alertable=alertable@entry=1, poll=poll@entry=0) at handles.c:1595
#7  0x000000000060bc8b in _wapi_handle_wait_signal_handle (handle=handle@entry=0x4c3, alertable=alertable@entry=1) at handles.c:1540
#8  0x000000000061f50b in WaitForSingleObjectEx (handle=handle@entry=0x4c3, timeout=timeout@entry=4294967295, alertable=alertable@entry=1) at wait.c:194
#9  0x000000000059f5da in mono_domain_try_unload (domain=0x2975610, exc=exc@entry=0x7fff2298cc18) at appdomain.c:2472
#10 0x000000000059f6b7 in mono_domain_unload (domain=<optimized out>) at appdomain.c:2385
#11 0x0000000040b853c2 in ?? ()
#12 0x00000000026bd980 in ?? ()
#13 0x00007fff2298d190 in ?? ()
#14 0x0000000000000000 in ?? ()

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================
Comment 1 Mark Probst 2014-10-14 16:01:31 UTC
Could you check whether that bug also occurs with the Boehm GC?  Use the `mono-boehm` executable instead of `mono` or `mono-sgen`.
Comment 2 Alexander Köplinger 2014-10-15 09:34:00 UTC
Thanks for looking into it. I've run the testsuite with mono-boehm in a loop for a few hours and there has been no segfault so far. It did crash, though, with a different exception.

I'm not sure whether this is related to the sgen issue, but here's the output. Note the "Finalization of domain 05e26f4a-9a16-4cb1-a297-985c225cd1a1 timed out." message:

xUnit.net console test runner (64-bit .NET 4.0.30319.17020)
Copyright (C) 2007-10 Microsoft Corporation.

xunit.dll:     Version 1.9.1.1600
Test assembly: /home/alexander/dev/Nancy/src/Nancy.Hosting.Aspnet.Tests/bin/MonoRelease/Nancy.Hosting.Aspnet.Tests.dll

3 total, 0 failed, 0 skipped, took 0.594 seconds
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.Threading.ThreadAbortException: Thread was being aborted
  at (wrapper managed-to-native) System.Reflection.MonoCMethod:InternalInvoke (System.Reflection.MonoCMethod,object,object[],System.Exception&)
  at System.Reflection.MonoCMethod.InternalInvoke (System.Object obj, System.Object[] parameters) [0x00000] in <filename unknown>:0 
  --- End of inner exception stack trace ---
  at System.Reflection.MonoCMethod.InternalInvoke (System.Object obj, System.Object[] parameters) [0x00000] in <filename unknown>:0 
  at System.Reflection.MonoCMethod.DoInvoke (System.Object obj, BindingFlags invokeAttr, System.Reflection.Binder binder, System.Object[] parameters, System.Globalization.CultureInfo culture) [0x00000] in <filename unknown>:0 
  at System.Reflection.MonoCMethod.Invoke (System.Object obj, BindingFlags invokeAttr, System.Reflection.Binder binder, System.Object[] parameters, System.Globalization.CultureInfo culture) [0x00000] in <filename unknown>:0 
  at System.Reflection.MethodBase.Invoke (System.Object obj, System.Object[] parameters) [0x00000] in <filename unknown>:0 
  at System.Runtime.Serialization.ObjectRecord.LoadData (System.Runtime.Serialization.ObjectManager manager, ISurrogateSelector selector, StreamingContext context) [0x00000] in <filename unknown>:0 
  at System.Runtime.Serialization.ObjectManager.DoFixups () [0x00000] in <filename unknown>:0 
  at System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadNextObject (System.IO.BinaryReader reader) [0x00000] in <filename unknown>:0 
  at System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadObjectGraph (BinaryElement elem, System.IO.BinaryReader reader, Boolean readHeaders, System.Object& result, System.Runtime.Remoting.Messaging.Header[]& headers) [0x00000] in <filename unknown>:0 
  at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.NoCheckDeserialize (System.IO.Stream serializationStream, System.Runtime.Remoting.Messaging.HeaderHandler handler) [0x00000] in <filename unknown>:0 
  at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize (System.IO.Stream serializationStream) [0x00000] in <filename unknown>:0 
  at System.Runtime.Remoting.RemotingServices.DeserializeCallData (System.Byte[] array) [0x00000] in <filename unknown>:0 
  at (wrapper xdomain-invoke) System.Runtime.Remoting.Messaging.IMessageSink:SyncProcessMessage (System.Runtime.Remoting.Messaging.IMessage)
  at Xunit.Sdk.ExecutorCallback+MessageSinkCallback.Notify (System.String value) [0x00000] in <filename unknown>:0 
  at Xunit.Sdk.Executor.OnTestResult (ITestResult result, Xunit.Sdk.ExecutorCallback callback) [0x00000] in <filename unknown>:0 
  at Xunit.Sdk.Executor+RunAssembly+<>c__DisplayClass9.<.ctor>b__4 () [0x00000] in <filename unknown>:0 
  at Xunit.Sdk.Executor.ThreadRunner (System.Object threadStart) [0x00000] in <filename unknown>:0 
  at System.Threading.Thread.StartInternal () [0x00000] in <filename unknown>:0 
Finalization of domain 05e26f4a-9a16-4cb1-a297-985c225cd1a1 timed out.
System.CannotUnloadAppDomainException: Finalization of domain 05e26f4a-9a16-4cb1-a297-985c225cd1a1 timed out.
  at (wrapper managed-to-native) System.AppDomain:InternalUnload (int)
  at System.AppDomain.Unload (System.AppDomain domain) [0x00000] in <filename unknown>:0 
  at Xunit.ExecutorWrapper.Dispose () [0x00000] in <filename unknown>:0 
  at Xunit.ConsoleClient.Program.RunProject (Xunit.XunitProject project, Boolean teamcity, Boolean silent) [0x00000] in <filename unknown>:0 
  at Xunit.ConsoleClient.Program.Main (System.String[] args) [0x00000] in <filename unknown>:0 
F, [2014-10-15T14:12:01.015199 #6572] FATAL -- : XUnit Failed. See Build Log For Detail


I'll run it again and see if it crashes with the same exception. Is there any way I can gather more useful info for you?
Comment 3 Alexander Köplinger 2014-10-15 10:47:45 UTC
Interestingly, it crashed again after an hour with Boehm, with the exact same stack trace (apart from a different appdomain GUID, of course). This time it happened in another test assembly, so it is evidently not specific to the tests there.
Comment 4 Mark Probst 2015-09-23 12:50:47 UTC
Is this still an issue?
Comment 5 Alexander Köplinger 2015-09-23 12:55:28 UTC
I haven't tried recently, and with all the work that has happened since 3.10 I doubt I'd get the same repro, so I'm closing this for now. Thanks.
Comment 6 Roman 2016-11-11 13:44:18 UTC
Ha-ha! I am using Mono 4.6.0 and this problem is still not resolved!
Tell me, does Microsoft pay you not to fix bugs?

PS: C# is not a cross-platform language, and therefore it is not a Java killer.
Comment 7 Rodrigo Kumpera 2016-11-14 20:54:06 UTC
Hey Vlad,

Can you take this one for a spin?

Looks like Roman can still repro it.
