Bug 14834 - SIGSEGV in mono-sgen caused by __GNUC__ optimization of OBJ_BITMAP_FOREACH_PTR (__builtin_ctz)
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: GC
Version: unspecified
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2013-09-18 13:16 UTC by Andres G. Aragoneses
Modified: 2013-09-20 16:19 UTC

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
output when running with some g_print patch from zoltan (173.76 KB, text/plain)
2013-09-20 12:20 UTC, Andres G. Aragoneses



Description Andres G. Aragoneses 2013-09-18 13:16:05 UTC
(Even though this bug could be a duplicate of bug 11322 because of the similar stacktrace, it happened to me under much different circumstances -i.e. running a different program-, so I'm filing a new one now.)

If you run banshee (tag 2.6.1 for instance) with today's Mono (cloned from 8288f81dfd645db16f9e89cf1b67e609879c60f9), you get this unmanaged crash:

Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) object.__icall_wrapper_mono_gc_alloc_vector (intptr,intptr,intptr) <IL 0x0000f, 0xffffffff>
  at (wrapper alloc) object.AllocVector (intptr,intptr) <IL 0x0007a, 0xffffffff>
  at System.Text.Encoding.GetBytes (string) [0x00036] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System.Text/Encoding.cs:252
  at GLib.Marshaller.StringToPtrGStrdup (string) [0x00018] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/Marshaller.cs:158
  at GLib.SignalClosure.Connect (bool) [0x00007] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/SignalClosure.cs:99
  at GLib.Signal.AddDelegate (System.Delegate) [0x00185] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/Signal.cs:248
  at GLib.Object.AddSignalHandler (string,System.Delegate,System.Type) [0x00048] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/Object.cs:775
  at Gtk.Widget.add_ButtonReleaseEvent (Gtk.ButtonReleaseEventHandler) [0x00012] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/gtk/generated/Widget.cs:1032
  at Hyena.Widgets.GrabHandle..ctor (int,int) [0x00057] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Hyena/Hyena.Gui/Hyena.Widgets/GrabHandle.cs:50
  at Hyena.Widgets.GrabHandle..ctor () [0x00000] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Hyena/Hyena.Gui/Hyena.Widgets/GrabHandle.cs:38
  at Banshee.Gui.Widgets.ConnectedSeekSlider.BuildSeekSlider (Banshee.Gui.Widgets.SeekSliderLayout) [0x000b6] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui.Widgets/ConnectedSeekSlider.cs:121
  at Banshee.Gui.Widgets.ConnectedSeekSlider..ctor (Banshee.Gui.Widgets.SeekSliderLayout) [0x00034] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui.Widgets/ConnectedSeekSlider.cs:59
  at Banshee.NotificationArea.TrackInfoPopup..ctor () [0x00051] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/TrackInfoPopup.cs:58
  at Banshee.NotificationArea.GtkNotificationAreaBox..ctor (Banshee.Gui.BaseClientWindow) [0x00073] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/GtkNotificationAreaBox.cs:63
  at Banshee.NotificationArea.NotificationAreaService.BuildNotificationArea () [0x0002a] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/NotificationAreaService.cs:237
  at Banshee.NotificationArea.NotificationAreaService.ServiceStartup () [0x00037] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/NotificationAreaService.cs:112
  at Banshee.NotificationArea.NotificationAreaService.Banshee.ServiceStack.IExtensionService.Initialize () [0x00018] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/NotificationAreaService.cs:87
  at Banshee.ServiceStack.ServiceManager.StartExtension (Mono.Addins.TypeExtensionNode) [0x0003c] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.Services/Banshee.ServiceStack/ServiceManager.cs:210
  at Banshee.ServiceStack.ServiceManager.Run () [0x00093] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.Services/Banshee.ServiceStack/ServiceManager.cs:144
  at Banshee.ServiceStack.Application.Run () [0x0002c] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.Services/Banshee.ServiceStack/Application.cs:106
  at Banshee.Gui.GtkBaseClient.Initialize (bool) [0x00142] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:199
  at Banshee.Gui.GtkBaseClient..ctor (bool,string) [0x00017] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:92
  at Banshee.Gui.GtkBaseClient..ctor () [0x00002] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:84
  at Nereid.Client..ctor () <IL 0x00000, 0x0000f>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <IL 0x0004e, 0xffffffff>
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Reflection.MonoCMethod.InternalInvoke (System.Reflection.MonoCMethod,object,object[],System.Exception&) <IL 0x0001c, 0xffffffff>
  at System.Reflection.MonoCMethod.InternalInvoke (object,object[]) [0x00002] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System.Reflection/MonoMethod.cs:537
  at System.Activator.CreateInstance (System.Type,bool) [0x000af] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System/Activator.cs:329
  at System.Activator.CreateInstance (System.Type) [0x00000] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System/Activator.cs:222
  at Banshee.Gui.GtkBaseClient.Startup () [0x00001] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:79
  at Hyena.Gui.CleanRoomStartup.Startup (Hyena.Gui.CleanRoomStartup/StartupInvocationHandler) [0x00050] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Hyena/Hyena.Gui/Hyena.Gui/CleanRoomStartup.cs:54
  at Banshee.Gui.GtkBaseClient.Startup<T> () [0x00049] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:74
  at Banshee.Gui.GtkBaseClient.Startup<T> (string[]) [0x00021] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:64
  at Nereid.Client.Main (string[]) [0x00002] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Clients/Nereid/Nereid/Client.cs:54
  at (wrapper runtime-invoke) <Module>.runtime_invoke_void_object (object,intptr,intptr,intptr) <IL 0x00050, 0xffffffff>

Native stacktrace:

	/opt/mono/bin/mono() [0x4ab821]
	/opt/mono/bin/mono() [0x5029ff]
	/opt/mono/bin/mono() [0x420837]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0xfbd0) [0x2aaaab3f0bd0]
	/opt/mono/bin/mono() [0x5efa8e]
	/opt/mono/bin/mono() [0x5cbd87]
	/opt/mono/bin/mono() [0x5d16e6]
	/opt/mono/bin/mono() [0x5d1ca8]
	/opt/mono/bin/mono() [0x5e801a]
	/opt/mono/bin/mono() [0x5e822b]
	[0x40431830]

Debug info from gdb:

Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No threads.

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

/bin/bash: line 1: 29836 Aborted                 (core dumped) /opt/mono/bin/mono --debug Nereid.exe mono --debug --uninstalled
make: *** [run] Error 134
Comment 1 Andres G. Aragoneses 2013-09-18 17:31:14 UTC
Running inside GDB:

[New Thread 0x7fffbf978700 (LWP 17333)]

Program received signal SIGSEGV, Segmentation fault.
0x00000000005efa8e in simple_nursery_serial_scan_object (start=0x7fffd50a4400 "\310M\263\001", queue=0x96bf60 <gray_queue>) at sgen-scan-object.h:71
71			SCAN;
Comment 2 Andres G. Aragoneses 2013-09-18 17:37:09 UTC
(gdb) thread apply all bt

Thread 10 (Thread 0x7fffbf978700 (LWP 17333)):
#0  0x00007ffff74ba071 in sem_timedwait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000000623c4b in mono_sem_timedwait (sem=sem@entry=0x96b268 <async_tp+40>, timeout_ms=timeout_ms@entry=2000, alertable=alertable@entry=1)
    at mono-semaphore.c:79
#2  0x0000000000588402 in async_invoke_thread (data=0x0, data@entry=0x96b240 <async_tp>) at threadpool.c:1565
#3  0x0000000000583ac2 in start_wrapper_internal (data=0x7fffcc0025f0) at threads.c:608
#4  start_wrapper (data=0x7fffcc0025f0) at threads.c:653
#5  0x0000000000618501 in thread_start_routine (args=args@entry=0x9c4ce8) at wthreads.c:294
#6  0x0000000000628450 in inner_start_thread (arg=0x7fffcc002660) at mono-threads-posix.c:49
#7  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 9 (Thread 0x7fffd540f700 (LWP 17331)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffd540dd40, info=0x7fffc00008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffd540dd40) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b7ca2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x0000000000601a4b in _wapi_handle_timedwait_signal_handle (handle=0x400, timeout=timeout@entry=0x0, alertable=alertable@entry=1, poll=poll@entry=0)
    at handles.c:1588
#6  0x0000000000601ab5 in _wapi_handle_wait_signal (poll=poll@entry=0) at handles.c:1521
#7  0x0000000000616039 in WaitForMultipleObjectsEx (numobjects=numobjects@entry=2, handles=handles@entry=0x7fffd540e6e0, waitall=waitall@entry=0, 
    timeout=timeout@entry=4294967295, alertable=alertable@entry=1) at wait.c:668
#8  0x0000000000581c0d in mono_wait_uninterrupted (thread=thread@entry=0x7ffff69fb870, multiple=multiple@entry=1, numhandles=numhandles@entry=2, 
    handles=handles@entry=0x7fffd540e6e0, waitall=waitall@entry=0, ms=ms@entry=-1, alertable=1) at threads.c:1488
#9  0x00000000005834e0 in ves_icall_System_Threading_WaitHandle_WaitAny_internal (mono_handles=0x7ffff657c728, ms=-1, exitContext=<optimized out>)
    at threads.c:1586
#10 0x00000000401911b0 in ?? ()
#11 0x00007fffc0002540 in ?? ()
#12 0x00007fffd540ed18 in ?? ()
#13 0x00007ffff66b1d78 in ?? ()
#14 0xffffffffffffffff in ?? ()
#15 0x00007fffc0002630 in ?? ()
#16 0x00007fffd540e9c0 in ?? ()
#17 0x00007fffd540e920 in ?? ()
#18 0x00007fffd540ed18 in ?? ()
#19 0x00007ffff66b1d78 in ?? ()
#20 0xffffffffffffffff in ?? ()
#21 0x00007ffff657c728 in ?? ()
#22 0x00007ffff45afc1e in System.Threading.WaitHandle:WaitAny (waitHandles=0x0, timeout=-10000, exitContext=false) at <unknown>:228
#23 0x00007ffff45a98a1 in System.Threading.RegisteredWaitHandle:Wait (this=..., state=<optimized out>) at <unknown>:75
#24 0x0000000040013ba1 in ?? ()
#25 0x0000000000000005 in ?? ()
#26 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7fffd5450700 (LWP 17330)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffd544f740, info=0x7fffcc0008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffd544f740) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74bb43d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00000000006177e7 in SleepEx (ms=ms@entry=500, alertable=alertable@entry=1) at wthreads.c:842
#6  0x00000000005857f3 in monitor_thread (unused=unused@entry=0x0) at threadpool.c:779
#7  0x0000000000583ac2 in start_wrapper_internal (data=0x7fffd0003080) at threads.c:608
#8  start_wrapper (data=0x7fffd0003080) at threads.c:653
#9  0x0000000000618501 in thread_start_routine (args=args@entry=0x9c0698) at wthreads.c:294
#10 0x0000000000628450 in inner_start_thread (arg=0x7fffd0003110) at mono-threads-posix.c:49
#11 0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 7 (Thread 0x7fffd5e75700 (LWP 17329)):
#0  0x00007ffff74b805e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fffee747935 in g_cond_wait_until () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fffee6ddb81 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007fffee6de1ca in g_async_queue_timeout_pop () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007fffee72c6b2 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007fffee72beb5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 6 (Thread 0x7fffd66aa700 (LWP 17328)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffd66a9140, info=0x7fffd00008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffd66a9140) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b805e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x000000000060182d in _wapi_handle_timedwait_signal_handle (handle=handle@entry=0x442, timeout=timeout@entry=0x7fffd66a97d0, 
    alertable=alertable@entry=1, poll=poll@entry=0) at handles.c:1586
#6  0x0000000000615446 in WaitForSingleObjectEx (handle=0x442, timeout=timeout@entry=99, alertable=alertable@entry=1) at wait.c:198
#7  0x0000000000581c6f in mono_wait_uninterrupted (thread=thread@entry=0x7ffff69fbb30, multiple=multiple@entry=0, numhandles=numhandles@entry=1, 
    handles=handles@entry=0x7fffd66a98c8, waitall=waitall@entry=0, ms=ms@entry=99, alertable=1) at threads.c:1490
#8  0x00000000005833f9 in ves_icall_System_Threading_WaitHandle_WaitOne_internal (this=<optimized out>, handle=0x442, ms=99, exitContext=<optimized out>)
    at threads.c:1622
#9  0x00000000400b0518 in ?? ()
#10 0x00007fffd0002540 in ?? ()
#11 0x0000000000000038 in ?? ()
#12 0x00007fffd66a9987 in ?? ()
#13 0x00007ffff673a018 in ?? ()
#14 0x0000000000000038 in ?? ()
#15 0x00007fffd66a9990 in ?? ()
#16 0x00007fffd66a98f0 in ?? ()
#17 0x00000000013561e8 in ?? ()
#18 0x0000000000000063 in ?? ()
#19 0x00007ffff673a3b0 in ?? ()
#20 0x0000000000000063 in ?? ()
#21 0x00007ffff45b0081 in System.Threading.WaitHandle:WaitOne (this=..., millisecondsTimeout=0, exitContext=false) at <unknown>:382
#22 0x00007ffff45b010c in System.Threading.WaitHandle:WaitOne (this=..., millisecondsTimeout=99) from /opt/mono/lib/mono/4.5/mscorlib.dll.so
#23 0x00007ffff45af405 in System.Threading.Timer/Scheduler:SchedulerThread (this=...) at <unknown>:385
#24 0x00007ffff45abe49 in System.Threading.Thread:StartInternal (this=...) at <unknown>:682
#25 0x000000004001640f in ?? ()
#26 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7fffe453a700 (LWP 17327)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffe4539240, info=0x7fffdc0008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffe4539240) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b7ca2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x0000000000601a4b in _wapi_handle_timedwait_signal_handle (handle=handle@entry=0x40c, timeout=timeout@entry=0x0, alertable=alertable@entry=1, 
    poll=poll@entry=0) at handles.c:1588
#6  0x0000000000601a7b in _wapi_handle_wait_signal_handle (handle=handle@entry=0x40c, alertable=alertable@entry=1) at handles.c:1533
#7  0x000000000061555d in WaitForSingleObjectEx (handle=0x40c, timeout=timeout@entry=4294967295, alertable=alertable@entry=1) at wait.c:196
#8  0x0000000000581c6f in mono_wait_uninterrupted (thread=thread@entry=0x7ffff69fbc90, multiple=multiple@entry=0, numhandles=numhandles@entry=1, 
    handles=handles@entry=0x7fffe4539998, waitall=waitall@entry=0, ms=ms@entry=-1, alertable=1) at threads.c:1490
#9  0x00000000005833f9 in ves_icall_System_Threading_WaitHandle_WaitOne_internal (this=<optimized out>, handle=0x40c, ms=-1, exitContext=<optimized out>)
    at threads.c:1622
#10 0x00000000400b0518 in ?? ()
#11 0x00007fffdc002540 in ?? ()
#12 0x00007fffe4539a30 in ?? ()
#13 0x00007fffe4539a4f in ?? ()
#14 0x00007ffff6665ac8 in ?? ()
#15 0x00007fffdc0025d0 in ?? ()
#16 0x00007fffe4539a50 in ?? ()
#17 0x00007fffe45399c0 in ?? ()
#18 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7fffe520a700 (LWP 17326)):
#0  0x00007ffff71d13cd in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fffee7081dc in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fffee7086ba in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007fffeecda4f6 in ?? () from /usr/lib/x86_64-linux-gnu/libgio-2.0.so.0
#4  0x00007fffee72beb5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7fffe5a0b700 (LWP 17325)):
#0  0x00007ffff71d13cd in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fffee7081dc in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fffee708304 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007fffe62aca1d in ?? () from /usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so
#4  0x00007fffee72beb5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2 (Thread 0x7ffff4314700 (LWP 17324)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7ffff4313780, info=0x7ffff00008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7ffff4313780) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b9f7e in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x0000000000623b68 in mono_sem_wait (sem=sem@entry=0x96b600 <finalizer_sem>, alertable=alertable@entry=1) at mono-semaphore.c:116
#6  0x000000000059fef5 in finalizer_thread (unused=unused@entry=0x0) at gc.c:1073
#7  0x0000000000583ac2 in start_wrapper_internal (data=0xa1cd70) at threads.c:608
#8  start_wrapper (data=0xa1cd70) at threads.c:653
#9  0x0000000000618501 in thread_start_routine (args=args@entry=0x9bd560) at wthreads.c:294
#10 0x0000000000628450 in inner_start_thread (arg=0xa1ca50) at mono-threads-posix.c:49
#11 0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 1 (Thread 0x7ffff7fd97c0 (LWP 17320)):
#0  0x00000000005efa8e in simple_nursery_serial_scan_object (start=0x7fffd50a4400 "\310M\263\001", queue=0x96bf60 <gray_queue>) at sgen-scan-object.h:71
#1  0x00000000005cbd87 in sgen_drain_gray_stack (max_objs=max_objs@entry=-1, ctx=...) at sgen-gc.c:1192
#2  0x00000000005d16e6 in collect_nursery (finish_up_concurrent_mark=0, unpin_queue=0x0) at sgen-gc.c:2611
#3  collect_nursery (unpin_queue=0x0, finish_up_concurrent_mark=0) at sgen-gc.c:2483
#4  0x00000000005d1ca8 in sgen_perform_collection (requested_size=2944, generation_to_collect=0, reason=<optimized out>, wait_to_finish=<optimized out>)
    at sgen-gc.c:3445
#5  0x00000000005e7f54 in mono_gc_alloc_obj_nolock (vtable=vtable@entry=0xa079c8, size=size@entry=2944) at sgen-alloc.c:264
#6  0x00000000005e811b in mono_gc_alloc_string (vtable=0xa079c8, size=2944, len=1458) at sgen-alloc.c:563
#7  0x0000000040015b40 in ?? ()
#8  0x00007fffffffd8a0 in ?? ()
#9  0x00000000009bcf90 in ?? ()
#10 0x0000000000000516 in ?? ()
#11 0x447974403058bea8 in ?? ()
#12 0x00007ffff6800008 in ?? ()
#13 0x0000000000000b80 in ?? ()
#14 0x00007fffffffc8a0 in ?? ()
#15 0x00007ffff67ff488 in ?? ()
#16 0x00007ffff7fd9798 in ?? ()
#17 0x0000000000a079c8 in ?? ()
#18 0x00000000000005b2 in ?? ()
#19 0x0000000040014fa4 in ?? ()
#20 0x00007ffff69d8100 in ?? ()
#21 0x00007ffff67ff458 in ?? ()
#22 0x0000000000000000 in ?? ()
Comment 3 Andres G. Aragoneses 2013-09-19 20:01:16 UTC
> Mark Probst <mark@xamarin.com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |mark@xamarin...

Hey Mark! do you reproduce this?

If not, I need some help. Judging from the unmanaged stacktrace, I guess the problem is in sgen-scan-object.h, but line 71 of that file is just the "SCAN" macro invocation, and SCAN is defined two lines earlier as OBJ_BITMAP_FOREACH_PTR (desc, start).

As the last frame of the trace is inside the function simple_nursery_serial_scan_object, I was expecting the macro OBJ_BITMAP_FOREACH_PTR to call this function, but it does not. This macro is defined in sgen-descriptor.h (line 174 or 195, depending on whether __GNUC__ is defined, which I guess it is in my case because I'm on 64-bit Linux and not using clang or anything else).

I've grepped for the name simple_nursery_serial_scan_object and it only comes up in sgen-minor-scan-object.h as the expansion of the SERIAL_SCAN_OBJECT macro, but HANDLE_PTR doesn't call this macro in this file, so I'm a bit lost.

Any hint you could give me? Thanks
Comment 4 Andres G. Aragoneses 2013-09-19 20:04:03 UTC
(Oh, forgot to say, OBJ_BITMAP_FOREACH_PTR uses HANDLE_PTR and OBJECT_HEADER_WORDS macros.) It seems that having so many macros around makes it really hard to track what happened from a stacktrace :(
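
To illustrate why the trace stops at the "SCAN;" line: function-like macros are expanded by the preprocessor inside the function that uses them, so the debugger sees no extra frame and attributes every expanded statement to the line of the invocation. The following is a minimal sketch of that pattern, not sgen's code; FOREACH_PTR_IN_BITMAP, the body given to HANDLE_PTR here, and count_live_refs are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-slot action. sgen defines its own HANDLE_PTR; this one
 * only counts non-NULL slots, using a local variable named `visited` that
 * must exist wherever the macro is expanded. */
#define HANDLE_PTR(slot, obj) do { if (*(slot)) visited++; } while (0)

/* Hypothetical iteration macro: walks the slots of `obj` and applies
 * HANDLE_PTR to every slot whose bit is set in `bitmap`. */
#define FOREACH_PTR_IN_BITMAP(bitmap, obj) do {            \
        void **_slot = (void **)(obj);                     \
        unsigned long long _bmap = (bitmap);               \
        while (_bmap) {                                    \
            if (_bmap & 1)                                 \
                HANDLE_PTR(_slot, (obj));                  \
            _bmap >>= 1;                                   \
            _slot++;                                       \
        }                                                  \
    } while (0)

/* All of the macro text above ends up inside this function body, so a
 * fault in any expanded statement is reported in this frame. */
int count_live_refs(void **obj, unsigned long long bitmap)
{
    int visited = 0;
    assert(obj != NULL);
    FOREACH_PTR_IN_BITMAP(bitmap, obj);
    return visited;
}
```

A crash while dereferencing a slot in this sketch would be reported in count_live_refs, at the single line that invokes FOREACH_PTR_IN_BITMAP, much like the "SCAN;" line in the trace above.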
Comment 5 Zoltan Varga 2013-09-19 20:55:12 UTC
Most sgen bugs show up as crashes in sgen_scan_object (), so the problem is probably not in that function.
Comment 6 Andres G. Aragoneses 2013-09-20 07:27:57 UTC
(In reply to comment #5)
> Most sgen bugs show up as crashes in sgen_scan_object () , the problem is
> probably not in that function.

OK, I've sprinkled good old printfs through those macros, and I think I've located the line where it crashes:

https://github.com/mono/mono/blob/master/mono/metadata/sgen-minor-scan-object.h#L94

Does it make sense? How can I further debug it?
Comment 7 Andres G. Aragoneses 2013-09-20 08:34:34 UTC
(In reply to comment #6)
> Ok, I've filled those macros with good old printfs, and I think I've located
> the place where it crashes, it is this line:
> https://github.com/mono/mono/blob/master/mono/metadata/sgen-minor-scan-object.h#L94
> Does it make sense? How can I further debug it?

I have added a check for NULL on ptr before doing that assignment, and it is not NULL. I have also tried adding an explicit (void*) cast, but it doesn't help.

So I have no idea what else could be causing a SIGSEGV on that line. Maybe ptr is pointing to garbage (already deallocated memory)? How could I check this?

Any pointers? Thank you.
Comment 8 Zoltan Varga 2013-09-20 08:41:53 UTC
As I said, the problem is not in that function but somewhere else. GC bugs are pretty hard to track down and fix.
Comment 9 Andres G. Aragoneses 2013-09-20 09:53:51 UTC
(In reply to comment #8)
> As I said, the problem is not in that function but somewhere else. GC bugs are
> pretty hard to track down and fix.

Gotcha, that's why I'm asking for help to track it down :)

Anyway, I think I made some progress, because I managed to find a workaround that works. I call it a "workaround" instead of a "fix" because there could be a better fix than this: what I'm basically doing is removing a piece of code which seems to be a __GNUC__ optimization. But if nobody manages to fix an optimization, it's better to have slower code than buggy code, right? If you agree, then just merge my pull request :)

https://github.com/mono/mono/pull/764

(For the record, this is the printf-diff that I used to debug this: https://gist.github.com/knocte/6637263 )
Comment 10 Zoltan Varga 2013-09-20 10:32:23 UTC
If this change really fixes the crash, the problem might be easier to track down than the usual GC bugs.
Could you add a
printf ("%d %lx\n", _index, (long int)_bmap); \
after both of the
			int _index = __builtin_ctz (_bmap); \
lines, and attach the output?

Also, what architecture is this, and what gcc version? Does compiling the runtime with -O0 fix this issue?
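
The instrumentation requested here can be sketched as a standalone helper. This is an illustration under assumptions, not the patch used in this thread: trace_lowest_bit and the buffer-based output are hypothetical, chosen so that the trace line can be inspected. The helper reports -1 for an empty bitmap instead of calling the builtin on zero, so that the trace itself stays well-defined.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tracing helper: formats "<index> <bitmap in hex>\n" into
 * `out` and returns the index of the lowest set bit, or -1 if `bmap` is 0. */
int trace_lowest_bit(unsigned long bmap, char *out, size_t out_len)
{
    int index;

    assert(out != NULL && out_len > 0);

    /* __builtin_ctzl is the unsigned long variant of the GCC builtin; its
     * result is undefined for 0, so the empty case is handled separately. */
    index = bmap ? __builtin_ctzl(bmap) : -1;

    /* %lx expects an unsigned long argument, which matches bmap's type. */
    snprintf(out, out_len, "%d %lx\n", index, bmap);
    return index;
}
```

An index of -1 (or any index that does not correspond to a set bit of the printed bitmap) in such a trace would point at the bitmap handling rather than at the object being scanned.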
Comment 11 Rodrigo Kumpera 2013-09-20 11:02:46 UTC
It might really be the case that ctz generates bad code on 64 bits.
Comment 12 Andres G. Aragoneses 2013-09-20 12:20:17 UTC
Created attachment 4943 [details]
output when running with some g_print patch from zoltan

(In reply to comment #10)
> If this change really fixes the crash, the problem might be easier to track
> down than the usual GC bugs.

Great!


> Could you add an ... and attach the output?

Sure, here is the output. (The last line prints a negative index, which smells like the culprit.)


> Also, what architecture is this and what gcc version ?

Linux PC 64bits (ubuntu 13.04), gcc (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3


> Does compiling the runtime with -O0 fixes this issue  ?

Configured via CFLAGS=-O0 ./autogen.sh --prefix=/opt/mono, and that seems to fix it as well.
Comment 13 Andres G. Aragoneses 2013-09-20 12:24:35 UTC
BTW I'm renaming the summary given that we have more or less found the culprit.

I guess these 2 commits are also closely related to this:

https://github.com/knocte/mono/commit/18fe1394470136e7f6ac7ed0728e8c2976221657

https://github.com/knocte/mono/commit/a698aba8f8ec599b848b4695c06c6f2ec7eb301b
Comment 14 Mark Probst 2013-09-20 12:51:42 UTC
We don't handle the case where _bmap is 0.  GCC's documentation says about __builtin_ctz: If x is 0, the result is undefined.  It seems the unrolled first iteration of the loop must be removed.  Could you try that and see whether it works?
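
A minimal sketch of the guarded form, assuming a 64-bit bitmap: the builtin is only called inside a loop that first checks the bitmap is non-zero, so there is no unrolled first iteration that could see 0. The name collect_set_bits is hypothetical and this is not the code from the commits linked in this thread. Per GCC's documentation, __builtin_ctz takes an unsigned int, while __builtin_ctzl and __builtin_ctzll take unsigned long and unsigned long long, so the sketch uses the variant that matches its argument type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores the positions of the set bits of `bmap` in
 * `out` (at most `max` of them) and returns how many were stored. */
size_t collect_set_bits(unsigned long long bmap, int *out, size_t max)
{
    size_t n = 0;
    int base = 0;   /* bit position of the current lowest bit of bmap */

    assert(out != NULL);

    /* Guard first: __builtin_ctzll (0) has an undefined result. */
    while (bmap && n < max) {
        int index = __builtin_ctzll(bmap);

        out[n++] = base + index;

        /* Shift in two steps: a single shift by index + 1 would be a
         * shift by 64 when index is 63, which is undefined in C. */
        bmap >>= index;
        bmap >>= 1;
        base += index + 1;
    }
    return n;
}
```

With an unsigned int builtin, a 64-bit bitmap whose low 32 bits are all clear would be truncated to 0 before the call, which is the same undefined case; whether that played a part in this crash is not stated in the thread.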
Comment 15 Zoltan Varga 2013-09-20 13:07:50 UTC
Could you try this:
https://github.com/mono/mono/commit/d2cc22580898df5d4a15e0f99ab513e1570a6082
Comment 16 Andres G. Aragoneses 2013-09-20 16:19:16 UTC
(In reply to comment #15)
> Could you try this: ...

Thanks Zoltan, that worked! So I'm closing this as FIXED.

BTW, I also just proposed a pull request to improve things slightly (avoiding some redundancy): https://github.com/mono/mono/pull/765