Bug 14834 - SIGSEGV in mono-sgen caused by __GNUC__ optimization of OBJ_BITMAP_FOREACH_PTR (__builtin_ctz)
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: GC
Version: unspecified
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Bugzilla
Reported: 2013-09-18 13:16 UTC by Andres G. Aragoneses
Modified: 2013-09-20 16:19 UTC


Attachments
output when running with some g_print patch from zoltan (173.76 KB, text/plain)
2013-09-20 12:20 UTC, Andres G. Aragoneses

Description Andres G. Aragoneses 2013-09-18 13:16:05 UTC
(Even though this bug could be a duplicate of bug 11322 because of the similar stacktrace, it happened to me under much different circumstances -i.e. running a different program-, so I'm filing a new one now.)

If you run banshee (tag 2.6.1 for instance) with today's Mono (cloned from 8288f81dfd645db16f9e89cf1b67e609879c60f9), you get this unmanaged crash:

Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) object.__icall_wrapper_mono_gc_alloc_vector (intptr,intptr,intptr) <IL 0x0000f, 0xffffffff>
  at (wrapper alloc) object.AllocVector (intptr,intptr) <IL 0x0007a, 0xffffffff>
  at System.Text.Encoding.GetBytes (string) [0x00036] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System.Text/Encoding.cs:252
  at GLib.Marshaller.StringToPtrGStrdup (string) [0x00018] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/Marshaller.cs:158
  at GLib.SignalClosure.Connect (bool) [0x00007] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/SignalClosure.cs:99
  at GLib.Signal.AddDelegate (System.Delegate) [0x00185] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/Signal.cs:248
  at GLib.Object.AddSignalHandler (string,System.Delegate,System.Type) [0x00048] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/glib/Object.cs:775
  at Gtk.Widget.add_ButtonReleaseEvent (Gtk.ButtonReleaseEventHandler) [0x00012] in /home/knocte/Documents/Code/merging/mono323/gtk-sharp/gtk/generated/Widget.cs:1032
  at Hyena.Widgets.GrabHandle..ctor (int,int) [0x00057] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Hyena/Hyena.Gui/Hyena.Widgets/GrabHandle.cs:50
  at Hyena.Widgets.GrabHandle..ctor () [0x00000] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Hyena/Hyena.Gui/Hyena.Widgets/GrabHandle.cs:38
  at Banshee.Gui.Widgets.ConnectedSeekSlider.BuildSeekSlider (Banshee.Gui.Widgets.SeekSliderLayout) [0x000b6] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui.Widgets/ConnectedSeekSlider.cs:121
  at Banshee.Gui.Widgets.ConnectedSeekSlider..ctor (Banshee.Gui.Widgets.SeekSliderLayout) [0x00034] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui.Widgets/ConnectedSeekSlider.cs:59
  at Banshee.NotificationArea.TrackInfoPopup..ctor () [0x00051] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/TrackInfoPopup.cs:58
  at Banshee.NotificationArea.GtkNotificationAreaBox..ctor (Banshee.Gui.BaseClientWindow) [0x00073] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/GtkNotificationAreaBox.cs:63
  at Banshee.NotificationArea.NotificationAreaService.BuildNotificationArea () [0x0002a] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/NotificationAreaService.cs:237
  at Banshee.NotificationArea.NotificationAreaService.ServiceStartup () [0x00037] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/NotificationAreaService.cs:112
  at Banshee.NotificationArea.NotificationAreaService.Banshee.ServiceStack.IExtensionService.Initialize () [0x00018] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Extensions/Banshee.NotificationArea/Banshee.NotificationArea/NotificationAreaService.cs:87
  at Banshee.ServiceStack.ServiceManager.StartExtension (Mono.Addins.TypeExtensionNode) [0x0003c] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.Services/Banshee.ServiceStack/ServiceManager.cs:210
  at Banshee.ServiceStack.ServiceManager.Run () [0x00093] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.Services/Banshee.ServiceStack/ServiceManager.cs:144
  at Banshee.ServiceStack.Application.Run () [0x0002c] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.Services/Banshee.ServiceStack/Application.cs:106
  at Banshee.Gui.GtkBaseClient.Initialize (bool) [0x00142] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:199
  at Banshee.Gui.GtkBaseClient..ctor (bool,string) [0x00017] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:92
  at Banshee.Gui.GtkBaseClient..ctor () [0x00002] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:84
  at Nereid.Client..ctor () <IL 0x00000, 0x0000f>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <IL 0x0004e, 0xffffffff>
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Reflection.MonoCMethod.InternalInvoke (System.Reflection.MonoCMethod,object,object[],System.Exception&) <IL 0x0001c, 0xffffffff>
  at System.Reflection.MonoCMethod.InternalInvoke (object,object[]) [0x00002] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System.Reflection/MonoMethod.cs:537
  at System.Activator.CreateInstance (System.Type,bool) [0x000af] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System/Activator.cs:329
  at System.Activator.CreateInstance (System.Type) [0x00000] in /home/knocte/Documents/Code/mono/mcs/class/corlib/System/Activator.cs:222
  at Banshee.Gui.GtkBaseClient.Startup () [0x00001] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:79
  at Hyena.Gui.CleanRoomStartup.Startup (Hyena.Gui.CleanRoomStartup/StartupInvocationHandler) [0x00050] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Hyena/Hyena.Gui/Hyena.Gui/CleanRoomStartup.cs:54
  at Banshee.Gui.GtkBaseClient.Startup<T> () [0x00049] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:74
  at Banshee.Gui.GtkBaseClient.Startup<T> (string[]) [0x00021] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Core/Banshee.ThickClient/Banshee.Gui/GtkBaseClient.cs:64
  at Nereid.Client.Main (string[]) [0x00002] in /home/knocte/Documents/Code/merging/mono323/bansheeMASTER/src/Clients/Nereid/Nereid/Client.cs:54
  at (wrapper runtime-invoke) <Module>.runtime_invoke_void_object (object,intptr,intptr,intptr) <IL 0x00050, 0xffffffff>

Native stacktrace:

	/opt/mono/bin/mono() [0x4ab821]
	/opt/mono/bin/mono() [0x5029ff]
	/opt/mono/bin/mono() [0x420837]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0xfbd0) [0x2aaaab3f0bd0]
	/opt/mono/bin/mono() [0x5efa8e]
	/opt/mono/bin/mono() [0x5cbd87]
	/opt/mono/bin/mono() [0x5d16e6]
	/opt/mono/bin/mono() [0x5d1ca8]
	/opt/mono/bin/mono() [0x5e801a]
	/opt/mono/bin/mono() [0x5e822b]
	[0x40431830]

Debug info from gdb:

Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No threads.

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

/bin/bash: line 1: 29836 Aborted                 (core dumped) /opt/mono/bin/mono --debug Nereid.exe mono --debug --uninstalled
make: *** [run] Error 134
Comment 1 Andres G. Aragoneses 2013-09-18 17:31:14 UTC
Running inside GDB:

[New Thread 0x7fffbf978700 (LWP 17333)]

Program received signal SIGSEGV, Segmentation fault.
0x00000000005efa8e in simple_nursery_serial_scan_object (start=0x7fffd50a4400 "\310M\263\001", queue=0x96bf60 <gray_queue>) at sgen-scan-object.h:71
71			SCAN;
Comment 2 Andres G. Aragoneses 2013-09-18 17:37:09 UTC
(gdb) thread apply all bt

Thread 10 (Thread 0x7fffbf978700 (LWP 17333)):
#0  0x00007ffff74ba071 in sem_timedwait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000000623c4b in mono_sem_timedwait (sem=sem@entry=0x96b268 <async_tp+40>, timeout_ms=timeout_ms@entry=2000, alertable=alertable@entry=1)
    at mono-semaphore.c:79
#2  0x0000000000588402 in async_invoke_thread (data=0x0, data@entry=0x96b240 <async_tp>) at threadpool.c:1565
#3  0x0000000000583ac2 in start_wrapper_internal (data=0x7fffcc0025f0) at threads.c:608
#4  start_wrapper (data=0x7fffcc0025f0) at threads.c:653
#5  0x0000000000618501 in thread_start_routine (args=args@entry=0x9c4ce8) at wthreads.c:294
#6  0x0000000000628450 in inner_start_thread (arg=0x7fffcc002660) at mono-threads-posix.c:49
#7  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 9 (Thread 0x7fffd540f700 (LWP 17331)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffd540dd40, info=0x7fffc00008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffd540dd40) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b7ca2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x0000000000601a4b in _wapi_handle_timedwait_signal_handle (handle=0x400, timeout=timeout@entry=0x0, alertable=alertable@entry=1, poll=poll@entry=0)
    at handles.c:1588
#6  0x0000000000601ab5 in _wapi_handle_wait_signal (poll=poll@entry=0) at handles.c:1521
#7  0x0000000000616039 in WaitForMultipleObjectsEx (numobjects=numobjects@entry=2, handles=handles@entry=0x7fffd540e6e0, waitall=waitall@entry=0, 
    timeout=timeout@entry=4294967295, alertable=alertable@entry=1) at wait.c:668
#8  0x0000000000581c0d in mono_wait_uninterrupted (thread=thread@entry=0x7ffff69fb870, multiple=multiple@entry=1, numhandles=numhandles@entry=2, 
    handles=handles@entry=0x7fffd540e6e0, waitall=waitall@entry=0, ms=ms@entry=-1, alertable=1) at threads.c:1488
#9  0x00000000005834e0 in ves_icall_System_Threading_WaitHandle_WaitAny_internal (mono_handles=0x7ffff657c728, ms=-1, exitContext=<optimized out>)
    at threads.c:1586
#10 0x00000000401911b0 in ?? ()
#11 0x00007fffc0002540 in ?? ()
#12 0x00007fffd540ed18 in ?? ()
#13 0x00007ffff66b1d78 in ?? ()
#14 0xffffffffffffffff in ?? ()
#15 0x00007fffc0002630 in ?? ()
#16 0x00007fffd540e9c0 in ?? ()
#17 0x00007fffd540e920 in ?? ()
#18 0x00007fffd540ed18 in ?? ()
#19 0x00007ffff66b1d78 in ?? ()
#20 0xffffffffffffffff in ?? ()
#21 0x00007ffff657c728 in ?? ()
#22 0x00007ffff45afc1e in System.Threading.WaitHandle:WaitAny (waitHandles=0x0, timeout=-10000, exitContext=false) at <unknown>:228
#23 0x00007ffff45a98a1 in System.Threading.RegisteredWaitHandle:Wait (this=..., state=<optimized out>) at <unknown>:75
#24 0x0000000040013ba1 in ?? ()
#25 0x0000000000000005 in ?? ()
#26 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7fffd5450700 (LWP 17330)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffd544f740, info=0x7fffcc0008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffd544f740) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74bb43d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00000000006177e7 in SleepEx (ms=ms@entry=500, alertable=alertable@entry=1) at wthreads.c:842
#6  0x00000000005857f3 in monitor_thread (unused=unused@entry=0x0) at threadpool.c:779
#7  0x0000000000583ac2 in start_wrapper_internal (data=0x7fffd0003080) at threads.c:608
#8  start_wrapper (data=0x7fffd0003080) at threads.c:653
#9  0x0000000000618501 in thread_start_routine (args=args@entry=0x9c0698) at wthreads.c:294
#10 0x0000000000628450 in inner_start_thread (arg=0x7fffd0003110) at mono-threads-posix.c:49
#11 0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 7 (Thread 0x7fffd5e75700 (LWP 17329)):
#0  0x00007ffff74b805e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fffee747935 in g_cond_wait_until () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fffee6ddb81 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007fffee6de1ca in g_async_queue_timeout_pop () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007fffee72c6b2 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007fffee72beb5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 6 (Thread 0x7fffd66aa700 (LWP 17328)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffd66a9140, info=0x7fffd00008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffd66a9140) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b805e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x000000000060182d in _wapi_handle_timedwait_signal_handle (handle=handle@entry=0x442, timeout=timeout@entry=0x7fffd66a97d0, 
    alertable=alertable@entry=1, poll=poll@entry=0) at handles.c:1586
#6  0x0000000000615446 in WaitForSingleObjectEx (handle=0x442, timeout=timeout@entry=99, alertable=alertable@entry=1) at wait.c:198
#7  0x0000000000581c6f in mono_wait_uninterrupted (thread=thread@entry=0x7ffff69fbb30, multiple=multiple@entry=0, numhandles=numhandles@entry=1, 
    handles=handles@entry=0x7fffd66a98c8, waitall=waitall@entry=0, ms=ms@entry=99, alertable=1) at threads.c:1490
#8  0x00000000005833f9 in ves_icall_System_Threading_WaitHandle_WaitOne_internal (this=<optimized out>, handle=0x442, ms=99, exitContext=<optimized out>)
    at threads.c:1622
#9  0x00000000400b0518 in ?? ()
#10 0x00007fffd0002540 in ?? ()
#11 0x0000000000000038 in ?? ()
#12 0x00007fffd66a9987 in ?? ()
#13 0x00007ffff673a018 in ?? ()
#14 0x0000000000000038 in ?? ()
#15 0x00007fffd66a9990 in ?? ()
#16 0x00007fffd66a98f0 in ?? ()
#17 0x00000000013561e8 in ?? ()
#18 0x0000000000000063 in ?? ()
#19 0x00007ffff673a3b0 in ?? ()
#20 0x0000000000000063 in ?? ()
#21 0x00007ffff45b0081 in System.Threading.WaitHandle:WaitOne (this=..., millisecondsTimeout=0, exitContext=false) at <unknown>:382
#22 0x00007ffff45b010c in System.Threading.WaitHandle:WaitOne (this=..., millisecondsTimeout=99) from /opt/mono/lib/mono/4.5/mscorlib.dll.so
#23 0x00007ffff45af405 in System.Threading.Timer/Scheduler:SchedulerThread (this=...) at <unknown>:385
#24 0x00007ffff45abe49 in System.Threading.Thread:StartInternal (this=...) at <unknown>:682
#25 0x000000004001640f in ?? ()
#26 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7fffe453a700 (LWP 17327)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7fffe4539240, info=0x7fffdc0008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7fffe4539240) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b7ca2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x0000000000601a4b in _wapi_handle_timedwait_signal_handle (handle=handle@entry=0x40c, timeout=timeout@entry=0x0, alertable=alertable@entry=1, 
    poll=poll@entry=0) at handles.c:1588
#6  0x0000000000601a7b in _wapi_handle_wait_signal_handle (handle=handle@entry=0x40c, alertable=alertable@entry=1) at handles.c:1533
#7  0x000000000061555d in WaitForSingleObjectEx (handle=0x40c, timeout=timeout@entry=4294967295, alertable=alertable@entry=1) at wait.c:196
#8  0x0000000000581c6f in mono_wait_uninterrupted (thread=thread@entry=0x7ffff69fbc90, multiple=multiple@entry=0, numhandles=numhandles@entry=1, 
    handles=handles@entry=0x7fffe4539998, waitall=waitall@entry=0, ms=ms@entry=-1, alertable=1) at threads.c:1490
#9  0x00000000005833f9 in ves_icall_System_Threading_WaitHandle_WaitOne_internal (this=<optimized out>, handle=0x40c, ms=-1, exitContext=<optimized out>)
    at threads.c:1622
#10 0x00000000400b0518 in ?? ()
#11 0x00007fffdc002540 in ?? ()
#12 0x00007fffe4539a30 in ?? ()
#13 0x00007fffe4539a4f in ?? ()
#14 0x00007ffff6665ac8 in ?? ()
#15 0x00007fffdc0025d0 in ?? ()
#16 0x00007fffe4539a50 in ?? ()
#17 0x00007fffe45399c0 in ?? ()
#18 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7fffe520a700 (LWP 17326)):
#0  0x00007ffff71d13cd in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fffee7081dc in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fffee7086ba in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007fffeecda4f6 in ?? () from /usr/lib/x86_64-linux-gnu/libgio-2.0.so.0
#4  0x00007fffee72beb5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7fffe5a0b700 (LWP 17325)):
#0  0x00007ffff71d13cd in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fffee7081dc in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007fffee708304 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007fffe62aca1d in ?? () from /usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so
#4  0x00007fffee72beb5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2 (Thread 0x7ffff4314700 (LWP 17324)):
#0  0x00007ffff711b3c3 in sigsuspend () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005c76a4 in suspend_thread (context=0x7ffff4313780, info=0x7ffff00008c0) at sgen-os-posix.c:111
#2  suspend_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7ffff4313780) at sgen-os-posix.c:130
#3  <signal handler called>
#4  0x00007ffff74b9f7e in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x0000000000623b68 in mono_sem_wait (sem=sem@entry=0x96b600 <finalizer_sem>, alertable=alertable@entry=1) at mono-semaphore.c:116
#6  0x000000000059fef5 in finalizer_thread (unused=unused@entry=0x0) at gc.c:1073
#7  0x0000000000583ac2 in start_wrapper_internal (data=0xa1cd70) at threads.c:608
#8  start_wrapper (data=0xa1cd70) at threads.c:653
#9  0x0000000000618501 in thread_start_routine (args=args@entry=0x9bd560) at wthreads.c:294
#10 0x0000000000628450 in inner_start_thread (arg=0xa1ca50) at mono-threads-posix.c:49
#11 0x00007ffff74b3f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007ffff71dde1d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 1 (Thread 0x7ffff7fd97c0 (LWP 17320)):
#0  0x00000000005efa8e in simple_nursery_serial_scan_object (start=0x7fffd50a4400 "\310M\263\001", queue=0x96bf60 <gray_queue>) at sgen-scan-object.h:71
#1  0x00000000005cbd87 in sgen_drain_gray_stack (max_objs=max_objs@entry=-1, ctx=...) at sgen-gc.c:1192
#2  0x00000000005d16e6 in collect_nursery (finish_up_concurrent_mark=0, unpin_queue=0x0) at sgen-gc.c:2611
#3  collect_nursery (unpin_queue=0x0, finish_up_concurrent_mark=0) at sgen-gc.c:2483
#4  0x00000000005d1ca8 in sgen_perform_collection (requested_size=2944, generation_to_collect=0, reason=<optimized out>, wait_to_finish=<optimized out>)
    at sgen-gc.c:3445
#5  0x00000000005e7f54 in mono_gc_alloc_obj_nolock (vtable=vtable@entry=0xa079c8, size=size@entry=2944) at sgen-alloc.c:264
#6  0x00000000005e811b in mono_gc_alloc_string (vtable=0xa079c8, size=2944, len=1458) at sgen-alloc.c:563
#7  0x0000000040015b40 in ?? ()
#8  0x00007fffffffd8a0 in ?? ()
#9  0x00000000009bcf90 in ?? ()
#10 0x0000000000000516 in ?? ()
#11 0x447974403058bea8 in ?? ()
#12 0x00007ffff6800008 in ?? ()
#13 0x0000000000000b80 in ?? ()
#14 0x00007fffffffc8a0 in ?? ()
#15 0x00007ffff67ff488 in ?? ()
#16 0x00007ffff7fd9798 in ?? ()
#17 0x0000000000a079c8 in ?? ()
#18 0x00000000000005b2 in ?? ()
#19 0x0000000040014fa4 in ?? ()
#20 0x00007ffff69d8100 in ?? ()
#21 0x00007ffff67ff458 in ?? ()
#22 0x0000000000000000 in ?? ()
Comment 3 Andres G. Aragoneses 2013-09-19 20:01:16 UTC
> Mark Probst <mark@xamarin.com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |mark@xamarin...

Hey Mark! Can you reproduce this?

If not, I need some help. From the unmanaged stacktrace I guess the problem is in sgen-scan-object.h, but line 71 of that file is just a call to "SCAN", which two lines earlier seems to be defined as OBJ_BITMAP_FOREACH_PTR (desc, start).

As the last frame of the trace is reported to be inside the function simple_nursery_serial_scan_object, I was expecting the macro OBJ_BITMAP_FOREACH_PTR to call this function, but it does not. This macro is defined in sgen-descriptor.h (line 174 or 195 depending on whether __GNUC__ is defined, which I guess it is in my case because I'm on 64-bit Linux and not using clang or anything...).

I've grepped the name simple_nursery_serial_scan_object and it only comes up in sgen-minor-scan-object.h as the meaning of the SERIAL_SCAN_OBJECT macro, but HANDLE_PTR doesn't call this macro in this file, so I'm a bit lost.

Any hint you could give me? Thanks
Comment 4 Andres G. Aragoneses 2013-09-19 20:04:03 UTC
(Oh, forgot to say, OBJ_BITMAP_FOREACH_PTR uses HANDLE_PTR and OBJECT_HEADER_WORDS macros.) It seems that having so many macros around makes it really hard to track what happened from a stacktrace :(
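
Editorial sketch: the confusion above is typical of code built from nested macros. The C preprocessor expands everything at the point of use, so the debugger attributes the whole expansion to the single line where the outermost macro appears, inside whatever function contains that line. The names below (FOREACH_SET_SLOT, HANDLE_SLOT, count_non_null_slots) are hypothetical stand-ins for illustration and are not the Mono definitions.

```c
#include <stddef.h>

/* Hypothetical loop macro: walks a bitmap and invokes HANDLE_SLOT on the
 * slot that corresponds to each set bit. HANDLE_SLOT is not defined here;
 * whoever uses the loop macro must define it first. */
#define FOREACH_SET_SLOT(bmap, base)                 \
    do {                                             \
        unsigned long _b = (bmap);                   \
        void **_p = (base);                          \
        for (; _b != 0; _b >>= 1, _p++) {            \
            if (_b & 1)                              \
                HANDLE_SLOT(_p);                     \
        }                                            \
    } while (0)

/* The user of the loop macro supplies the per-slot action. */
#define HANDLE_SLOT(slot) \
    do { if (*(slot) != NULL) (*count)++; } while (0)

/* Counts the non-NULL slots selected by bmap. */
static void count_non_null_slots(unsigned long bmap, void **base, int *count)
{
    /* The loop and the per-slot action both expand right here, so a crash
     * anywhere inside them is reported at this one line of this function. */
    FOREACH_SET_SLOT(bmap, base);
}
```

A fault in the per-slot action would therefore show up as a crash in count_non_null_slots at the FOREACH_SET_SLOT line, even though neither macro calls that function.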
Comment 5 Zoltan Varga 2013-09-19 20:55:12 UTC
Most sgen bugs show up as crashes in sgen_scan_object (); the problem is probably not in that function.
Comment 6 Andres G. Aragoneses 2013-09-20 07:27:57 UTC
(In reply to comment #5)
> Most sgen bugs show up as crashes in sgen_scan_object () , the problem is
> probably not in that function.

OK, I've filled those macros with good old printfs, and I think I've located the place where it crashes. It is this line:

https://github.com/mono/mono/blob/master/mono/metadata/sgen-minor-scan-object.h#L94

Does it make sense? How can I further debug it?
Comment 7 Andres G. Aragoneses 2013-09-20 08:34:34 UTC
(In reply to comment #6)
> Ok, I've filled those macros with good old printfs, and I think I've located
> the place where it crashes, it is this line:
> https://github.com/mono/mono/blob/master/mono/metadata/sgen-minor-scan-object.h#L94
> Does it make sense? How can I further debug it?

I have added a check for NULL on ptr before doing that assignment, and it is not NULL. I have also tried adding an explicit (void*) cast, but it doesn't help.

So I have no idea what else could be causing a SIGSEGV on that line. Maybe ptr is pointing to garbage (already deallocated memory)? How could I check this?

Any pointers? Thank you.
Comment 8 Zoltan Varga 2013-09-20 08:41:53 UTC
As I said, the problem is not in that function but somewhere else. GC bugs are pretty hard to track down and fix.
Comment 9 Andres G. Aragoneses 2013-09-20 09:53:51 UTC
(In reply to comment #8)
> As I said, the problem is not in that function but somewhere else. GC bugs are
> pretty hard to track down and fix.

Gotcha, that's why I'm asking for help to track it down :)

Anyway, I think I made some progress, because I managed to find a workaround that works. I call it a "workaround" instead of a "fix" because there could possibly be a better fix than this: what I'm basically doing is removing a piece of code which seems to be a __GNUC__ optimization. But if nobody manages to fix an optimization, it's better to have slower code than buggy code, right? If you agree, then just merge my pull request :)

https://github.com/mono/mono/pull/764

(For the record, this is the printf-diff that I used to debug this: https://gist.github.com/knocte/6637263 )
Comment 10 Zoltan Varga 2013-09-20 10:32:23 UTC
If this change really fixes the crash, the problem might be easier to track down than the usual GC bugs.
Could you add a
print ("%d %lx\n", _index, (long int)_bmap); \
after both of the
			int _index = __builtin_ctz (_bmap); \
lines, and attach the output?

Also, what architecture is this and what gcc version? Does compiling the runtime with -O0 fix this issue?
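
Editorial sketch: a trace in the "%d %lx" layout requested above could be produced with a small helper like the one below. The helper name format_bitmap_trace is hypothetical and not part of Mono. It assumes the bitmap fits in an unsigned long and therefore uses the GCC builtin __builtin_ctzl. It also reports a zero bitmap as index -1 instead of calling the builtin, since GCC documents the result as undefined when the argument is 0.

```c
#include <stdio.h>

/* Hypothetical tracing helper, not part of Mono. Writes one trace line,
 * "<index> <bitmap in hex>\n", into buf and returns what snprintf returns.
 * A zero bitmap is logged with index -1 and the builtin is not called,
 * because its result is undefined for a zero argument. */
static int format_bitmap_trace(char *buf, size_t len, unsigned long bmap)
{
    int index = (bmap != 0) ? __builtin_ctzl(bmap) : -1;
    return snprintf(buf, len, "%d %lx\n", index, bmap);
}
```

With this layout, a line starting with -1 marks the iterations where the bitmap was already empty, which makes them easy to spot in a long log.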
Comment 11 Rodrigo Kumpera 2013-09-20 11:02:46 UTC
It might really be the case that __builtin_ctz generates bad code on 64-bit.
Comment 12 Andres G. Aragoneses 2013-09-20 12:20:17 UTC
Created attachment 4943 [details]
output when running with some g_print patch from zoltan

(In reply to comment #10)
> If this change really fixes the crash, the problem might be easier to track
> down than the usual GC bugs.

Great!


> Could you add an ... and attach the output?

Sure, here you have the output. (The last lines print a negative index, which smells like the culprit.)


> Also, what architecture is this and what gcc version ?

64-bit Linux PC (Ubuntu 13.04), gcc (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3


> Does compiling the runtime with -O0 fix this issue?

I configured via CFLAGS=-O0 ./autogen.sh --prefix=/opt/mono and that seems to fix it as well.
Comment 13 Andres G. Aragoneses 2013-09-20 12:24:35 UTC
BTW I'm renaming the summary given that we have more or less found the culprit.

I guess these 2 commits are also closely related to this:

https://github.com/knocte/mono/commit/18fe1394470136e7f6ac7ed0728e8c2976221657

https://github.com/knocte/mono/commit/a698aba8f8ec599b848b4695c06c6f2ec7eb301b
Comment 14 Mark Probst 2013-09-20 12:51:42 UTC
We don't handle the case where _bmap is 0.  GCC's documentation says about __builtin_ctz: If x is 0, the result is undefined.  It seems the unrolled first iteration of the loop must be removed.  Could you try that and see whether it works?
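
Editorial sketch: the hazard described in this comment can be avoided by testing the bitmap before every call to the builtin, including the first one. The function below is a minimal illustration under that assumption and is not the actual Mono fix. The name visit_set_bits is hypothetical, and __builtin_ctzl is used on the assumption that the bitmap is an unsigned long.

```c
#include <stddef.h>

/* Hypothetical helper, not the Mono macro: stores the index of every set
 * bit of bmap into out (at most max entries) and returns how many were
 * stored. The builtin is only reached when bmap is nonzero, so its
 * undefined zero-argument case can never occur. */
static int visit_set_bits(unsigned long bmap, int *out, int max)
{
    int count = 0;
    while (bmap != 0 && count < max) {
        int index = __builtin_ctzl(bmap);  /* safe: bmap is nonzero here */
        out[count++] = index;
        bmap &= bmap - 1;                  /* clear the lowest set bit */
    }
    return count;
}
```

For example, a bitmap of 0x29 (bits 0, 3 and 5 set) yields the indices 0, 3, 5, and a bitmap of 0 yields no indices at all because the loop body is never entered.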
Comment 15 Zoltan Varga 2013-09-20 13:07:50 UTC
Could you try this:
https://github.com/mono/mono/commit/d2cc22580898df5d4a15e0f99ab513e1570a6082
Comment 16 Andres G. Aragoneses 2013-09-20 16:19:16 UTC
(In reply to comment #15)
> Could you try this: ...

Thanks Zoltan, that worked! So I'm closing this as FIXED.

BTW, I also just proposed a pull request to improve things slightly (avoiding some redundancy): https://github.com/mono/mono/pull/765
