Bug 39861 - Segmentation fault when using multiple threads and doing some work.
Status: NEW
Alias: None
Product: Runtime
Classification: Mono
Component: General
Version: 4.2.0 (C6)
Hardware: Other Linux
: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2016-03-23 09:04 UTC by Vercruyssen Bjorn
Modified: 2016-04-12 13:47 UTC

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
Sample code (22.50 KB, application/x-zip-compressed)
2016-03-23 09:04 UTC, Vercruyssen Bjorn
Core dump (2.46 MB, application/x-zip-compressed)
2016-03-23 09:06 UTC, Vercruyssen Bjorn

Description Vercruyssen Bjorn 2016-03-23 09:04:58 UTC
Created attachment 15491
Sample code

Context 
We have an application running on Mono on an ARM-based device. On rare occasions the application crashes with an illegal instruction or a segmentation fault (depending on the Mono version used).

What I have already done: I created a very simple application that shows the same behaviour, to rule out our own application. It is a contrived sample, but it triggers the fault the fastest, usually within 1 to 3 minutes.

    using System;
    using System.Configuration;
    using System.Threading;
    using System.Threading.Tasks;

    class Program
    {
        static void Main(string[] args)
        {
            // Two thread-pool tasks call Print() concurrently in a tight loop.
            Task.Run(() =>
            {
                var counter = 0;
                while (true)
                {
                    Print();
                    if (counter++ % 1000 == 0)
                    {
                        Console.WriteLine("Ping from thread: {0}", Thread.CurrentThread.ManagedThreadId);
                        counter = 1;
                    }
                }
            });

            Task.Run(() =>
            {
                var counter = 0;
                while (true)
                {
                    Print();
                    if (counter++ % 1000 == 0)
                    {
                        Console.WriteLine("Ping from thread: {0}", Thread.CurrentThread.ManagedThreadId);
                        counter = 1;
                    }
                }
            });

            // Keep the process alive until Enter is pressed.
            Console.ReadLine();
        }

        private static void Print()
        {
            // Console.WriteLine("msg from: {0}", Thread.CurrentThread.ManagedThreadId);
            // Re-opens the exe configuration and reads the appSettings section on every call.
            var result = (AppSettingsSection)ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.None).GetSection("appSettings");
        }
    }


This results in:

- an illegal instruction on mono version 4.0.3.13
- a segmentation fault on mono version 4.2.1.60
- a segmentation fault or hang on mono version 4.2.3.4/832de4b

Running in GDB gives the following information:

mono 4.0.3.13
Program received signal SIGILL, Illegal instruction.
[Switching to Thread 0x752ff430 (LWP 15107)]
0x0016a712 in verify_scan_starts (start=0x1dd19e "object", end=0x1 ) at sgen-gc.c:2116
2116 SGEN_LOG (1, "NFC-BAD SCAN START [%zu] %p for obj [%p %p]", i, addr, start, end);

or

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 0x754ff430 (LWP 4191)]
0x0016a712 in verify_scan_starts (start=0xffffffff , end=0x1 ) at sgen-gc.c:2116
2116 SGEN_LOG (1, "NFC-BAD SCAN START [%zu] %p for obj [%p %p]", i, addr, start, end);

mono 4.2.1
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x746ff430 (LWP 4493)]
0x00168008 in restart_handler (_dummy=, _info=, context=) at sgen-os-posix.c:167
167 errno = old_errno;
(gdb) bt
#0 0x00168008 in restart_handler (_dummy=, _info=, context=) at sgen-os-posix.c:167
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) mono_trace
Undefined command: "mono_trace". Try "help".
(gdb) mono_backtrace
Missing argument 0 in user function.
(gdb) mono_stack

"Threadpool worker" tid=0x0x746ff430 this=0x0x75670300 thread handle 0x40d state : not waiting owns ()
  at <0xffffffff>
  at (wrapper managed-to-native) System.Type.internal_from_name (string,bool,bool)
  at System.Type.GetType (string) [0x00011] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/corlib/ReferenceSources/Type.cs:49
  at System.Configuration.InternalConfigurationHost.GetConfigType (string,bool) [0x00000] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/System.Configuration/System.Configuration/InternalConfigurationHost.cs:71
  at System.Configuration.ConfigInfo.CreateInstance () [0x00011] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/System.Configuration/System.Configuration/ConfigInfo.cs:50
  at System.Configuration.SectionInfo.CreateInstance () [0x00000] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/System.Configuration/System.Configuration/SectionInfo.cs:63
  at System.Configuration.Configuration.GetSectionInstance (System.Configuration.SectionInfo,bool) [0x00022] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/System.Configuration/System.Configuration/Configuration.cs:281
  at System.Configuration.ConfigurationSectionCollection.get_Item (string) [0x00032] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/System.Configuration/System.Configuration/ConfigurationSectionCollection.cs:68
  at System.Configuration.Configuration.GetSection (string) [0x0001b] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/mcs/class/System.Configuration/System.Configuration/Configuration.cs:254
  at iltest.Program.Print () [0x00020] in e:\Libraries\Documents\Visual Studio 2012\Projects\iltest\Program.cs:60
  at iltest.Program.b__1 () [0x00004] in e:\Libraries\Documents\Visual Studio 2012\Projects\iltest\Program.cs:34
  at System.Threading.Tasks.Task`1<T_REF>.InnerInvoke () [0x00012] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/Tasks/Future.cs:686
  at System.Threading.Tasks.Task.Execute () [0x00016] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/Tasks/Task.cs:2523
  at System.Threading.Tasks.Task.ExecutionContextCallback (object) [0x00007] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/Tasks/Task.cs:2887
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [0x00081] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/executioncontext.cs:581
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [0x00000] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/executioncontext.cs:530
  at System.Threading.Tasks.Task.ExecuteWithThreadLocal (System.Threading.Tasks.Task&) [0x0004c] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/Tasks/Task.cs:2844
  at System.Threading.Tasks.Task.ExecuteEntry (bool) [0x0006f] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/Tasks/Task.cs:2781
  at System.Threading.Tasks.Task.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00000] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/Tasks/Task.cs:2728
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00096] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/threadpool.cs:859
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in /home/jdi/develop/yocto/build/tmp/work/x86_64-linux/mono-native/4.2.1.60-r0/mono-4.2.1/external/referencesource/mscorlib/system/threading/threadpool.cs:1196
  at (wrapper runtime-invoke) <Module>.runtime_invoke_bool (object,intptr,intptr,intptr) <IL 0x00060, 0xffffffff>

The sample code is attached.

Latest used version and settings
Mono JIT compiler version 4.2.3 (Stable 4.2.3.4/832de4b Mon Mar 21 09:03:22 CET 2016)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       normal
        Notifications: epoll
        Architecture:  armel,vfp
        Disabled:      none
        Misc:          smallconfig softdebug
        LLVM:          supported, not enabled.
        GC:            sgen
Comment 1 Vercruyssen Bjorn 2016-03-23 09:06:08 UTC
Created attachment 15492
Core dump

Core dump of sample code using 

Mono JIT compiler version 4.2.3 (Stable 4.2.3.4/832de4b Mon Mar 21 09:03:22 CET 2016)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       normal
        Notifications: epoll
        Architecture:  armel,vfp
        Disabled:      none
        Misc:          smallconfig softdebug
        LLVM:          supported, not enabled.
        GC:            sgen
Comment 2 Vercruyssen Bjorn 2016-03-24 10:42:27 UTC
We have just tested the same sample application using the Boehm garbage collector, and it has now been running for several hours without faults. So the problem seems to be related to the SGen garbage collector.
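
For anyone repeating this comparison, one possible way to switch collectors is mono's `--gc` option. This is only a sketch: the comment above does not say how the Boehm run was started, `run_with_gc` is a hypothetical helper, `iltest.exe` is an assumed name for the sample binary, and `--gc=boehm` only works if the Boehm flavour of the runtime was built and installed.

```shell
#!/bin/sh
# Sketch only: run_with_gc is a hypothetical helper, not part of mono.
# Usage: run_with_gc <boehm|sgen> <program.exe> [args...]
run_with_gc() {
    gc="$1"
    shift
    # --gc selects the garbage collector flavour of the mono runtime.
    mono --gc="$gc" "$@"
}
```

Running `run_with_gc boehm iltest.exe` and `run_with_gc sgen iltest.exe` side by side would show whether the crash depends on the collector.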
Comment 3 Andi McClure 2016-04-08 15:05:30 UTC
Hello, I tested on our hard-float-32-bit ARM machine with 4.2.3.4/832de4b+sgen and I was not able to reproduce the crash with your sample. Unfortunately there is little we can do on this if we cannot reproduce at our end.

Here are some things that might help, however:

- How much RAM is available on the machine?
- Are you building mono yourself? Are you invoking with any unusual options?
- Try running the test with the environment variable MONO_DEBUG=suspend-on-sigsegv . When the segfault happens, the program will freeze and print a message, and you will have the opportunity to attach gdb. Please attach gdb to a crashed mono, run
    t apply all bt
and give us back the results.
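
The steps above could be scripted roughly as follows. This is a sketch under assumptions: the function names are made up for illustration, `iltest.exe` is an assumed name for the sample binary, and the process ID has to be looked up by hand once mono prints its suspend message.

```shell
#!/bin/sh
# Sketch only: function names and iltest.exe are illustrative assumptions.

# Start the program so that mono freezes instead of exiting on SIGSEGV.
run_suspended() {
    MONO_DEBUG=suspend-on-sigsegv mono "$@"
}

# Attach gdb to the frozen process, print every thread's backtrace, then exit.
# "thread apply all bt" is the long form of "t apply all bt".
dump_all_threads() {
    gdb -p "$1" -batch -ex "thread apply all bt"
}
```

A typical session would be `run_suspended iltest.exe` in one terminal and, after the crash, `dump_all_threads <pid>` in a second one, with the output redirected to a file for attaching here.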
Comment 4 Vercruyssen Bjorn 2016-04-12 12:15:00 UTC
Hello,
Is the core dump I attached initially of no use?

This is the result with the suspend flag and t apply all bt:

(gdb) t apply all bt

Thread 6 (Thread 0x767ff430 (LWP 3092)):
#0  0x496ab6c4 in __pthread_cond_wait (cond=0x288cd0, mutex=0x288cb8) at pthread_cond_wait.c:186
#1  0x00192de0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 5 (Thread 0x7577b430 (LWP 3093)):
#0  0x496ad670 in futex_abstimed_wait (cancel=true, private=0, abstime=0x0, expected=1, futex=0x27de1c) at sem_waitcommon.c:42
#1  do_futex_wait (sem=sem@entry=0x27de1c, abstime=0x0) at sem_waitcommon.c:211
#2  0x496ad768 in __new_sem_wait_slow (sem=0x27de1c, abstime=0x0) at sem_waitcommon.c:392
#3  0x001b1fc8 in mono_sem_wait ()
Cannot access memory at address 0x0
#4  0x00148d94 in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (Thread 0x76c34430 (LWP 3094)):
#0  0x496ae378 in __lll_lock_wait (futex=futex@entry=0x2899d8, private=0) at lowlevellock.c:46
#1  0x496a8520 in __GI___pthread_mutex_lock (mutex=0x2899d8) at pthread_mutex_lock.c:80
Cannot access memory at address 0x0
#2  0x0016cb90 in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (Thread 0x754ff430 (LWP 3095)):
#0  0x00168c48 in ?? ()
#1  0x00168a22 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0x751ff430 (LWP 3096)):
#0  0x496ad670 in futex_abstimed_wait (cancel=true, private=0, abstime=0x0, expected=1, futex=0x27e184) at sem_waitcommon.c:42
#1  do_futex_wait (sem=sem@entry=0x27e184, abstime=0x0) at sem_waitcommon.c:211
#2  0x496ad768 in __new_sem_wait_slow (sem=0x27e184, abstime=0x0) at sem_waitcommon.c:392
#3  0x001b1fc8 in mono_sem_wait ()
#4  0x00168be8 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0x76fea000 (LWP 3091)):
#0  0x495aac74 in do_sigsuspend (set=0x27e104) at ../sysdeps/unix/sysv/linux/sigsuspend.c:29
#1  __GI___sigsuspend (set=0x27e104) at ../sysdeps/unix/sysv/linux/sigsuspend.c:42
#2  0x00168b48 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Our hardware:
- 512 MB of RAM, of which about 250 to 300 MB is available
- An imx6dl with a hardware floating-point unit

We are building mono ourselves using Yocto. I will check the options and provide them as well.

Thanks in advance.
Comment 5 Vercruyssen Bjorn 2016-04-12 13:47:25 UTC
Compile options:

arm-poky-linux-gnueabi-gcc  -march=armv7-a -mfloat-abi=softfp -mfpu=neon

mono/4.0.3.13-r0/mono-4.0.3/configure \
  --build=x86_64-linux \
  --host=arm-poky-linux-gnueabi \
  --target=arm-poky-linux-gnueabi \
  --prefix=/usr \
  --exec_prefix=/usr \
  --bindir=/usr/bin \
  --sbindir=/usr/sbin \
  --libexecdir=/usr/lib/mono \
  --datadir=/usr/share \
  --sysconfdir=/etc \
  --sharedstatedir=/com \
  --localstatedir=/var \
  --libdir=/usr/lib \
  --includedir=/usr/include \
  --oldincludedir=/usr/include \
  --infodir=/usr/share/info \
  --mandir=/usr/share/man \
  --disable-silent-rules \
  --disable-dependency-tracking \
  --with-libtool-sysroot=/home/jdi/develop/yocto/build/tmp/sysroots/emperor \
  mono_cv_uscore=no \
  --with-sigaltstack=no \
  --with-mcs-docs=no \
  --disable-mcs-build \
  mono_cv_clang=no \
  --without-x \
  --without-moonlight \
  --without-libgdiplus \
  --with-profile2=no \
  --with-profile4=no \
  --with-profile4_5=yes \
  --enable-small-config
