Bug 52604

Summary: Runtime hangs on exit on mac
Product: [Mono] Runtime Reporter: Marek Safar <masafa>
Component: GeneralAssignee: Ludovic Henry <ludovic>
Status: CONFIRMED ---    
Severity: normal CC: abock, alan, chris.hamons, kumpera, mono-bugs+mono, mono-bugs+runtime, rolf, rui.marinho
Priority: ---    
Version: master   
Target Milestone: ---   
Hardware: PC   
OS: Mac OS   
Tags: Is this bug a regression?: ---
Last known good build:

Description Marek Safar 2017-02-17 14:05:06 UTC
Doing XI make exec-mac-unified-dontlink in tests

hangs with follwing all the time

(lldb) bt all
* thread #1: tid = 0x145f454, 0x00007fff9838ff72 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_307', queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff9838ff72 libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x00007fff9838f3b3 libsystem_kernel.dylib`mach_msg + 55
    frame #2: 0x00007fff9838c0dc libsystem_kernel.dylib`thread_get_state + 146
    frame #3: 0x0000000101f8a516 dontlink-mac`mono_thread_state_init_from_handle(tctx=<unavailable>, info=<unavailable>) + 198 at mini-darwin.c:122 [opt]
    frame #4: 0x0000000102137f6f dontlink-mac`mono_threads_suspend_begin_async_suspend(info=<unavailable>, interrupt_kernel=<unavailable>) + 79 at mono-threads-mach.c:95 [opt]
    frame #5: 0x000000010213b6e4 dontlink-mac`mono_thread_info_safe_suspend_and_run [inlined] begin_async_suspend(info=0x00007fd63b88ba00, interrupt_kernel=1) + 356 at mono-threads.c:125 [opt]
    frame #6: 0x000000010213b6d9 dontlink-mac`mono_thread_info_safe_suspend_and_run [inlined] suspend_sync(interrupt_kernel=1) + 201 at mono-threads.c:900 [opt]
    frame #7: 0x000000010213b610 dontlink-mac`mono_thread_info_safe_suspend_and_run [inlined] suspend_sync_nolock at mono-threads.c:932 [opt]
    frame #8: 0x000000010213b610 dontlink-mac`mono_thread_info_safe_suspend_and_run(id=0x0000700000826000, interrupt_kernel=1, callback=<unavailable>, user_data=0x00007fff5dd78598) + 144 at mono-threads.c:975 [opt]
    frame #9: 0x00000001020b3640 dontlink-mac`async_suspend_internal(thread=0x0000000105338508, interrupt=1) + 80 at threads.c:4976 [opt]
    frame #10: 0x00000001020b3447 dontlink-mac`mono_thread_suspend_all_other_threads + 327 at threads.c:3459 [opt]
    frame #11: 0x0000000102009d80 dontlink-mac`ves_icall_System_Environment_Exit(result=0) + 32 at icall.c:6647 [opt]
    frame #12: 0x000000010b2e86c9
    frame #13: 0x000000010b2e83ef
    frame #14: 0x000000010b2e839a
    frame #15: 0x0000000101f9f5e8 dontlink-mac`mono_jit_runtime_invoke(method=<unavailable>, obj=<unavailable>, params=0x00007fff5dd790c0, exc=<unavailable>, error=<unavailable>) + 2264 at mini-runtime.c:2530 [opt]
    frame #16: 0x000000010206a008 dontlink-mac`do_runtime_invoke(method=0x00007fd63c0e49f8, obj=0x0000000105486540, params=0x00007fff5dd790c0, exc=0x0000000000000000, error=0x00007fff5dd79008) + 88 at object.c:2858 [opt]
    frame #17: 0x0000000102069f80 dontlink-mac`mono_runtime_invoke [inlined] mono_runtime_invoke_checked(method=<unavailable>, obj=0x0000000105486540, params=0x00007fff5dd790c0, error=0x00007fff5dd79008) + 63 at object.c:3016 [opt]
    frame #18: 0x0000000102069f41 dontlink-mac`mono_runtime_invoke(method=0x00007fd63c0e49f8, obj=0x0000000105486540, params=0x00007fff5dd790c0, exc=<unavailable>) + 145 at object.c:2915 [opt]
    frame #19: 0x0000000102168dfd dontlink-mac`native_to_managed_trampoline_11(self=0x00007fd63ae79e10, _cmd="xamarinApplySelector", managed_method_ptr=0x000000010235b978, token_ref=3389) + 237 at Xamarin.Mac.registrar.mobile.x86_64.m:474
    frame #20: 0x0000000102168d04 dontlink-mac`::-[__MonoMac_NSActionDispatcher xamarinApplySelector](self=0x00007fd63ae79e10, _cmd="xamarinApplySelector") + 52 at Xamarin.Mac.registrar.mobile.x86_64.m:29143
    frame #21: 0x00007fff9920efde Foundation`__NSThreadPerformPerform + 279
    frame #22: 0x00007fff8ad197e1 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
    frame #23: 0x00007fff8acf8f1c CoreFoundation`__CFRunLoopDoSources0 + 556
    frame #24: 0x00007fff8acf843f CoreFoundation`__CFRunLoopRun + 927
    frame #25: 0x00007fff8acf7e38 CoreFoundation`CFRunLoopRunSpecific + 296
    frame #26: 0x00007fff94d95935 HIToolbox`RunCurrentEventLoopInMode + 235
    frame #27: 0x00007fff94d9576f HIToolbox`ReceiveNextEventCommon + 432
    frame #28: 0x00007fff94d955af HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 71
    frame #29: 0x00007fff87378df6 AppKit`_DPSNextEvent + 1067
    frame #30: 0x00007fff87378226 AppKit`-[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 454
    frame #31: 0x00007fff8736cd80 AppKit`-[NSApplication run] + 682
    frame #32: 0x0000000101ea55c9 dontlink-mac`xamarin_dyn_objc_msgSend + 217
    frame #33: 0x000000010b2e6f06
    frame #34: 0x0000000109e6211d
    frame #35: 0x0000000109e6064f
    frame #36: 0x0000000101f9f5e8 dontlink-mac`mono_jit_runtime_invoke(method=<unavailable>, obj=<unavailable>, params=0x00007fff5dd7b1e0, exc=<unavailable>, error=<unavailable>) + 2264 at mini-runtime.c:2530 [opt]
    frame #37: 0x000000010206a008 dontlink-mac`do_runtime_invoke(method=0x00007fd63ad06958, obj=0x0000000000000000, params=0x00007fff5dd7b1e0, exc=0x0000000000000000, error=0x00007fff5dd7b230) + 88 at object.c:2858 [opt]
    frame #38: 0x000000010206d520 dontlink-mac`do_exec_main_checked [inlined] mono_runtime_invoke_checked(method=<unavailable>, obj=<unavailable>, error=0x00007fff5dd7b230) + 144 at object.c:3016 [opt]
    frame #39: 0x000000010206d4df dontlink-mac`do_exec_main_checked(method=0x00007fd63ad06958, args=<unavailable>, error=0x00007fff5dd7b230) + 79 at object.c:4678 [opt]
    frame #40: 0x0000000101f00c5a dontlink-mac`mono_jit_exec(domain=<unavailable>, assembly=<unavailable>, argc=1, argv=0x00007fd63d200310) + 298 at driver.g.c:1037 [opt]
    frame #41: 0x0000000101f032ff dontlink-mac`mono_main [inlined] main_thread_handler + 8799 at driver.g.c:1106 [opt]
    frame #42: 0x0000000101f032cc dontlink-mac`mono_main(argc=<unavailable>, argv=<unavailable>) + 8748 at driver.g.c:2215 [opt]
    frame #43: 0x0000000101ea5d96 dontlink-mac`::xamarin_main(argc=1, argv=0x00007fff5dd7b568, is_extension=false) + 1046 at launcher.m:612
    frame #44: 0x0000000101ea6aa4 dontlink-mac`main(argc=1, argv=0x00007fff5dd7b568) + 36 at launcher.m:623
    frame #45: 0x0000000101e85f14 dontlink-mac`start + 52

  thread #2: tid = 0x145f457, 0x00007fff98396efa libsystem_kernel.dylib`kevent_qos + 10, queue = 'com.apple.libdispatch-manager'
    frame #0: 0x00007fff98396efa libsystem_kernel.dylib`kevent_qos + 10
    frame #1: 0x00007fff96498165 libdispatch.dylib`_dispatch_mgr_invoke + 216
    frame #2: 0x00007fff96497dcd libdispatch.dylib`_dispatch_mgr_thread + 52

  thread #3: tid = 0x145f462, 0x00007fff98395db6 libsystem_kernel.dylib`__psynch_cvwait + 10, name = 'SGen worker'
    frame #0: 0x00007fff98395db6 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff9be9b728 libsystem_pthread.dylib`_pthread_cond_wait + 767
    frame #2: 0x000000010212437e dontlink-mac`thread_func [inlined] mono_os_cond_wait(mutex=0x0000000102359d38) + 15 at mono-os-mutex.h:146 [opt]
    frame #3: 0x000000010212436f dontlink-mac`thread_func(thread_data=<unavailable>) + 687 at sgen-thread-pool.c:129 [opt]
    frame #4: 0x00007fff9be9a99d libsystem_pthread.dylib`_pthread_body + 131
    frame #5: 0x00007fff9be9a91a libsystem_pthread.dylib`_pthread_start + 168
    frame #6: 0x00007fff9be98351 libsystem_pthread.dylib`thread_start + 13

  thread #4: tid = 0x145f473, 0x00007fff9838ffae libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'Finalizer'
    frame #0: 0x00007fff9838ffae libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x0000000102004f71 dontlink-mac`finalizer_thread [inlined] mono_os_sem_wait(flags=MONO_SEM_FLAGS_ALERTABLE) + 11 at mono-os-semaphore.h:90 [opt]
    frame #2: 0x0000000102004f66 dontlink-mac`finalizer_thread + 8 at mono-coop-semaphore.h:40 [opt]
    frame #3: 0x0000000102004f5e dontlink-mac`finalizer_thread(unused=<unavailable>) + 174 at gc.c:907 [opt]
    frame #4: 0x00000001020b57e3 dontlink-mac`start_wrapper [inlined] start_wrapper_internal + 301 at threads.c:836 [opt]
    frame #5: 0x00000001020b56b6 dontlink-mac`start_wrapper(data=<unavailable>) + 38 at threads.c:888 [opt]
    frame #6: 0x000000010213c10d dontlink-mac`inner_start_thread(data=<unavailable>) + 189 at mono-threads.c:1165 [opt]
    frame #7: 0x00007fff9be9a99d libsystem_pthread.dylib`_pthread_body + 131
    frame #8: 0x00007fff9be9a91a libsystem_pthread.dylib`_pthread_start + 168
    frame #9: 0x00007fff9be98351 libsystem_pthread.dylib`thread_start + 13

  thread #5: tid = 0x145f4c1, 0x00007fff98395db6 libsystem_kernel.dylib`__psynch_cvwait + 10, name = 'Threadpool worker'
    frame #0: 0x00007fff98395db6 libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff9be9b728 libsystem_pthread.dylib`_pthread_cond_wait + 767
    frame #2: 0x00007fff991cdfc8 Foundation`-[NSCondition wait] + 240
    frame #3: 0x00007fff991ccf9a Foundation`-[NSObject(NSThreadPerformAdditions) performSelector:onThread:withObject:waitUntilDone:modes:] + 935
    frame #4: 0x00007fff991ccb75 Foundation`-[NSObject(NSThreadPerformAdditions) performSelectorOnMainThread:withObject:waitUntilDone:] + 131
    frame #5: 0x0000000101ea55c9 dontlink-mac`xamarin_dyn_objc_msgSend + 217

  thread #6: tid = 0x145f4f6, 0x00007fff9838ff72 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.NSEventThread'
    frame #0: 0x00007fff9838ff72 libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x00007fff9838f3b3 libsystem_kernel.dylib`mach_msg + 55
    frame #2: 0x00007fff8acf9124 CoreFoundation`__CFRunLoopServiceMachPort + 212
    frame #3: 0x00007fff8acf85ec CoreFoundation`__CFRunLoopRun + 1356
    frame #4: 0x00007fff8acf7e38 CoreFoundation`CFRunLoopRunSpecific + 296
    frame #5: 0x00007fff874ced95 AppKit`_NSEventThread + 149
    frame #6: 0x00007fff9be9a99d libsystem_pthread.dylib`_pthread_body + 131
    frame #7: 0x00007fff9be9a91a libsystem_pthread.dylib`_pthread_start + 168
    frame #8: 0x00007fff9be98351 libsystem_pthread.dylib`thread_start + 13
Comment 1 Rodrigo Kumpera 2017-02-17 20:00:37 UTC
Hey Ludovic,

This is hanging in thread suspend
Comment 2 Bernhard Urban 2017-02-27 21:07:25 UTC
*** Bug 52551 has been marked as a duplicate of this bug. ***
Comment 3 Ludovic Henry 2017-02-27 23:29:33 UTC
Thread 5 managed stacktrace is:

> "Threadpool worker" tid=0x0x70000fe8d000 this=0x0x109a64508 , thread handle : 0x7f94e6c9aff0, state : interrupted state
>   at <unknown> <0xffffffff>
>   at (wrapper managed-to-native) ObjCRuntime.Messaging.void_objc_msgSend_IntPtr_IntPtr_bool (intptr,intptr,intptr,intptr,bool) [0x00016] in <342165ad97f247a3ac2ce807f5f689ca>:0
>   at Foundation.NSObject.InvokeOnMainThread (System.Action) [0x00025] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/src/Foundation/NSObject2.cs:557
>   at Xamarin.Mac.Tests.MainClass/NSRunLoopIntegration.InvokeOnMainLoop (GuiUnit.InvokerHelper) [0x00012] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/tests/common/mac/MacTestMain.cs:68
>   at GuiUnit.TestRunner.Shutdown () [0x00032] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/TestRunner.cs:280
>   at GuiUnit.TestRunner/<ExecuteWithListener>c__AnonStorey0.<>m__0 (object) [0x0002d] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/TestRunner.cs:231
>   at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context (object) [0x0000d] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/mono/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:1306
>   at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [0x00071] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/mono/mcs/class/referencesource/mscorlib/system/threading/executioncontext.cs:957
>   at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [0x00000] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/mono/mcs/class/referencesource/mscorlib/system/threading/executioncontext.cs:904
>   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00021] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/mono/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:1283
>   at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00074] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/mono/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:856
>   at ObjCRuntime.Runtime.ThreadPoolDispatcher (System.Func`1<bool>) [0x00008] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/src/ObjCRuntime/Runtime.cs:229
>   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00009] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/mono/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:1208
>   at (wrapper runtime-invoke) <Module>.runtime_invoke_bool (object,intptr,intptr,intptr) [0x0001f] in <81dae2323deb4264851c8799c07afd08>:0

Thread 1 managed stacktrace is:

> "<unnamed thread>" tid=0x0x7fffc68213c0 this=0x0x109a64130 , thread handle : 0x7f94e6e0c6d0, state : not waiting
>   at <unknown> <0xffffffff>
>   at (wrapper managed-to-native) System.Environment.Exit (int) [0x00007] in <81dae2323deb4264851c8799c07afd08>:0
>   at Xamarin.Mac.Tests.MainClass/NSRunLoopIntegration.Shutdown () [0x00006] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/tests/common/mac/MacTestMain.cs:73
>   at GuiUnit.TestRunner.<Shutdown>m__1 () [0x0003b] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/TestRunner.cs:275
>   at GuiUnit.InvokerHelper.Invoke () [0x00014] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/InvokerHelper.cs:18
>   at Foundation.NSActionDispatcher.Apply () [0x00007] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/src/Foundation/NSAction.cs:57
>   at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) [0x0004f] in <81dae2323deb4264851c8799c07afd08>:0
>   at <unknown> <0xffffffff>
>   at (wrapper managed-to-native) ObjCRuntime.Messaging.void_objc_msgSend (intptr,intptr) [0x0000a] in <342165ad97f247a3ac2ce807f5f689ca>:0
>   at AppKit.NSApplication.Run () [0x0001d] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/src/build/mac/mobile/AppKit/NSApplication.g.cs:2126
>   at Xamarin.Mac.Tests.MainClass/NSRunLoopIntegration.RunMainLoop () [0x00006] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/tests/common/mac/MacTestMain.cs:63
>   at GuiUnit.TestRunner.ExecuteWithListener (string[],NUnitLite.Runner.TcpWriter) [0x00357] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/TestRunner.cs:234
>   at GuiUnit.TestRunner.Execute (string[]) [0x0008f] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/TestRunner.cs:137
>   at GuiUnit.TestRunner.Main (string[]) [0x00007] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/external/guiunit/src/framework/GuiUnit/TestRunner.cs:71
>   at Xamarin.Mac.Tests.MainClass.RunTests () [0x0006b] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/tests/common/mac/MacTestMain.cs:43
>   at Xamarin.Mac.Tests.MainClass.Main (string[]) [0x00006] in /Users/ludovic/Xamarin/xamarin-macios/xamarin-macios/tests/common/mac/MacTestMain.cs:27
>   at (wrapper runtime-invoke) <Module>.runtime_invoke_void_object (object,intptr,intptr,intptr) [0x00051] in <68b80ee4c802488a9b90c6c32fa3bbae>:0

What is happening is the following:
 - thread 5 calls Foundation.NSObject.InvokeOnMainThread
 - thread 1 calls Environment.Exit, and suspend all threads
  - thread 1 tries to suspend thread 5

This is a livelock between thread 1 and thread 5: Foundation.NSObject.InvokeOnMainThread doesn't return until thread 1 has executed the delegate to execute, and thread 5 doesn't get suspended until it returns from Foundation.NSObject.InvokeOnMainThread

The behaviour hasn't changed on Mono, and I am confused how that could have ever worked if it never changed.

@Chris, could you confirm that nothing has changed on XM side?
@Kumpera, I am not aware of any change on Mono side, I have search git log but didn't find anything. I am certainly missing something, so would you mind having a look at it?

To reproduce, simply compile XI (it crashes with XS 6.3 with error `System.UnauthorizedAccessException: Access to the path "/x86"`, nothing to do with this bug, but good to know, so use XS 6.1), and run `make -C tests run-mac-unified-dontlink`
Comment 5 Ludovic Henry 2017-03-01 19:09:12 UTC
After comparing behaviour with macios-mono-master (mono 4.8), the problem became quite obvious.

Here is what happens with mono 4.8:
 - thread 1 calls `Environment.Exit`:
   - calls `mono_runtime_try_shutdown`
     - calls `mono_threadpool_ms_cleanup`
       - calls `mono_thread_internal_stop` on each threadpool thread
         - sets on thread 5 `thread->state |= ThreadState_StopRequested`
   - calls `mono_thread_suspend_all_other_threads`
     - discards thread 5 because `(thread->state & ThreadState_StopRequested) != 0`

What doesn't happen on mono 2017-02 is the call to `mono_thread_internal_stop` on each threadpool thread, so in `mono_thread_suspend_all_other_threads` we end up not discarding thread 5, and we indefinitely wait on it.

The quick fix is to use `NSApplication.SharedApplication.Terminate` in place of `Environment.Exit` in the TestRunner in macios.

A longer term fix is to fix `Environment.Exit` so it doesn't hang on shutdown.

After discussing the matter with @Rolf and @Chris, it was decided to use the quick fix for now, fill up a new bug with a simpler repro, and have me work on the longer term fix.
Comment 8 Aaron Bockover [MSFT] 2017-05-26 15:51:15 UTC
Here is a standalone test case. Drop the AppDelegate.cs file in place of the default one from a Xamarin.Mac application template and run:

https://gist.github.com/abock/2c39c1c21ffabe9252bd84c89f03ec2a
Comment 9 Aaron Bockover [MSFT] 2017-05-26 15:52:58 UTC
Reopening since I have provided a simple repro case. With this bug, Environment.Exit hangs, libc's exit(3) is never called, and thus atexit(3) never calls our handler.
Comment 10 Aaron Bockover [MSFT] 2017-06-07 19:43:17 UTC
I suspect this bug is also the underlying reason for https://bugzilla.xamarin.com/show_bug.cgi?id=57223
Comment 11 Alan McGovern 2017-06-26 15:34:18 UTC
I believe this is a bug in NSRunLoopIntegration. it should be implemented as follows:

https://github.com/mono/guiunit/blob/master/src/framework/GuiUnit/MonoMacMainLoopIntegration.cs#L58-L60

We must exit the application using NSApplication.Terminate otherwise there are no guaratees the application will actually exit.