This is an assertion we're randomly hitting in our tests, so there is no repro (for now).
> Thread 9 Crashed:: tid_6e23
> 0 libsystem_c.dylib 0x05890724 __abort + 230
> 1 libsystem_c.dylib 0x0589063e abort + 173
> 2 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x003875f8 log_callback(char const*, char const*, char const*, int, void*) + 88 (runtime.m:1112)
> 3 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x0035922a log_adapter + 154 (mono-logger.c:259)
> 4 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x0036ba67 monoeg_g_log + 103 (goutput.c:114)
> 5 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x0035c9c0 check_info + 96 (mono-threads-coop.c:93)
> 6 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x0035c6fd mono_threads_enter_gc_safe_region_unbalanced_with_info + 45 (mono-threads-coop.c:226)
> 7 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x0035c63a mono_threads_enter_gc_safe_region_with_info + 26 (mono-threads-coop.c:200)
> 8 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x0035c60b mono_threads_enter_gc_safe_region + 27 (mono-threads-coop.c:189)
> 9 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x00311a5f sgen_deregister_root + 47 (mono-coop-mutex.h:53)
> 10 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x002d2d0c sgen_thread_detach + 44 (sgen-mono.c:2324)
> 11 com.xamarin.monotouch-test.watchkitapp.watchkitextension 0x00360492 unregister_thread + 114 (mono-threads.c:430)
> 12 libsystem_pthread.dylib 0x05b0129d _pthread_tsd_cleanup + 537
> 13 libsystem_pthread.dylib 0x05b00e22 _pthread_exit + 108
> 14 libsystem_pthread.dylib 0x05aff354 _pthread_wqthread + 1298
> 15 libsystem_pthread.dylib 0x05afcf56 start_wqthread + 34
Complete crash report: https://gist.github.com/rolfbjarne/a109afe32bc0157ab34724669ae1efda
This is from jenkins: https://jenkins.mono-project.com/job/xamarin-macios-pr-builder/884/ (monotouch-test/watchOS), on top of https://github.com/xamarin/xamarin-macios/commit/c3748631732fc716584132ce24a6d4be7897db98.
This looks like a coop bug.
So this is pretty interesting. unregister_thread can be called from two different places:
(1) explicitly when we call mono_thread_info_detach
(2) implicitly when the thread_info tls key is destroyed if it's non-NULL (posix)
In (1) we first call unregister_thread and then set the tls key to NULL
In (2) the way posix works is it first sets the key to NULL and then passes the old value to unregister_thread.
So anything function called by unregister_thread shouldn't use mono_thread_info_current and mono_thread_info_current_unchecked.
(The immediate cause of the crash is MONO_GC_ENTER_SAFE which ought to be a MONO_GC_ENTER_SAFE_WITH_INFO(info); it's a bit hard to hit because the gc_mutex has to be contested and you have to be one of these native threads that get indirectly attached because they call a delegate)
I'm going to stick some garbage value in the thread_info tls key in mono_thread_info_detach and see how much crashy code I get in unregister_thread.
A separate question is whether the GC mutex needs to be a MonoCoopMutex - we crash because we're trying to change the coop state of a thread that's being detached; in fact we want to unregister it as a root.
Maybe sgen_gc_lock/unlock calls should be wrapped in MONO_GC_ENTER/EXIT_SAFE calls, not their internals. (And then the detach code could take the mutex without changing the thred state)
Fixed on mono master a232eb3b84104dfc5cb04481285fa9458db67761.
Fixed on mono-4.6.0-branch e571108acf8c56bb28a4c6169abfdc10abb1fc51.
Interestingly, without the fix my test crashes reliably on the 4.6 branch, but not on master; I wonder if we hold the locks longer and are more likely to contest.
PR https://github.com/xamarin/xamarin-macios/pull/609 was merged. Closing for QA verification
There's no known repro, since the issue is random, so I'll just mark as verified and then I'll reopen if I see the issue again.