Bug 58770 - * Assertion at ..\mono\utils\mono-threads.c:707, condition `info' not met
Summary: * Assertion at ..\mono\utils\mono-threads.c:707, condition `info' not met
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: JIT
Version: master
Hardware: PC All
Importance: --- normal
Target Milestone: ---
Assignee: johan.lorensson
URL:
Depends on:
Blocks:
 
Reported: 2017-08-15 13:28 UTC by Marek Safar
Modified: 2017-10-19 19:35 UTC

Tags: bugpool-archive
Is this bug a regression?: ---
Last known good build:


Description Marek Safar 2017-08-15 13:28:17 UTC
This happens randomly during the build

https://jenkins.mono-project.com/job/x/4698/consoleFull#320155386de733c27-8bf9-43f1-b986-604599a42e3b

* Assertion at ..\mono\utils\mono-threads.c:707, condition `info' not met

make[9]: *** [../../build/library.make:315: ../../class/lib/net_4_x-win32/PEAPI.dll] Error 127
Comment 1 Ludovic Henry 2017-09-01 18:38:16 UTC
We don't have a reliable reproduction case, but it's recurrent on CI.
Comment 2 johan.lorensson 2017-09-15 12:04:57 UTC
I have reproduced this issue in the debugger. It is a race between the main thread's shutdown in mono_thread_manage and threads running mono_thread_detach_internal during shutdown. The problem is the removal from the threads list, since mono_thread_manage picks the threads to wait on from that list. Once a thread has removed itself from the list in mono_thread_detach_internal, it races against the runtime shutdown, which terminates everything including the GC and thereby invalidates the MonoInternalThread pointer used in mono_thread_detach_internal. Depending on where the thread resumes execution after the main thread has shut down the runtime but not yet terminated the process, this can trigger the assert seen above or other undefined behavior.
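
The ordering described above can be modeled in a few lines of C with POSIX threads. This is only a sketch, not Mono code: `ThreadInfo`, `info`, `threads_in_list`, `worker_detach` and `demo_shutdown_race` are hypothetical stand-ins, and a condition variable forces the bad interleaving that happens only occasionally in the real runtime.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are Mono's real data structures. */
typedef struct { int id; } ThreadInfo;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int threads_in_list;       /* models the runtime's threads list */
static int runtime_torn_down;     /* set once the modeled shutdown is done */
static ThreadInfo storage = { 1 };
static ThreadInfo *info;          /* models per-thread state owned by the runtime */
static int saw_null_info;

static void *worker_detach(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    threads_in_list--;                  /* step 1: remove self from the list */
    pthread_cond_broadcast(&cond);
    /* Force the bad interleaving: stall until shutdown has completed. */
    while (!runtime_torn_down)
        pthread_cond_wait(&cond, &lock);
    saw_null_info = (info == NULL);     /* step 2: look at state that is gone */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if the detaching worker observed torn-down state, -1 on error. */
int demo_shutdown_race(void)
{
    pthread_t t;

    info = &storage;
    threads_in_list = 1;
    runtime_torn_down = 0;
    saw_null_info = 0;
    if (pthread_create(&t, NULL, worker_detach, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    while (threads_in_list > 0)         /* main waits only on the list */
        pthread_cond_wait(&cond, &lock);
    info = NULL;                        /* modeled shutdown invalidates state */
    runtime_torn_down = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);              /* only so the demo can report a result */
    return saw_null_info;
}
```

In this model the worker merely records that `info` is gone; in the runtime the equivalent access is what trips the `info` assertion.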

Steps to reproduce (in debugger):

* Run one of the failing compile tests from above.
* Set a breakpoint before the runtime returns from the C main method.
* Set a breakpoint at the end of mono_thread_detach_internal, on the call to mono_thread_info_unset_internal_thread_gchandle.
* Set a breakpoint at threads.c mono_thread_manage:3334, on the call to mono_threads_lock.
* Run the test under the debugger. When the main thread hits its breakpoint, freeze it and continue execution.
* When a worker thread hits the breakpoint in mono_thread_detach_internal, freeze it. Repeat for each worker thread until no more are active in the process.
* Switch back to the main thread and resume it.
* NOTE: when the finalizer thread hits the breakpoint in mono_thread_detach_internal, just continue execution of the finalizer thread.
* When the main thread hits the breakpoint just before returning from the main method, observe that all worker threads are still around.
* Freeze the main thread (to prevent the process from terminating).
* Pick one worker thread and resume its execution.
* The worker thread fails with the above assertion.

The fix is to make sure that threads which access info from MonoInternalThread in mono_thread_detach_internal after removing themselves from the threads list are still waited upon before the runtime terminates the GC. For externally attached threads there is no way to know that, so it must be the integrator's responsibility to make sure they have completed before initiating runtime shutdown. For internal threads, like the thread pool threads causing the problem in this case, we could add them to the joinable thread list once they are removed from the threads list (but while still holding the threads list lock). That makes sure the runtime won’t finalize the shutdown until all joinable threads have completed. This feature is currently not implemented on Windows, but probably should be, and could then be used to solve the problem with a mechanism already in use by the runtime.
If (for some reason) we can’t use the joinable thread list, we need to add a separate list for threads that are shutting down and make sure we wait for them to terminate during runtime shutdown.
I will prototype the joinable thread solution and see how far that takes us toward fixing this shutdown race condition.
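
As a rough illustration of the joinable thread idea, here is a minimal sketch in C with POSIX threads. It is not the Mono implementation and not the change in the pull request: `RuntimeState`, `joinable`, `worker_detach` and `demo_joinable_shutdown` are hypothetical names, and the Windows side discussed above is not modeled. Each worker hands itself over to a joinable list in the same critical section in which it leaves the threads list, and shutdown joins that list before tearing down shared state.

```c
#include <pthread.h>
#include <stddef.h>

#define WORKERS 4

/* Hypothetical stand-in for state that runtime shutdown invalidates. */
typedef struct { int alive; } RuntimeState;

static pthread_mutex_t threads_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t threads_changed = PTHREAD_COND_INITIALIZER;
static int threads_in_list;             /* models the threads list */
static pthread_t joinable[WORKERS];     /* models the joinable thread list */
static int joinable_count;
static RuntimeState state;
static RuntimeState *runtime;
static int valid_accesses;

static void *worker_detach(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&threads_lock);
    threads_in_list--;                            /* leave the threads list */
    joinable[joinable_count++] = pthread_self();  /* hand-off under the same lock */
    pthread_cond_broadcast(&threads_changed);
    pthread_mutex_unlock(&threads_lock);

    /* Late access after leaving the list: safe only because shutdown joins us. */
    pthread_mutex_lock(&threads_lock);
    if (runtime != NULL && runtime->alive)
        valid_accesses++;
    pthread_mutex_unlock(&threads_lock);
    return NULL;
}

/* Returns how many workers still saw valid runtime state, -1 on error.
   Error handling is deliberately minimal in this sketch. */
int demo_joinable_shutdown(void)
{
    pthread_t workers[WORKERS];
    pthread_t to_join[WORKERS];
    int i, n;

    state.alive = 1;
    runtime = &state;
    threads_in_list = WORKERS;
    joinable_count = 0;
    valid_accesses = 0;

    for (i = 0; i < WORKERS; i++)
        if (pthread_create(&workers[i], NULL, worker_detach, NULL) != 0)
            return -1;

    /* Phase 1: wait until the threads list is empty, then take the joinable list. */
    pthread_mutex_lock(&threads_lock);
    while (threads_in_list > 0)
        pthread_cond_wait(&threads_changed, &threads_lock);
    n = joinable_count;
    for (i = 0; i < n; i++)
        to_join[i] = joinable[i];
    joinable_count = 0;
    pthread_mutex_unlock(&threads_lock);

    /* Phase 2: join every handed-off thread before tearing anything down. */
    for (i = 0; i < n; i++)
        pthread_join(to_join[i], NULL);

    runtime->alive = 0;                 /* modeled shutdown, now race-free */
    runtime = NULL;
    return valid_accesses;
}
```

Because the hand-off happens while the lock is still held, shutdown can never see an empty threads list without also seeing the matching joinable entries, which is the property the proposal relies on.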
Comment 3 johan.lorensson 2017-09-18 15:25:11 UTC
Proposed fix in https://github.com/mono/mono/pull/5599.