Bug 58770 - * Assertion at ..\mono\utils\mono-threads.c:707, condition `info' not met
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: JIT
Version: master
Hardware: PC All
Importance: --- normal
Target Milestone: ---
Assignee: johan.lorensson
URL:
Depends on:
Blocks:
 
Reported: 2017-08-15 13:28 UTC by Marek Safar
Modified: 2017-10-19 19:35 UTC
CC List: 5 users

See Also:
Tags: bugpool-archive
Is this bug a regression?: ---
Last known good build:


Description Marek Safar 2017-08-15 13:28:17 UTC
This happens randomly during the build:

https://jenkins.mono-project.com/job/x/4698/consoleFull#320155386de733c27-8bf9-43f1-b986-604599a42e3b

* Assertion at ..\mono\utils\mono-threads.c:707, condition `info' not met

make[9]: *** [../../build/library.make:315: ../../class/lib/net_4_x-win32/PEAPI.dll] Error 127
Comment 1 Ludovic Henry 2017-09-01 18:38:16 UTC
We don't have a reliable reproduction case, but it's recurrent on CI.
Comment 2 johan.lorensson 2017-09-15 12:04:57 UTC
I have reproduced this issue in the debugger. It is a race between the main thread shutting down in mono_thread_manage and threads running mono_thread_detach_internal during shutdown. The problem is the removal from the threads list, since mono_thread_manage picks the threads to wait on from that list. Once a thread removes itself from the list in mono_thread_detach_internal, it races against the runtime shutdown, which terminates everything including the GC and so invalidates the MonoInternalThread pointer still used in mono_thread_detach_internal. That can trigger the assert seen above, or other undefined behavior, depending on where the threads resume execution after the main thread has shut down the runtime but not yet terminated the process.

Steps to reproduce (in debugger):

* Run one of the failing compile tests from above.
* Set a breakpoint before the runtime returns from the C main method.
* Set a breakpoint at the end of mono_thread_detach_internal, on the call to mono_thread_info_unset_internal_thread_gchandle.
* Set a breakpoint at threads.c mono_thread_manage:3334, on the call to mono_threads_lock.
* Run the test under the debugger. When the main thread hits its breakpoint, freeze the thread and continue execution.
* When worker threads hit the breakpoint in mono_thread_detach_internal, freeze each worker thread until no more are active in the process.
* Switch back and resume the main thread.
* NOTE: when the finalizer thread hits the breakpoint in mono_thread_detach_internal, just continue execution of the finalizer thread.
* When the main thread hits the breakpoint just before returning from the main method, observe that all worker threads are still around.
* Freeze the main thread (to prevent the process from terminating).
* Pick one worker thread and resume its execution.
* The worker thread fails with the above assertion.

The fix is to make sure that threads which still access info from MonoInternalThread in mono_thread_detach_internal after removing themselves from the threads list are waited upon before the runtime terminates the GC. For externally attached threads there is no way to know that, and it must be the integrator's responsibility to make sure they have completed before initiating runtime shutdown. For internal threads, like the thread pool threads causing the problems in this case, we could add them to the joinable thread list once they are removed from the threads list (while still holding the threads list lock). That will make sure the runtime won’t finalize the shutdown until all joinable threads have completed. This feature is currently not implemented on Windows, but probably should be, and could then be used to solve the problem using a mechanism already in use by the runtime.
If (for some reason) we can’t use the joinable wait list, we need to add a separate list for threads that are shutting down and make sure we wait for them to terminate during runtime shutdown.
I will make a prototype of the joinable thread solution and see how far that takes us toward fixing this shutdown race condition.
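
For illustration only, the race and the joinable-list idea described above can be modeled with plain pthreads. This is a minimal sketch and not the code from the proposed fix: the Runtime struct, worker_main, worker_sees_live_runtime and all field names are hypothetical and are not Mono APIs. A single worker stands in for a thread pool thread, an atomic flag stands in for the GC being alive, and the gate_open flag plays the role of the debugger freeze/resume from the repro steps so that the interleaving is forced rather than left to chance.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the runtime state; none of these names are Mono APIs. */
typedef struct {
    pthread_mutex_t lock;       /* stands in for the threads list lock */
    pthread_cond_t changed;     /* signalled whenever the fields below change */
    int listed;                 /* threads still on the threads list */
    int joinable;               /* threads moved to the joinable list */
    bool use_joinable_list;     /* true = apply the joinable-list idea */
    bool gate_open;             /* plays the role of the debugger freeze/resume */
    atomic_bool runtime_alive;  /* false once the "GC" has been torn down */
    bool saw_live_runtime;      /* what the worker observed after detaching */
} Runtime;

static void *worker_main(void *arg)
{
    Runtime *rt = arg;

    pthread_mutex_lock(&rt->lock);
    rt->listed--;                    /* remove self from the threads list */
    if (rt->use_joinable_list)
        rt->joinable++;              /* same critical section: become joinable */
    pthread_cond_broadcast(&rt->changed);
    while (!rt->gate_open)           /* "frozen" until main has moved on */
        pthread_cond_wait(&rt->changed, &rt->lock);
    pthread_mutex_unlock(&rt->lock);

    /* Late access to runtime-owned state, as at the end of a detach routine. */
    rt->saw_live_runtime = atomic_load(&rt->runtime_alive);
    return NULL;
}

/* Returns true if the detaching worker still found the runtime alive. */
bool worker_sees_live_runtime(bool use_joinable_list)
{
    Runtime rt = { .listed = 1, .use_joinable_list = use_joinable_list };
    pthread_t worker;
    bool must_join_first;

    pthread_mutex_init(&rt.lock, NULL);
    pthread_cond_init(&rt.changed, NULL);
    atomic_init(&rt.runtime_alive, true);
    if (pthread_create(&worker, NULL, worker_main, &rt) != 0)
        abort();

    /* Shutdown step 1: wait until the threads list is empty. */
    pthread_mutex_lock(&rt.lock);
    while (rt.listed > 0)
        pthread_cond_wait(&rt.changed, &rt.lock);
    assert(rt.listed == 0);          /* main believes nothing is left to wait for */
    must_join_first = rt.joinable > 0;
    if (!must_join_first)
        atomic_store(&rt.runtime_alive, false);  /* teardown races the worker */
    rt.gate_open = true;             /* "resume" the worker */
    pthread_cond_broadcast(&rt.changed);
    pthread_mutex_unlock(&rt.lock);

    /* With the joinable list, this join is the joinable wait and precedes
       teardown; without it, it only collects the result after teardown. */
    pthread_join(worker, NULL);
    atomic_store(&rt.runtime_alive, false);

    pthread_mutex_destroy(&rt.lock);
    pthread_cond_destroy(&rt.changed);
    return rt.saw_live_runtime;
}
```

In this model, worker_sees_live_runtime(false) returns false, because the worker resumes only after teardown, and worker_sees_live_runtime(true) returns true, because teardown is held back until the joinable worker has been joined. The important detail is that the worker moves itself to the joinable list inside the same critical section in which it leaves the threads list, so the shutdown path never sees a moment where the thread is on neither list.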
Comment 3 johan.lorensson 2017-09-18 15:25:11 UTC
Proposed fix in https://github.com/mono/mono/pull/5599.
