Bugzilla – Bug 2190
sdb interrupt code not signal safe
Last modified: 2012-03-15 22:21:52 EDT
Created attachment 944
t a a bt
See attached stack traces for all threads.
Zoltan, any ideas about this sdb deadlock?
This is a runtime bug. The stack walking code calls into the metadata code
which tries to lock the loader lock, which shouldn't happen since the stack
walking code needs to be async safe.
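To illustrate why taking a lock from the stack walking path is a problem, here is a minimal C sketch. It is not Mono code: `loader_lock`, `on_signal` and `interrupt_while_locked` are hypothetical names, the lock is modelled with a plain atomic flag, and a single thread is assumed. The handler uses a try-style acquire only so that the demo can report the situation instead of hanging. A blocking acquire at the same point would never return, because the code holding the lock is the code that was interrupted.

```c
#include <signal.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the runtime's loader lock. */
static atomic_flag loader_lock = ATOMIC_FLAG_INIT;

/* Set by the handler when it finds the lock already held. */
static volatile sig_atomic_t would_have_deadlocked = 0;

/* Signal handler playing the role of the stack walking code. */
static void on_signal(int sig)
{
    (void)sig;
    if (atomic_flag_test_and_set(&loader_lock)) {
        /* The interrupted code owns the lock. A blocking acquire here
         * would wait forever, since the owner cannot run until the
         * handler returns. */
        would_have_deadlocked = 1;
        return;
    }
    /* Lock was free: we took it, so release it again. */
    atomic_flag_clear(&loader_lock);
}

/* Delivers a signal while the lock is held.
 * Returns 1 if the handler observed the lock as already taken. */
static int interrupt_while_locked(void)
{
    void (*old_handler)(int) = signal(SIGINT, on_signal);
    would_have_deadlocked = 0;

    while (atomic_flag_test_and_set(&loader_lock)) {
        /* spin until acquired (uncontended in this sketch) */
    }
    raise(SIGINT);                   /* handler runs before raise() returns */
    atomic_flag_clear(&loader_lock); /* release */

    signal(SIGINT, old_handler);     /* restore the previous disposition */
    return would_have_deadlocked;
}
```

Calling `interrupt_while_locked()` returns 1 in this sketch, which corresponds to the point where the real runtime would hang.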
Fixed by 614da57287da7556581f1e0c6ce5fbccf93718b5 on the mono-2-10 branch.
This is the same issue, but it will be much harder to fix this time.
The problem here is that the MonoJitInfo structures for AOT methods are
allocated lazily to speed things up. This process is not signal safe, but it is
called from mono_jit_info_table_find (), which is supposed to be signal safe.
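One general way to keep a lookup signal safe when entries are created lazily is to split it into two paths: a normal path that may allocate, and a signal path that only reads entries already published. The C sketch below shows that pattern under stated assumptions. It is not the Mono implementation and not necessarily what any of the commits in this bug do. `JitInfoSketch`, `find_or_create` and `find_signal_safe` are hypothetical names, and the table is a fixed-size array for brevity.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define N_METHODS 4

/* Hypothetical, heavily simplified stand-in for a per-method info record. */
typedef struct {
    int method_index;
} JitInfoSketch;

/* One slot per method; NULL means "not created yet". */
static _Atomic(JitInfoSketch *) info_table[N_METHODS];

/* Normal context only: creates the entry on first use. Calls malloc,
 * so it must never be reached from a signal handler. */
static JitInfoSketch *find_or_create(int idx)
{
    if (idx < 0 || idx >= N_METHODS)
        return NULL;

    JitInfoSketch *info =
        atomic_load_explicit(&info_table[idx], memory_order_acquire);
    if (info != NULL)
        return info;

    JitInfoSketch *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return NULL;
    fresh->method_index = idx;

    /* Publish the entry; if another thread won the race, use its entry. */
    JitInfoSketch *expected = NULL;
    if (!atomic_compare_exchange_strong(&info_table[idx], &expected, fresh)) {
        free(fresh);
        return expected;
    }
    return fresh;
}

/* Signal context: a single atomic load, no allocation and no locking.
 * Returns NULL when the entry has not been created yet, so the caller
 * has to cope with a missing result. */
static JitInfoSketch *find_signal_safe(int idx)
{
    if (idx < 0 || idx >= N_METHODS)
        return NULL;
    return atomic_load_explicit(&info_table[idx], memory_order_acquire);
}
```

The trade-off is that the signal path can miss entries that have not been created yet, so a caller such as a stack walker would need a fallback for that case.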
The last stack trace is for the original issue which should already be fixed by
46514d58f50b4830fc5f440305284e on mobile-master.
Right, that patch was never backported to the mobile-master branch MT stable is
Actually, I'm wrong, I got mixed up.
Zoltan, it seems like you fixed half the issue in exceptions-x86.c. In
mono_arch_find_jit_info there were two blocks that called
mono_arch_get_argument_info, in that patch you removed one (line ~821), but
there is another one left (line ~880). You can also see this from the stack
trace, the exceptions-x86.c frame has a different line number.
Yeah, I missed that. That case is also hard to fix, similar to the second issue.
Created attachment 1525
another stack trace
This problem is an almost daily source of extreme pain for my co-workers and me
as well. I've attached an example of one of our stack traces.
Will try to find a solution.
Added a workaround in 06ecae6a4712e27c95153d4b9d0c60a89fa82b57.