Bug 2190

Summary: sdb interrupt code not signal safe
Product: [Mono] Runtime
Reporter: Rolf Bjarne Kvinge [MSFT] <rolf>
Component: Debugger
Assignee: Bugzilla <bugzilla>
Severity: normal
CC: divil5000, mono-bugs+mono, rolf, taktaktaktaktaktaktaktaktaktak, vargaz
Priority: Highest    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows   
Tags:
Is this bug a regression?: ---
Last known good build:
Attachments: t a a bt
another stack trace

Description Rolf Bjarne Kvinge [MSFT] 2011-11-24 08:07:54 UTC
Created attachment 944 [details]
t a a bt

See attached stack traces for all threads.
Comment 2 Rolf Bjarne Kvinge [MSFT] 2011-11-24 08:09:22 UTC
Zoltan, any ideas about this sdb deadlock?
Comment 4 Zoltan Varga 2011-11-24 11:34:18 UTC
This is a runtime bug. The stack walking code calls into the metadata code which tries to lock the loader lock, which shouldn't happen since the stack walking code needs to be async safe.
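A minimal sketch of the constraint, not Mono's actual code: if a signal handler can reach code that blocks on a lock, and the interrupted thread already holds that lock, the handler never returns. In the sketch below `demo_lock` is a hypothetical stand-in for something like the loader lock, and all function names are made up for illustration. The handler only makes a non-blocking attempt on a lock-free `atomic_flag` and gives up if the lock is taken.

```c
#include <signal.h>
#include <stdatomic.h>

/* The handler may only touch lock-free atomics; this sketch assumes atomic_int is one. */
_Static_assert(ATOMIC_INT_LOCK_FREE == 2, "sketch assumes lock-free atomic_int");

/* Hypothetical stand-in for a runtime-wide lock such as the loader lock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Counters updated from the handler. */
static atomic_int walks_done;
static atomic_int walks_skipped;

/* Signal handler: never blocks. test_and_set returns the previous state,
 * so a true result means somebody (possibly the interrupted thread) holds the lock. */
static void on_interrupt(int signo)
{
    (void)signo;
    if (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire)) {
        /* Waiting here would deadlock if the interrupted thread is the owner. */
        atomic_fetch_add(&walks_skipped, 1);
        return;
    }
    atomic_fetch_add(&walks_done, 1);   /* placeholder for the protected work */
    atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Deliver SIGINT to the calling thread, optionally while it holds demo_lock.
 * The handler is (re)installed on every call because ISO C allows signal()
 * to reset the disposition after each delivery. Returns 0 on success. */
int interrupt_self(int hold_lock)
{
    if (signal(SIGINT, on_interrupt) == SIG_ERR)
        return -1;
    if (hold_lock)
        while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire)) { }
    int rc = raise(SIGINT);             /* handler runs before raise() returns */
    if (hold_lock)
        atomic_flag_clear_explicit(&demo_lock, memory_order_release);
    return rc == 0 ? 0 : -1;
}

int walks_done_count(void)    { return atomic_load(&walks_done); }
int walks_skipped_count(void) { return atomic_load(&walks_skipped); }
```

Calling `interrupt_self(0)` lets the handler do its work, while `interrupt_self(1)` makes the handler skip it instead of hanging. A handler that called a blocking lock function in the second case would wait forever, which is the shape of the deadlock reported here.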
Comment 5 Zoltan Varga 2011-11-24 14:07:27 UTC
Fixed by 614da57287da7556581f1e0c6ce5fbccf93718b5 on the mono-2-10 branch.
Comment 6 Zoltan Varga 2011-11-24 14:07:44 UTC
Comment 8 Zoltan Varga 2011-12-16 19:39:24 UTC
This is the same issue, but it will be much harder to fix this time.
Comment 9 Zoltan Varga 2011-12-18 06:23:50 UTC
The problem here is that the MonoJitInfo structures for AOT methods are allocated lazily to speed things up. This process is not signal safe, but it is called from mono_jit_info_table_find (), which is supposed to be signal safe.
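One general way to reconcile lazy allocation with a signal-safe lookup is to split the two paths. This is only a sketch under assumptions and not what Mono does: `info_t`, `info_get_or_create` and `info_find_async_safe` are hypothetical names, and the fixed-size slot table stands in for the real JIT info table. The normal path may allocate and publishes the finished record with an atomic compare-and-swap; the signal path performs a single atomic load and reports "not found" rather than allocating.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-method record; not Mono's MonoJitInfo. */
typedef struct {
    int method_id;
} info_t;

#define N_SLOTS 8

/* One slot per method; NULL until a fully built record has been published. */
static _Atomic(info_t *) slots[N_SLOTS];

/* Normal-context path: may call malloc, so it must never run in a signal handler. */
info_t *info_get_or_create(int id)
{
    if (id < 0 || id >= N_SLOTS)
        return NULL;
    info_t *cur = atomic_load_explicit(&slots[id], memory_order_acquire);
    if (cur)
        return cur;

    info_t *fresh = malloc(sizeof *fresh);
    if (!fresh)
        return NULL;
    fresh->method_id = id;

    /* Publish only if the slot is still empty; otherwise keep the winner's record. */
    info_t *expected = NULL;
    if (atomic_compare_exchange_strong_explicit(&slots[id], &expected, fresh,
                                                memory_order_acq_rel,
                                                memory_order_acquire))
        return fresh;
    free(fresh);
    return expected;   /* on failure, expected holds the published pointer */
}

/* Signal-context path: one atomic load, no allocation, no locks.
 * Returns NULL when the record has not been created yet. */
info_t *info_find_async_safe(int id)
{
    if (id < 0 || id >= N_SLOTS)
        return NULL;
    return atomic_load_explicit(&slots[id], memory_order_acquire);
}
```

The cost of this design is that a lookup from signal context can miss a method whose record was never materialized, so the caller has to tolerate a NULL result.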
Comment 12 Zoltan Varga 2011-12-22 14:47:22 UTC
The last stack trace is for the original issue which should already be fixed by
46514d58f50b4830fc5f440305284e on mobile-master.
Comment 13 Rolf Bjarne Kvinge [MSFT] 2011-12-22 17:00:32 UTC
Right, that patch was never backported to the mobile-master branch MT stable is using.
Comment 14 Rolf Bjarne Kvinge [MSFT] 2011-12-22 17:07:04 UTC
Actually, I'm wrong, I got mixed up.

Zoltan, it seems like you fixed half the issue in exceptions-x86.c. In mono_arch_find_jit_info there were two blocks that called mono_arch_get_argument_info; in that patch you removed one (line ~821), but there is another one left (line ~880). You can also see this from the stack trace: the exceptions-x86.c frame has a different line number.
Comment 15 Zoltan Varga 2011-12-22 19:50:39 UTC
Yeah, I missed that. That case is also hard to fix, similar to the second issue.
Comment 16 Scott Blomquist (sblom) 2012-03-15 19:30:48 UTC
Created attachment 1525 [details]
another stack trace

This problem is an almost daily source of extreme pain for my co-workers and me as well. I've attached an example of one of our stack traces.
Comment 18 Zoltan Varga 2012-03-15 20:08:56 UTC
Will try to find a solution.
Comment 19 Zoltan Varga 2012-03-15 22:21:52 UTC
Added a workaround in 06ecae6a4712e27c95153d4b9d0c60a89fa82b57.
Comment 20 Zoltan Varga 2014-11-10 18:50:35 UTC
This was fixed a long time ago.