Bug 39042

Summary: appdomain-unload.exe sometimes hangs in CI
Product: [Mono] Runtime
Component: General
Severity: normal
Priority: ---
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Tags:
Is this bug a regression?: ---
Last known good build:
Reporter: Alexander Köplinger [MSFT] <alkpli>
Assignee: Vlad Brezae <vlad.brezae>
CC: mono-bugs+mono, mono-bugs+runtime
Attachments: Repro app and gdb output

Description Alexander Köplinger [MSFT] 2016-02-23 14:20:03 UTC
Created attachment 15149
Repro app and gdb output

This is the last of the runtime tests that are sometimes flaky on Jenkins. It'd be great to have this finally fixed.

I reduced the test case to the attached sample.

Run it in a loop and after a while it will hang. I've attached the gdb output.
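
A minimal sketch of such a loop, in case it helps anyone reproduce this. It is not part of the attachment: the function name `run_until_hang`, the 60-second timeout and the `mono appdomain-unload.exe` command line are assumptions made for illustration. A run that exceeds the timeout is treated as a hang and the process is left alive so that gdb can be attached to it.

```python
import subprocess


def run_until_hang(cmd, max_runs=1000, timeout_s=60.0):
    """Run cmd repeatedly until one run exceeds timeout_s.

    Returns (run_number, process) for the first run that timed out, with
    the process still running so a debugger can be attached, or
    (None, None) if every run finished in time.
    """
    for run in range(1, max_runs + 1):
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        try:
            # Normal case: the test exits on its own well within the timeout.
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Assumed hang: stop looping and hand back the live process.
            return run, proc
    return None, None


# Assumed invocation (command line and timeout are illustrative):
#   run, proc = run_until_hang(["mono", "appdomain-unload.exe"])
#   if proc is not None:
#       print("run %d hung, attach with: gdb -p %d" % (run, proc.pid))
```

The timeout is only a heuristic: a slow but healthy run on a loaded machine would be reported as a hang too, so it should be set well above the normal run time of the test.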

With exception tracing enabled, I found that when the hang happens, no ThreadAbortException is raised in the Timer scheduler thread, so it keeps spinning in https://github.com/mono/mono/blob/9f6238791f8b9e26bf952add917a93045cf07817/mcs/class/corlib/System.Threading/Timer.cs#L334

Comment from Ludovic from a while back:
> I wouldn't be surprised if it's a race in Thread.Abort,
> like the one @bernhard.urban fixed in Thread.Suspend last week.
> the call to `abort_thread_internal` out of a `LOCK_THREAD(thread)` / `UNLOCK_THREAD(thread)`
> seems suspicious to me (that's just a guess though).

Environment: Ubuntu 14.04 x64 (didn't try on other configs)

Vlad: since you said yesterday you're happy for runtime bugs, I thought I'd give you a present and assign this to you :)

Comment 1 Alexander Köplinger [MSFT] 2016-02-29 15:25:08 UTC
It does seem to happen a lot more frequently with LLVM Mono, e.g. if you look at the runtime-llvm step in https://wrench.internalx.com/Wrench/ViewTable.aspx?lane_id=2457&host_id=148 almost all of the recent failures are due to appdomain-unload.exe timing out.