Bug 53843 - Mono deadlocks on shutdown while waiting for a process which has died
Status: RESOLVED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: io-layer
Version: master
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: 15.2
Assignee: Zoltan Varga
Reported: 2017-03-23 10:54 UTC by Alan McGovern
Modified: 2017-03-30 10:00 UTC

See Also:
Tags: 2017-02
Is this bug a regression?: ---
Description Alan McGovern 2017-03-23 10:54:26 UTC
I am 99.9% sure that the external process that is being waited on has actually exited because `ps ax`  does not list it anymore. I just can't tell if it exited before or after mono tried to shut down.

The code that deadlocked is this: https://github.com/spouliot/Touch.Unit/blob/fdeb19b3aca24ba57fab3bd9c0cc43fe59af1857/Touch.Server/Main.cs

This is the lldb dump of threads, including some mono_pmip for missing frames:
https://gist.github.com/alanmcgovern/38d3a49fe0191892aefeb712dadf0bd1


This is the last output from the test runner, in case it's needed at any point, but it's probably not necessary: https://gist.github.com/alanmcgovern/2a4ba1d3e1a5b7f7277e9b35a3383035
Comment 1 Alan McGovern 2017-03-23 10:55:14 UTC
This deadlocked in Mono JIT compiler version 4.9.3.43 (2017-02/9ecfd9f Sat Mar 18 02:18:24 EDT 2017), which is the version we just upgraded to. If this keeps showing up, our tests are likely to become unreliable.
Comment 3 Zoltan Varga 2017-03-28 15:06:43 UTC
I can't repro it at all. The test ends with something like:

[mtouch stderr 10:48:42 AM] 
[mtouch stderr 10:48:53 AM] [2017-03-28 10:48:53.1] PERF: Total time for shutting down: 11016ms
[mtouch stderr 10:48:53 AM] 
[mtouch stdout 10:48:53 AM] csproxy successfully shut down
Comment 4 Vlad Brezae 2017-03-28 15:29:21 UTC
I merged several fixes for hangs like this 3-4 days ago. It would be good to check whether the hangs still happen with the latest 2017-02.
Comment 5 Alan McGovern 2017-03-28 16:06:02 UTC
Applying this patch might help make the issue reproduce reliably: https://gist.github.com/alanmcgovern/0b4e9d300284cab303274005572cdb89

We are unable to get builds on Sierra due to https://bugzilla.xamarin.com/show_bug.cgi?id=54093, so we're blocked from testing anything at the moment: our compilation doesn't get past `nuget restore` most of the time :(
Comment 6 Bernhard Urban 2017-03-28 16:45:46 UTC
It isn't clear to me which runtime you're using.

Is it correct that you're using the version that would be installed by
```
./bot-provisioning/system_dependencies.sh --provision-mono --provision-xamios --provision-xammac --provision-xamandroid
```
?

For me, that pulls ac26b00ca41dc507467780e60a142b34ce2dfdf9 from mono/2017-02. You should bump it to ef4352cde5385e30c05319df5ae37fdae9845aef, which includes the fixes that Vlad is talking about.
Comment 7 Bernhard Urban 2017-03-28 16:56:28 UTC
Looks like https://github.com/xamarin/md-addins/pull/1619 already bumped it.
Comment 8 Alan McGovern 2017-03-28 17:15:06 UTC
We are unable to get builds on Sierra due to https://bugzilla.xamarin.com/show_bug.cgi?id=54093, so we're blocked from testing anything at the moment: our compilation doesn't get past `nuget restore` most of the time :(
Comment 9 Zoltan Varga 2017-03-28 17:45:58 UTC
I can repro with the patch in comment 5.

The stack trace of the stuck thread is:

"Threadpool worker" tid=0x0x700010699000 this=0x0x101f58508 , thread handle : 0x7ff4ebd16f70, state : waiting
  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Threading.WaitHandle.WaitOne_internal (intptr,int) [0x00008] in <04300341516a482b9708b764d58af7ca>:0
  at System.Threading.WaitHandle.WaitOneNative (System.Runtime.InteropServices.SafeHandle,uint,bool,bool) [0x00012] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/corlib/System.Threading/WaitHandle.cs:102
  at System.Threading.WaitHandle.InternalWaitOne (System.Runtime.InteropServices.SafeHandle,long,bool,bool) [0x00014] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/waithandle.cs:250
  at System.Threading.WaitHandle.WaitOne (long,bool) [0x00000] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/waithandle.cs:239
  at System.Threading.WaitHandle.WaitOne (int,bool) [0x00019] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/waithandle.cs:206
  at System.Threading.WaitHandle.WaitOne () [0x00000] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/waithandle.cs:222
  at System.Diagnostics.AsyncStreamReader.WaitUtilEOF () [0x00008] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/System/services/monitoring/system/diagnosticts/AsyncStreamReader.cs:349
  at System.Diagnostics.Process.WaitForExit (int) [0x00059] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/System/services/monitoring/system/diagnosticts/Process.cs:2580
  at System.Diagnostics.Process.WaitForExit () [0x00000] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/System/services/monitoring/system/diagnosticts/Process.cs:2605
  at (wrapper remoting-invoke-with-check) System.Diagnostics.Process.WaitForExit () [0x00032] in <37e5c8b1484647b7b65dc85a890cf350>:0
  at SimpleListener/<>c__DisplayClass30_1.<Main>b__17 (object) [0x00333] in /Users/vargaz/md-addins/Xamarin.Designer.iOS/external/Touch.Unit/Touch.Server/Main.cs:342
  at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context (object) [0x0000d] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:1306
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [0x00071] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/executioncontext.cs:957
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [0x00000] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/executioncontext.cs:904
  at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00021] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:1283
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00074] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:856
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in /private/tmp/source-mono-2017-02/bockbuild-2017-02/profiles/mono-mac-xamarin/build-root/mono-x86/mcs/class/referencesource/mscorlib/system/threading/threadpool.cs:1211
  at (wrapper runtime-invoke) <Module>.runtime_invoke_bool (object,intptr,intptr,intptr) [0x0001f] in <04300341516a482b9708b764d58af7ca>:0
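
Editorial aside: the trace above shows the worker blocked in AsyncStreamReader.WaitUtilEOF, called from Process.WaitForExit (), so it is waiting for end-of-stream on a redirected output pipe rather than on the process itself. The thread does not state the root cause, and the sketch below is not Mono's implementation. It is only a generic POSIX illustration, in Python, of how "the process has exited" and "the pipe reached EOF" can be two separate events, and of one way to bound the second wait. The function name `run_with_bounded_eof_wait` and the `eof_timeout` parameter are made up for this example.

```python
import os
import select
import subprocess
import time


def run_with_bounded_eof_wait(cmd, eof_timeout):
    """Run cmd with stdout piped; return (exit_code, output, saw_eof).

    Hypothetical helper for illustration only. After the child exits, the
    pipe is drained for at most eof_timeout more seconds. saw_eof is False
    if the write end was still open when that grace period ran out.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    output = bytearray()
    exit_seen_at = None
    saw_eof = False

    while True:
        # Record the moment we first observe that the child has exited.
        if exit_seen_at is None and proc.poll() is not None:
            exit_seen_at = time.monotonic()
        # Stop waiting for EOF once the grace period after exit is over.
        if exit_seen_at is not None and time.monotonic() - exit_seen_at > eof_timeout:
            break
        # Poll the pipe briefly so the loop can re-check the exit state.
        readable, _, _ = select.select([fd], [], [], 0.05)
        if readable:
            chunk = os.read(fd, 4096)
            if not chunk:
                # An empty read means every writer has closed the pipe.
                saw_eof = True
                break
            output += chunk

    proc.stdout.close()
    # Blocks only while the child itself is alive; pipe state is irrelevant here.
    return proc.wait(), bytes(output), saw_eof
```

For a child that prints and exits, `saw_eof` comes back True almost immediately. If something else still holds the write end of the pipe (for example a grandchild that inherited stdout), the helper returns the child's exit code with `saw_eof` set to False instead of blocking indefinitely.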
Comment 10 Zoltan Varga 2017-03-28 17:54:59 UTC
To repro, I ran the following in md-addins:

SDK_VERSION=10.3 XCODE_DEVELOPER_ROOT=/Applications/Xcode.app/Contents/Developer/ make test-ios-sim
Comment 11 Zoltan Varga 2017-03-28 22:39:54 UTC
https://github.com/mono/mono/pull/4609
Comment 12 Zoltan Varga 2017-03-29 01:33:18 UTC
Should be fixed by this build:
https://wrench.internalx.com/Wrench/ViewLane.aspx?lane_id=4337&host_id=148&revision_id=878279

(5.0.0.22)

XS should be updated to use this version.
Comment 13 Alan McGovern 2017-03-30 10:00:41 UTC
XS just updated to 5.0.0.23. We'll keep an eye on this. Thanks!
