Bug 40306 - Native crash in System.Net.Sockets.Socket:Close_internal()
Status: VERIFIED FIXED
Alias: None
Product: Runtime
Classification: Mono
Component: misc
Version: unspecified
Hardware: PC Mac OS
Importance: High critical
Target Milestone: (C7)
Assignee: Ludovic Henry
Reported: 2016-04-12 14:43 UTC by Martin Baulig
Modified: 2016-06-01 19:16 UTC
Description Martin Baulig 2016-04-12 14:43:49 UTC
Using mono-4.4.0-branch commit 10dcc60ae1834aeb99c4afdcd17e4588feb797d1 and web-tests/stable commit 323643b64a4e1696b6c38d7dbf6a7fa99c45412e, I'm getting a native crash.

To reproduce, check out https://github.com/xamarin/web-tests/tree/stable, then run

$ make Console-Build-Debug

to build and either

$ make Console-Run-Martin

or

$ lldb --args /Workspace/STABLE/bin/mono ./Console/Xamarin.WebTests.Console/bin/Debug/Xamarin.WebTests.Console.exe --features=+Experimental --debug --log-level=5  --category=Martin

This runs all the tests and prints the result, then crashes on exit:

=====
RUN DONE: True [TestResult: Name=[Xamarin.WebTests.Console], Status=Error]
Got result: [TestResult: Name=[Xamarin.WebTests.Console], Status=Error]
2 tests, 0 passed, 6 errors, 0 ignored.
Total time: 00:00:00.7082140.
Result writting to TestResult.xml.
Process 37888 stopped
* thread #3: tid = 0x54e538, 0x00007fff94d8f4a1 libsystem_pthread.dylib`pthread_mutex_unlock, name = 'tid_360b', stop reason = EXC_BAD_ACCESS (code=1, address=0xc30)
    frame #0: 0x00007fff94d8f4a1 libsystem_pthread.dylib`pthread_mutex_unlock
libsystem_pthread.dylib`pthread_mutex_unlock:
->  0x7fff94d8f4a1 <+0>:  cmpq   $0x4d55545a, (%rdi)       ; imm = 0x4D55545A 
    0x7fff94d8f4a8 <+7>:  jne    0x7fff94d8f512            ; <+113>
    0x7fff94d8f4aa <+9>:  leaq   0x1f(%rdi), %r9
    0x7fff94d8f4ae <+13>: andq   $-0x8, %r9
(lldb) bt
* thread #3: tid = 0x54e538, 0x00007fff94d8f4a1 libsystem_pthread.dylib`pthread_mutex_unlock, name = 'tid_360b', stop reason = EXC_BAD_ACCESS (code=1, address=0xc30)
  * frame #0: 0x00007fff94d8f4a1 libsystem_pthread.dylib`pthread_mutex_unlock
    frame #1: 0x00000001001b23b4 mono`mono_threadpool_ms_io_remove_socket [inlined] mono_os_mutex_unlock + 196 at mono-os-mutex.h:85
    frame #2: 0x00000001001b23af mono`mono_threadpool_ms_io_remove_socket [inlined] mono_coop_mutex_unlock + 7 at mono-coop-mutex.h:71
    frame #3: 0x00000001001b23a8 mono`mono_threadpool_ms_io_remove_socket(fd=<unavailable>) + 184 at threadpool-ms-io.c:624
    frame #4: 0x00000001001985c7 mono`ves_icall_System_Net_Sockets_Socket_Close_internal(sock=10, error=<unavailable>) + 23 at socket-io.c:692
    frame #5: 0x00000001057ecc16
    frame #6: 0x00000001034f83e4 mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_DangerousReleaseInternal_bool(this=0x000000000000000a, dispose=true) + 244 at SafeHandle.cs:215
    frame #7: 0x00000001034f8295 mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_InternalDispose(this=0x0000000102070550) + 37 at SafeHandle.cs:150
    frame #8: 0x00000001034f810f mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_Dispose_bool(this=0x0000000102070550, disposing=true) + 31 at safehandle.cs:260
    frame #9: 0x00000001034f80e7 mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_Dispose(this=0x0000000102070550) + 23 at safehandle.cs:252
    frame #10: 0x00000001057ec758
    frame #11: 0x00000001075acaa3
(lldb) monobt
* thread #3
  * frame #0: 0x00007fff94d8f4a1 libsystem_pthread.dylib`pthread_mutex_unlock
    frame #1: 0x00000001001b23b4 mono`mono_threadpool_ms_io_remove_socket [inlined] mono_os_mutex_unlock + 196 at mono-os-mutex.h:85
    frame #2: 0x00000001001b23af mono`mono_threadpool_ms_io_remove_socket [inlined] mono_coop_mutex_unlock + 7 at mono-coop-mutex.h:71
    frame #3: 0x00000001001b23a8 mono`mono_threadpool_ms_io_remove_socket(fd=<unavailable>) + 184 at threadpool-ms-io.c:624
    frame #4: 0x00000001001985c7 mono`ves_icall_System_Net_Sockets_Socket_Close_internal(sock=10, error=<unavailable>) + 23 at socket-io.c:692
    frame #5: 0x1057ecc16 (wrapper managed-to-native) System.Net.Sockets.Socket:Close_internal (intptr,int&) + 0x66 (0x1057ecbb0 0x1057ecc8c) [0x1005034f0 - Xamarin.WebTests.Console.exe]
    frame #6: 0x00000001034f83e4 mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_DangerousReleaseInternal_bool(this=0x000000000000000a, dispose=true) + 244 at SafeHandle.cs:215
    frame #7: 0x00000001034f8295 mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_InternalDispose(this=0x0000000102070550) + 37 at SafeHandle.cs:150
    frame #8: 0x00000001034f810f mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_Dispose_bool(this=0x0000000102070550, disposing=true) + 31 at safehandle.cs:260
    frame #9: 0x00000001034f80e7 mscorlib.dll.dylib`System_Runtime_InteropServices_SafeHandle_Dispose(this=0x0000000102070550) + 23 at safehandle.cs:252
    frame #10: 0x1057ec758 System.Net.Sockets.Socket:Dispose (bool) + 0x88 (0x1057ec6d0 0x1057ec761) [0x1005034f0 - Xamarin.WebTests.Console.exe]
    frame #11: 0x1075acaa3 System.Net.WebConnection:InitConnection (object) + 0x453 (0x1075ac650 0x1075acb90) [0x1005034f0 - Xamarin.WebTests.Console.exe]
======
Comment 1 Martin Baulig 2016-04-12 14:46:46 UTC
Walking up the stack:

(lldb) down
frame #3: 0x00000001001b23a8 mono`mono_threadpool_ms_io_remove_socket(fd=<unavailable>) + 184 at threadpool-ms-io.c:624
   621 	
   622 		mono_coop_cond_wait (&threadpool_io->updates_cond, &threadpool_io->updates_lock);
   623 	
-> 624 		mono_coop_mutex_unlock (&threadpool_io->updates_lock);
   625 	}
   626 	
   627 	void
(lldb) p threadpool_io
(ThreadPoolIO *) $4 = 0x0000000000000000
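
The fault address reported by lldb (0xc30) is small and non-zero, which is what a NULL base pointer plus a field offset looks like: with `threadpool_io == NULL`, `&threadpool_io->updates_lock` evaluates to the offset of `updates_lock` inside the struct. A minimal sketch of that arithmetic, assuming a hypothetical `FakeThreadPoolIO` layout (the padding size is chosen to match the reported address and is not taken from the Mono sources; `updates_lock_address` is likewise an illustrative helper):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ThreadPoolIO; the real layout is not shown in this report. */
typedef struct {
	char other_fields[0xc30];      /* assumed: everything that precedes the lock */
	pthread_mutex_t updates_lock;
} FakeThreadPoolIO;

/* Under the assumed layout the lock sits exactly 0xc30 bytes into the struct. */
static_assert (offsetof (FakeThreadPoolIO, updates_lock) == 0xc30,
               "assumed layout places updates_lock at offset 0xc30");

/* Address that a lock/unlock call would receive for a given base pointer,
 * computed with integer arithmetic so that no NULL pointer is dereferenced. */
static uintptr_t
updates_lock_address (uintptr_t base)
{
	return base + offsetof (FakeThreadPoolIO, updates_lock);
}
```

With a NULL base, `updates_lock_address (0)` is 0xc30 under this assumed layout, which would be consistent with the EXC_BAD_ACCESS address in the backtrace.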
Comment 2 Rodrigo Kumpera 2016-04-12 17:32:00 UTC
Hey Ludo,

It's crashing in the threadpool.
Comment 3 Ludovic Henry 2016-04-12 18:54:12 UTC
It looks a lot like a race between threadpool cleanup and this call to Socket.Dispose.
I think the safest thing to do in this case is to follow what the threadpool does: when cleaning up, simply ensure the background threads are shut down, without destroying the mutexes and condition variables or freeing memory.
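
The approach described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the change made in the Mono runtime: `IoPool`, `io_pool_start`, `io_pool_cleanup` and `io_pool_remove_socket` are hypothetical names, and plain pthreads stand in for Mono's coop mutex wrappers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical miniature of an I/O pool; all names are illustrative.
 * Static storage means the state is never freed. */
typedef struct {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	pthread_t worker;
	bool running;             /* touched only by the thread that starts/cleans up */
	bool shutdown_requested;  /* protected by lock */
} IoPool;

static IoPool io_pool = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.cond = PTHREAD_COND_INITIALIZER,
};

/* Background thread: sleeps until shutdown is requested. */
static void *
worker_main (void *arg)
{
	(void) arg;
	pthread_mutex_lock (&io_pool.lock);
	while (!io_pool.shutdown_requested)
		pthread_cond_wait (&io_pool.cond, &io_pool.lock);
	pthread_mutex_unlock (&io_pool.lock);
	return NULL;
}

static int
io_pool_start (void)
{
	int res = pthread_create (&io_pool.worker, NULL, worker_main, NULL);
	if (res == 0)
		io_pool.running = true;
	return res;
}

/* Cleanup only stops the background thread. The mutex and condition variable
 * are deliberately left initialized, so late callers can still lock them. */
static void
io_pool_cleanup (void)
{
	int res;

	if (!io_pool.running)
		return;
	pthread_mutex_lock (&io_pool.lock);
	io_pool.shutdown_requested = true;
	pthread_cond_broadcast (&io_pool.cond);
	pthread_mutex_unlock (&io_pool.lock);
	res = pthread_join (io_pool.worker, NULL);
	assert (res == 0);
	(void) res;
	io_pool.running = false;
}

/* May race with, or run after, cleanup (for example from a late Dispose-style
 * call). Returns true if the pool had not yet been asked to shut down. */
static bool
io_pool_remove_socket (int fd)
{
	bool was_active;

	(void) fd;
	pthread_mutex_lock (&io_pool.lock);
	was_active = !io_pool.shutdown_requested;
	pthread_mutex_unlock (&io_pool.lock);
	return was_active;
}
```

Because the lock outlives cleanup, a call to `io_pool_remove_socket` that arrives after `io_pool_cleanup` still locks and unlocks valid memory and simply observes that the pool has shut down. The trade-off is that the synchronization objects are intentionally never released before process exit.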
Comment 4 Ludovic Henry 2016-04-18 17:03:27 UTC
This is fixed on master: https://github.com/mono/mono/pull/2916
Comment 5 Shruti 2016-05-18 09:41:46 UTC
I tried to reproduce this issue but was not able to. I followed the steps from comment 0, as described below:

1. Installed Mono MonoFramework-MDK-4.4.0.123.macos10.xamarin.universal_5a6466479498862b76c72e0c60be4ec3059d727a.
2. Cloned https://github.com/xamarin/web-tests/tree/stable manually, because typing 'git clone https://github.com/xamarin/web-tests/tree/stable' in the terminal fails with the error 'fatal: repository 'https://github.com/xamarin/web-tests/tree/stable/' not found'.
3. cd Web-tests-Stable
4. make Console-Build-Debug
5. make Console-Run-Martin

After this command I see 6 errors, but they are different errors. I also noticed that in comment 0 the results are written to TestResult.xml, whereas my results are written to TestResult-Console-Run-Martin.xml.


Terminal Output: https://gist.github.com/shrutis360/74845ca5840c897d113879fdb4264a7b
Result of TestResult-Console-Run-Martin.xml file : https://gist.github.com/shrutis360/4bd68dbd33ec8015ad82923ec9b69998

I also checked the latest C7 and master builds and got the same result. I used the following builds:
C7 builds:
MonoFramework-MDK-4.4.0.123.macos10.xamarin.universal_5a6466479498862b76c72e0c60be4ec3059d727a
MonoFramework-MDK-4.4.0.128.macos10.xamarin.universal_10dcc60ae1834aeb99c4afdcd17e4588feb797d1
MonoFramework-MDK-4.4.0.168.macos10.xamarin.universal

Master builds:
MonoFramework-MDK-4.5.1.595.macos10.xamarin.universal_7e899a93fac5974480728b1d25d24207c9ffabcd

Please let me know what I missed in trying to reproduce this issue.

Thanks!!
Comment 6 Peter Collins 2016-05-27 17:22:23 UTC
I'm able to reproduce this specific error when I install the mono build mentioned in the description:
> http://storage.bos.internalx.com/mono-4.4.0/10/10dcc60ae1834aeb99c4afdcd17e4588feb797d1/MonoFramework-MDK-4.4.0.128.macos10.xamarin.universal.pkg
and check out web-tests at 323643b64a4e1696b6c38d7dbf6a7fa99c45412e:
https://gist.github.com/pjcollins/fb79e3df1922be39f76896cced1fe1db


However, if I run the same test case against our RC2 (mono-4.4.0-branch/0f5fdf, two commits before this fix was introduced), I can no longer reproduce it.

As a result, the provided test case doesn't demonstrate any change in behavior between builds before and after the fix mentioned in comment #4. I also see a comment from Kumpera on the original PR (https://github.com/mono/mono/pull/2916) that mentions that this change has worsened behavior.

Would it be wise to consider reverting this on the mono-4.4.0 release branch and trying to provide a more desirable fix in an upcoming release?
Comment 7 Peter Collins 2016-06-01 19:16:25 UTC
After further investigation it appears my initial report was not thorough enough, given the intermittent nature of the issue. Here are the results of sequential runs (10 or 20 per build) against a few of the mono versions listed in the bug:

10 runs against reported version (mono-4.4.0-branch/10dcc60a):
> 6 crash
> 4 no crash
10 runs against C7 RC2 (mono-4.4.0-branch/0f5fdf2):
> 5 crash 
> 5 no crash
20 runs against the build with the fix (mono-4.4.0-branch/c31aa7e):
> 0 crash
> 20 no crash
