Notice (2018-05-24): bugzilla.xamarin.com is now in
Please join us on
Visual Studio Developer Community and in the
Mono organizations on
GitHub to continue tracking issues. Bugzilla will remain
available for reference in read-only mode. We will continue to work
on open Bugzilla bugs, copy them to the new locations
as needed for follow-up, and add the new items under Related
Our sincere thanks to everyone who has contributed on this bug
tracker over the years. Thanks also for your understanding as we
make these adjustments and improvements for the future.
Please create a new report for Bug 60872 on
GitHub or Developer Community if you have new
information to add and do not yet see a matching new report.
If the latest results still closely match this report, you can use the
In special cases on GitHub you might also want the comments:
GitHub Markdown with public comments
Using the script below, a service that is managed by mono run-time starts using all the CPU, has 200+ CLOSE_WAIT tcp connections which do not close, and becomes non responsive. The issue starts with a sudden increase in CPU usage and thread count (thread count increases slower than CPU but the onset point is the same - same time), an increase in CLOSE_WAIT() sockets durring that time is present (if close() syscall is issued w/ gdb then the execution returns back to normal);
ab -n 10000000 -c 100 -k http://$IP:55558/resourceIdEncoder/encodeCustomerId?CustomerId=256 &> /dev/null &
kill -2 $PID;
- Launch the service with mono;
- Execute the scipt (above) choosing that service as a target;
- Wait till the problem appears (~ 32-26 hours);
- CLOSE_WAIT socket count is 200+ (a sudden increase is noticeable);
- CPU is pegged @ 100%
[root@xxx ~]# /root/mono-debug-install-dir/bin/mono-sgen -V
Mono JIT compiler version 18.104.22.168 (tarball Tue Nov 14 04:43:11 EST 2017)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
LLVM: supported, not enabled.
GC: sgen (concurrent by default)
GDB: info threads:
GDB: thread apply all bt:
GDB: mono_stacktrace 25 (for the Threadpool worker):
Guidence on how to further troubleshoot would be appreciated.
P.S. I can not share the service code that the issue appears with, I'm currently writing a simpler service that I could share with the community and wouldn't be proprietary. In the mean time it would be great to hear some feedback on possible inspection / troubleshooting options.
*** Bug 60871 has been marked as a duplicate of this bug. ***
The issue might possibly be with either the threadpool or the socket async code. To troubleshoot both, could you please run the service with the `MONO_LOG_MASK=threadpool,io-threadpool`, `MONO_LOG_LEVEL=debug` and `MONO_LOG_DEST=trace.log` environment variables defined, and share the `trace.log` file here?
Currently trying the --aot options, debug build of the hosted code and also the options suggested above. A couple of days needed.
output.mlpd - is not really useful for debugging when the bug occurs rarely. The .mlpd file is being loaded fully into ram, and when it is +64G it is useless.
[root@x ~]# mprof-report --traces --maxframes=2 output.mlpd
[root@x ~]# mprof-report --traces --maxframes=2 --time=0-360 output.mlpd
[root@x ~]# mprof-report --traces --maxframes=2 --time=0-6 output.mlpd
Here are the last 6000 lines.
The log output has stopped at this point, presumably soon when the issue appeared (not sure, since no timestamp is available there and my external socket monitoring tools were decommissioned due to moving of environments).
Does this give any insight? Any further troubleshooting tips?