Bug 30308 - Waiting on external processes broken on 4.0.1+ monos on OSX, Linux, possibly others
Status: RESOLVED NOT_REPRODUCIBLE
Alias: None
Product: Runtime
Classification: Mono
Component: io-layer
Version: unspecified
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: ---
Assignee: Alexander Kyte
URL:
Depends on:
Blocks:
 
Reported: 2015-05-21 12:03 UTC by Alexander Kyte
Modified: 2018-04-05 22:41 UTC

Tags:
Is this bug a regression?: ---
Last known good build:

Description Alexander Kyte 2015-05-21 12:03:51 UTC
The ExitedEvent test that has been failing on the public Jenkins has exposed that the ExitedEvent mechanism for external processes is badly broken.

On OSX the event usually fails to fire when a process exits; it fires only occasionally. On Linux it appears to fire more often.

https://github.com/mono/mono/commit/e097a98f3bd9082c7d03fa580b604c0228b4ad67

This commit hides a problem with our WaitHandle implementation for external processes. We return WAIT_FAILURE for external processes, which triggers the callback. Because the callback then reschedules itself, the result is a busy wait that allocates heavily. SGen's efforts to cope with us rapidly creating an absurd amount of garbage led to the timeouts that exposed this bug.

The wait should not be implemented by repeatedly calling the callback at all; it should use a timer of some sort.
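
To illustrate the timer-based approach, here is a minimal sketch in Python rather than in the runtime's own code. It is not the Mono implementation: `wait_for_exit`, `poll_interval`, `timeout` and `demo` are hypothetical names chosen for this example. The idea is that the exit callback runs only once the process has really exited, and each unsuccessful check is followed by a sleep instead of an immediate reschedule.

```python
import subprocess
import sys
import time


def wait_for_exit(proc, on_exit, poll_interval=0.05, timeout=5.0):
    """Poll an external process on a fixed interval and fire on_exit once.

    Hypothetical helper for illustration only. Returns the number of
    polls performed, or None if the timeout elapsed first.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        code = proc.poll()  # non-blocking check of the child's exit status
        if code is not None:
            on_exit(code)  # the callback runs only after a real exit
            return polls
        # Sleep between checks instead of rescheduling immediately, so a
        # long-running child costs a bounded number of wakeups.
        time.sleep(poll_interval)
    return None


def demo():
    """Spawn a short-lived child and collect its exit code via the callback."""
    exit_codes = []
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(0.2)"]
    )
    polls = wait_for_exit(child, exit_codes.append)
    return exit_codes, polls
```

With these assumed defaults, the number of wakeups is bounded by roughly `timeout / poll_interval`, however long the child runs. A callback that reschedules itself on every failed wait has no such bound.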