Bug 19913 - Timeout in HttpWebRequest.GetResponse() when called repeatedly
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: System
Version: 3.4.0
Hardware: Other Linux
: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
Reported: 2014-05-19 22:48 UTC by David Straw
Modified: 2017-10-12 12:29 UTC


Attachments
--trace=N:System.Net output for call that hangs (9.54 KB, application/octet-stream)
2014-05-20 00:34 UTC, David Straw
New test case (2.77 KB, text/plain)
2014-05-22 18:05 UTC, David Straw

Description David Straw 2014-05-19 22:48:49 UTC
We found that making too many calls to HttpWebRequest.GetResponse() in quick succession causes the request not to be sent, and the call blocks until the configured timeout is reached.

I reproduced this easily in a small program by making requests to a noop server:

while (true)
{
    var client = (HttpWebRequest)WebRequest.Create("http://localhost:5000/");
    client.Timeout = (int)TimeSpan.FromSeconds(10);
    client.GetResponse();
}

The first few (6-8) requests succeed quickly, and then calls fail with a timeout error despite not having sent the request to the server. Once in this state it appears that the framework cannot recover, so all calls fail due to timeout from that point on until the application restarts. The value of the timeout doesn't matter, and if no timeout is specified the call fails after around 100 seconds (which I think is the default).

I also tried using the GetResponseAsync() method and that one exhibits the same behavior except that it never times out, so the first call that runs into this situation hangs forever.
Comment 1 David Straw 2014-05-20 00:34:27 UTC
Created attachment 6840
--trace=N:System.Net output for call that hangs

I ran mono with the --trace=N:System.Net option, and this log is the bottom portion of the log right after the last successful web request.
Comment 2 David Straw 2014-05-20 10:30:20 UTC
After looking at the trace I think what might be happening is that the request gets queued in WebConnection.cs, line 819. I see no obvious place where this queue is being processed in this scenario, so the request probably just sits in the queue indefinitely.
Comment 3 David Straw 2014-05-20 15:48:55 UTC
In my code above, I missed the TotalMilliseconds property, so the code should be:

while (true)
{
    var client = (HttpWebRequest)WebRequest.Create("http://localhost:5000/");
    client.Timeout = (int)TimeSpan.FromSeconds(10).TotalMilliseconds;
    client.GetResponse();
}

Also I found where the SendNext should be called, which is on Close of the previous connection. However I don't know yet whether or not it's actually being called in this case.
Comment 4 David Straw 2014-05-20 18:24:37 UTC
We figured out that the test case is invalid. According to the .NET documentation, failure to close the response can lead to running out of connections. This bug can be closed as not an issue. Adding a call to Close in the test code makes it all work just fine.
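
A minimal sketch of the fix described in this comment, assuming the same kind of local no-op server as the test case. `ResponseDisposal` and `Fetch` are hypothetical names used only for illustration. Disposing the response (equivalent to calling Close) is what should let the underlying connection be reused by later requests.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper for illustration; not part of Mono or .NET.
static class ResponseDisposal
{
    // Sends one GET request and returns the response body as a string.
    public static string Fetch(string url, TimeSpan timeout)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = (int)timeout.TotalMilliseconds;

        // The using blocks dispose the reader and the response even if
        // reading throws, so the connection is not left checked out.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `ResponseDisposal.Fetch("http://localhost:5000/", TimeSpan.FromSeconds(10))` inside the loop in place of the bare `GetResponse()` call is one way to apply this.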
Comment 5 David Straw 2014-05-22 16:05:35 UTC
I added some code to Mono in my local system to watch for any connections that are left open for more than 10 seconds and reproduced the issue using our application. The new code did not detect any connections left open, so somehow the connections are being used up or getting hung some other way.

This leads me to believe there is in fact a defect in Mono that is causing this despite the earlier test case being invalid.
Comment 6 David Straw 2014-05-22 18:05:17 UTC
Created attachment 6866
New test case

Here's a new test case. This works fine in .NET but in Mono many of the calls fail with a WebException "The request timed out". After the first iteration of the while loop almost all calls fail with the timeout error.
Comment 7 David Straw 2014-05-22 18:11:40 UTC
It's also worth noting with the attached test case that you can run into errors in .NET by kicking off ~1000 requests before calling EndGetResponse on any of them. However most of the calls still succeed, and the error in that case is a WebException "Unable to connect to the remote server".

I can reproduce the timeout in Mono with as few as 4 outstanding requests, and 5 hits the issue regularly.
Comment 8 Miguel de Icaza [MSFT] 2014-05-23 10:11:16 UTC
You are starving the ThreadPool:

http://www.mono-project.com/Article:ThreadPool_Deadlocks
Comment 9 David Straw 2014-05-23 12:47:44 UTC
@Miguel -

With this new test case I run into the issue with only a handful of threads (I saw 9 when I attached in gdb). Setting MONO_THREADS_PER_CPU to 2000 does not change the behavior. In our full application we've been running with MONO_THREADS_PER_CPU=200 and I observed 214 threads at the time the issue occurred, with two logical processors given to the VM.
Comment 10 smremde 2014-08-29 11:50:31 UTC
Any updates on this?
Comment 11 smremde 2014-08-29 16:39:50 UTC
I'm fairly sure I am experiencing this bug. It only started when I did a distribution release update (from Ubuntu 12.04 to 14.04, Mono 2.10.8 to 3.2.8).

I'm not sure if that helps narrow down the problem.
Comment 12 David Straw 2014-10-30 11:17:16 UTC
I think I finally have some understanding of what is going on here. In .NET, the timeout value on a request is not in effect until the request gets sent. Therefore you see some calls to GetResponse() taking longer than the configured timeout if the request can't be made for a while due to heavy load.

In Mono on the other hand, the timeout is in effect as soon as GetResponse is called. So if a request needs to wait for a while, you could see a timeout error before the request gets sent, or at least sooner than you would see it in .NET.

The workaround we plan to implement is to throttle requests at a higher level to avoid request queuing in Mono so that we never (or at least very rarely) get into this situation. That said it would be best if Mono behaved like .NET in this scenario.
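
A minimal sketch of what such higher-level throttling could look like, assuming a `SemaphoreSlim` is an acceptable gate. `RequestThrottle`, `RunAsync` and `maxConcurrent` are hypothetical names for illustration, not the reporter's actual implementation and not a Mono or .NET API. The idea is that a request only reaches `GetResponse` once a slot is free, so the timeout clock does not start while the request would still be waiting in Mono's internal queue.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Mono or .NET.
sealed class RequestThrottle
{
    private readonly SemaphoreSlim _slots;

    // maxConcurrent: how many requests may be in flight at once (assumed
    // to be chosen at or below the per-host connection limit).
    public RequestThrottle(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    // Waits for a free slot, runs the supplied operation, and releases
    // the slot whether the operation succeeds or throws.
    public async Task<T> RunAsync<T>(Func<Task<T>> send)
    {
        await _slots.WaitAsync().ConfigureAwait(false);
        try
        {
            return await send().ConfigureAwait(false);
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

A caller might wrap each request as `await throttle.RunAsync(() => request.GetResponseAsync())`. The returned response still has to be closed by the caller; the slot here is released when the response headers arrive, so an operation that also reads and disposes the body inside the delegate gives a tighter bound.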
Comment 13 Christian Melendez 2015-07-26 00:27:55 UTC
As of today, we are facing the same problem running on Ubuntu 14.04 and Mono 3.2.8. Has anyone found out how to fix or avoid this? We also have MONO_THREADS_PER_CPU configured.

Thanks in advance :)
Comment 14 Krishan Senevirathne 2016-02-02 11:50:19 UTC
Same issue is experienced on Ubuntu 14.04 and Mono 3.2.8
Is this fixed in a later version?
Comment 15 Stefan K 2016-09-08 15:45:04 UTC
I got the same problem using Web References with Xamarin Android 6.1.2.21. Two or three requests work well, then there are only WebExceptions due to timeouts. Restart the app and all works again. Priority should probably be higher, since this is an elementary function for many projects.
Comment 16 Stefan K 2016-09-09 13:44:26 UTC
One quick remark: I solved my problems by downgrading to Xamarin 4.0.3 (Xamarin.Android 6.0.3.5).
Comment 17 Stefan K 2016-09-09 14:40:58 UTC
The error seems to have been reintroduced in Xamarin 4.1:

4.1.2.18 = Problem
4.1.0 = Problem
4.0.4.4 = OK
4.0.3 = OK
Comment 18 Stefan K 2016-09-12 12:11:35 UTC
The error seems to be addressed in the latest beta release, Xamarin 4.2.0.675.
https://releases.xamarin.com/beta-preview-4-cycle-8/

The release notes state:

42864 –  “System.Net.WebException: Error: NameResolutionFailure” on second web request to certain raw IP addresses with HttpClient

I tested it and it seems to address this problem, too.
Comment 19 smremde 2016-11-18 18:07:20 UTC
Bug has come back in version 4.2.1 (Debian 4.2.1.102+dfsg2-7ubuntu4)

It was working before I updated my distribution from trusty to xenial.
Comment 20 Stefan K 2017-01-12 19:15:41 UTC
This seems to be related to https://bugzilla.xamarin.com/show_bug.cgi?id=45761
which is in the latest Stable Release: Cycle 8 Service Release 2:

[Mono Framework] – 45761 – After network reconnected, web request fails for a couple of minutes with a NameResolutionFailure
https://bugzilla.xamarin.com/show_bug.cgi?id=45761

It may be fixed in Alpha Preview 7: Cycle 9, but I have not found the time to test this, since Alpha Preview 7: Cycle 9 has other issues with Android and appcompat in my project.
Comment 21 Marek Safar 2017-10-12 12:29:26 UTC
Could you please update to a recent version and try to reproduce the issue again?

If the issue still persists, please include the version information and change the bug status to NEW.
