Bug 59779 - HttpClient, when using GZIP, hangs while sending multiple requests in parallel.
Alias: None
Product: Class Libraries
Classification: Mono
Component: System.Net.Http
Version: master
Hardware: PC All
Importance: --- normal
Target Milestone: Future Release
Assignee: Martin Baulig
Depends on:
Reported: 2017-09-27 08:22 UTC by Vladimir Kazakov
Modified: 2018-05-15 10:50 UTC (History)
5 users

Is this bug a regression?: ---
Last known good build:

A solution with a project that reproduces the issue. (3.95 KB, application/x-zip-compressed)
2017-09-27 08:22 UTC, Vladimir Kazakov

Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.


Related Links:

Description Vladimir Kazakov 2017-09-27 08:22:07 UTC
Created attachment 24951 [details]
A solution with a project that reproduces the issue.

When using an HttpClient whose handler enables GZIP (new HttpClient(new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip })), running multiple tasks in parallel that each download some data (await HttpClient.GetAsync("https://jsonplaceholder.typicode.com/photos")) hangs until the HttpClient's timeout is reached.

On my machine (64-bit Ubuntu 16.04 with all updates; Mono), the maximum number of tasks that can run in parallel successfully is 2; any number above 2 hangs. This does not happen on Windows / .NET Framework. It also does not happen if GZIP is not used, which suggests something is wrong with HttpClientHandler. I attached a solution with the code that reproduces the issue.
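A minimal repro along the lines of the attached solution might look like the sketch below (the handler setup and URL are taken from the description; the task count of 3 is the smallest value reported to hang, and the class/method names are illustrative):

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class Repro
{
    static async Task Main()
    {
        // Handler with automatic GZIP decompression, as in the report.
        using var client = new HttpClient(new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip
        });

        // Per the report: 2 parallel downloads succeed on Mono, 3 or more hang.
        var tasks = Enumerable.Range(0, 3).Select(async _ =>
        {
            var response = await client.GetAsync(
                "https://jsonplaceholder.typicode.com/photos");
            var body = await response.Content.ReadAsStringAsync();
            Console.WriteLine($"{response.StatusCode}: {body.Length} chars");
        });

        await Task.WhenAll(tasks);
    }
}
```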
Comment 1 Katelyn Gadd 2017-10-11 06:36:56 UTC
The limit of two parallel requests comes from ServicePointManager.DefaultConnectionLimit (https://msdn.microsoft.com/en-us/library/system.net.servicepointmanager.defaultconnectionlimit(v=vs.110).aspx), so that part is expected. The hang is naturally not expected, but adjusting that value should allow more parallel requests in general, on both Mono and the Windows .NET Framework, if that matters for your use case.

I've verified that this occurs regardless of the use of HTTPS or HTTP, and it appears to happen against different web server software as well (including Amazon's CDN).
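For reference, raising the per-host connection limit mentioned above is a one-liner (the value 10 here is arbitrary, not taken from the bug):

```csharp
using System;
using System.Net;

class ConnectionLimitExample
{
    static void Main()
    {
        // The default of 2 is what caps parallel requests per host.
        // Set this before the first request to a given host is made.
        ServicePointManager.DefaultConnectionLimit = 10;
        Console.WriteLine(ServicePointManager.DefaultConnectionLimit);
    }
}
```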
Comment 2 Katelyn Gadd 2017-10-20 01:17:20 UTC
I investigated this for a while but haven't figured out how to fix it. After digging around and instrumenting a bunch of code, I believe the cause of the reported issue is that the queueing logic in WebConnection and WebConnectionStream is not properly recognizing the end of the response's content, so it never fires WebConnection.NextRead to advance to the next request in the queue. The easiest way to reproduce this is to set the connection limit to 1, so that all the tasks are queued onto a single connection.

Note that in this case there is a race in some of the queueing logic: the connection can be selected as "idle" multiple times before all the tasks race to actually start an operation on it. I think this is harmless and unrelated to the cause of this problem, though.

With the connection limit set to 1 all the tasks get queued up on it while the first task begins and completes its request. As far as I can tell all of the relevant code is running, but the various branches and tests responsible for advancing the queue never decide to do it.

One piece of code that would normally be involved in firing NextRead is the WebConnectionStream.EndRead method (used here because of async I/O). It compares the total bytes read with the content length to decide whether to advance the queue, and for this test case (with gzip content) the content length is always a huge garbage int64 value. This remains the case even if I pull gzipped content of known length from Amazon's CDN, so I do not think this indicates a server bug, and I am fairly sure the content length is being sent by the server.

There is also some logic designed to set the content length automatically when we stop receiving new bytes from the server, but that does not appear to run when gzip is enabled. If I disable gzip, that logic does run, causing EndRead to invoke ReadAll at the end of the stream (which results in the queue advancing instead of stalling). Oddly enough, the content length value from the headers is not respected even when gzip is turned off, but the logic to set it at end-of-stream works.
Comment 3 Martin Baulig 2018-01-08 11:01:01 UTC
The entire NextRead and queuing logic is gone - we are now using ServicePointScheduler, which should be much more robust.

We need to evaluate
a) whether this problem still exists when using the new web stack
b) why this only affects GZip (does it?)
Comment 4 Martin Baulig 2018-01-29 14:35:31 UTC
PR is up: https://github.com/mono/mono/pull/6662
Comment 5 Martin Baulig 2018-01-29 14:42:00 UTC
This is not specific to HttpClient; you'll get the same issue with HttpWebRequest.

The problem is that WebResponseStream needs to do extra bookkeeping after the entire content has been read. It has two modes of operation: either we have a Content-Length, or we're using chunked encoding. With chunked encoding, you read data from the remote end until encountering the chunk trailer.

The old implementation wrapped WebResponseStream inside GZipStream / DeflateStream. Since GZip/Deflate knows the size of the compressed data, it doesn't trigger another read on the inner stream once it has received all of that data. This creates a situation where the chunk trailer may never be read, so the connection is kept open instead of being either closed or flagged for reuse.

PR #6662 should fix this.
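Since the comment above says the same hang reproduces with HttpWebRequest, an equivalent repro at that layer might look like this sketch (URL reused from the original report; on an affected Mono build, issuing several of these against one connection is what stalls the queue):

```csharp
using System;
using System.IO;
using System.Net;

class HttpWebRequestRepro
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create(
            "https://jsonplaceholder.typicode.com/photos");
        // Same decompression setting as the HttpClientHandler in the report.
        request.AutomaticDecompression = DecompressionMethods.GZip;

        using var response = (HttpWebResponse)request.GetResponse();
        using var reader = new StreamReader(response.GetResponseStream());
        Console.WriteLine(reader.ReadToEnd().Length);
        // With the bug, the chunk trailer is never consumed here, so the
        // connection is neither closed nor flagged for reuse afterwards.
    }
}
```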
Comment 6 Martin Baulig 2018-02-01 22:42:12 UTC
Fixed in https://github.com/mono/mono/pull/6662.
Comment 7 Martin Baulig 2018-02-02 23:08:35 UTC
I had to reopen this because the issue is not actually fixed yet.

See my comment on the PR.
Comment 8 Martin Baulig 2018-02-11 17:29:52 UTC
I am working on this; please don't touch this bug.
Comment 9 Martin Baulig 2018-02-11 17:38:33 UTC
This bug has triggered a large rewrite and cleanup of the WebResponseStream class - something that I already wanted to do as part of the new web stack, but couldn't finish due to time constraints.
Comment 10 Marek Safar 2018-02-21 22:27:39 UTC
Martin, what's the update on this issue?
Comment 11 Marek Safar 2018-05-15 10:50:49 UTC
Fixed by https://github.com/mono/mono/pull/8569