Bug 39924 - System.Data.SqlClient connection pool unusable after network outage
Alias: None
Product: Class Libraries
Classification: Mono
Component: System.Data
Version: 4.2.0 (C6)
Hardware: PC Linux
Importance: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
Depends on:
Reported: 2016-03-26 13:09 UTC by Erwin
Modified: 2018-02-22 22:24 UTC

Is this bug a regression?: ---
Last known good build:

Description Erwin 2016-03-26 13:09:24 UTC
I tested this on Windows with Mono 4.2.3, and the bug does not occur there.

I build with Visual Studio targeting .NET Framework 4.5, then run the result on Linux (latest Raspbian Jessie) with the same Mono version.

To reproduce:

Write an application with a continuous loop that does the following:

- Try to open a database connection with a new SqlConnection instance
- Retry the connection on network failure
- Try to run a query against the database
- On query failure, close the connection and retry opening it
- Close the connection
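
The loop described above could be sketched roughly as follows. This is only an illustration, not the reporter's actual code: the connection string, the server name, the `SELECT 1` query, and the one-second delay are all placeholder assumptions.

```csharp
using System;
using System.Data.SqlClient;
using System.Threading;

class Repro
{
    // Hypothetical connection string; replace with a real server and credentials.
    const string ConnectionString =
        "Server=db.example.local;Database=test;User Id=user;Password=secret;Connect Timeout=30";

    static void Main()
    {
        while (true)
        {
            try
            {
                // A new SqlConnection instance on every iteration.
                // Pooling is enabled by default, so the underlying
                // connection may still come from the pool.
                using (var connection = new SqlConnection(ConnectionString))
                {
                    connection.Open();

                    using (var command = new SqlCommand("SELECT 1", connection))
                    {
                        Console.WriteLine("Query result: {0}", command.ExecuteScalar());
                    }
                } // Dispose closes the connection, also when the query throws.
            }
            catch (Exception ex)
            {
                // Covers both connect and query failures; the next
                // iteration retries with a fresh SqlConnection instance.
                Console.WriteLine("Failed, retrying: {0}", ex.Message);
            }

            Thread.Sleep(1000); // assumed delay between iterations
        }
    }
}
```

Catching `Exception` rather than only `SqlException` is a deliberate choice for a reproduction, since a network outage may surface as other exception types.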

Start it and let it run long enough to make a few queries to the database, then simulate a network outage (by unplugging your LAN cable). Keep the outage longer than the timeout period set in the connection (30 seconds or so). To play it safe, keep the outage going for 1 minute.

During the outage the application is still running: it keeps trying to reconnect to the database without network connectivity and continues with the loop.

After network connectivity is restored, the 'query on the database' step of the loop throws an exception: "System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding." (Again: this works fine with Mono on Windows. No exceptions there.)

SqlConnection.Open() does not hang. It opens the connection, but that connection (again: a new SqlConnection instance) is unable to query the database. This keeps happening for as long as I keep the application running after the network outage. My workaround is to call SqlConnection.ClearPool after the above exception. This fixes the problem, but having to do so is not expected behavior (imo).
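
A minimal sketch of that workaround, assuming the pool is cleared whenever a query throws SqlException. The class name, the helper name `QueryWithPoolReset`, the connection string, and the query are hypothetical; only `SqlConnection.ClearPool` comes from the report.

```csharp
using System;
using System.Data.SqlClient;

static class PoolWorkaround
{
    // Hypothetical helper: runs a scalar query and clears the pool
    // that the connection belongs to if the query fails.
    public static object QueryWithPoolReset(string connectionString, string sql)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            try
            {
                using (var command = new SqlCommand(sql, connection))
                {
                    return command.ExecuteScalar();
                }
            }
            catch (SqlException)
            {
                // Discard the pooled connections for this connection string
                // so the next Open() has to create a fresh one.
                SqlConnection.ClearPool(connection);
                throw;
            }
        }
    }

    static void Main()
    {
        // Placeholder connection string; replace with real values.
        var result = QueryWithPoolReset(
            "Server=db.example.local;Database=test;User Id=user;Password=secret",
            "SELECT 1");
        Console.WriteLine(result);
    }
}
```

Clearing the pool on every SqlException is broader than strictly needed. Narrowing it to timeouts would require checking the exception details, which this sketch does not attempt.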

As far as I can see, the SQL connection pool gets corrupted.
Comment 1 evengard 2017-09-15 11:35:39 UTC
I am having a similar problem, but a different cause triggers the same "Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding." error. Sometimes my app really does start a VERY long query, which fails with the aforementioned message. After that I get similar corruption, with messages like
Index was outside the bounds of the array.
at System.Data.SqlClient.SqlDataReader.GetOrdinal (System.String name)

(which doesn't make sense on its own, but does if the connection ends up in a corrupted state, still goes back to the pool, and then fails when it is taken out of the pool for another job)

But in any case, the cause of this behaviour is always
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

It does not matter whether it is a network error or just the server taking too long to fetch the data. Either way the connection gets corrupted, and the corrupted connection is reused and fails again. Instead, the pool should probably destroy it and create a new one.
Comment 2 evengard 2017-09-15 23:40:00 UTC
Seems like this chunk of code is the culprit here:

try {
// ...
} catch (TdsTimeoutException ex) {
	// If it is a timeout exception there can be many reasons:
	// 1) Network is down/server is down/not reachable
	// 2) Somebody has an exclusive lock on Table/DB
	// In any of these cases, don't close the connection. Let the user do it
	Connection.Tds.Reset ();
	throw SqlException.FromTdsInternalException ((TdsInternalException) ex);
// ...

Lines 425-432 in System.Data.SqlClient/SqlCommand.cs (and similar ones).

Either Connection.Tds.Reset (); works incorrectly, or we do need to close the connection here after all. Either way, it seems the connection becomes unstable from this point on.
Comment 3 Marek Safar 2018-02-22 22:24:09 UTC
Mono 5.10 has a significantly improved System.Data implementation, which should resolve this issue. If you can still reproduce it, please reopen the issue.