Bug 33740 - ExecuteReader Potential Threading Issues
Summary: ExecuteReader Potential Threading Issues
Status: NEW
Alias: None
Product: Class Libraries
Classification: Mono
Component: System.Data (show other bugs)
Version: master
Hardware: PC Linux
: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2015-09-07 09:20 UTC by Chris Mcvittie
Modified: 2017-09-01 09:29 UTC (History)
3 users (show)

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:


Attachments

Description Chris Mcvittie 2015-09-07 09:20:24 UTC
Hi, 

Apologies for the uncertainty in this issue.

We've a production server running an HTTP API using ServiceStack.  It serves about 30,000 requests/day.  We're seeing strange behaviour after approximately (but not exactly) 48 hours.  The behaviour is that concurrent requests seem to get the responses mixed up.  We also see a high incidence of System.InvalidCastException in our logs.  Our application communicates with about 200 SQL Instances (i.e. different TCP Ports), I haven't yet determined if the mixed up data comes from different Instances to the Instance of the connection.  Anecdotally I don't think it is mixing up here, but I haven't been able to check thoroughly enough.

What I think is happening is that ExecuteReader is somehow returning the wrong reader.

Sometimes we query "SELECT id FROM table1"
Sometimes we query "SELECT name FROM table2"


System.InvalidCastException: Type is System.Int32
   at System.Data.SqlClient.SqlDataReader.GetString (Int32 i) in <filename unknown>:line 0
   at ServiceStack.OrmLite.OrmLiteReadExtensions.GetFirstColumn[T] (IDataReader reader) in <filename unknown>:line 0
   at ServiceStack.OrmLite.OrmLiteReadExtensions.Query[T] (IDbCommand dbCmd, System.String sql, System.Object anonType) in <filename unknown>:line 0
   at ServiceStack.OrmLite.OrmLiteReadConnectionExtensions+<>c__DisplayClass45`1[T].<Query>b__44 (IDbCommand dbCmd) in <filename unknown>:line 0
   at ServiceStack.OrmLite.ReadConnectionExtensions.Exec[T] (IDbConnection dbConn, System.Func`2 filter) in <filename unknown>:line 0
   at ServiceStack.OrmLite.OrmLiteReadConnectionExtensions.Query[T] (IDbConnection dbConn, System.String sql, System.Object anonType) in <filename unknown>:line 0

I think the reader swap is causing this exception - as the exception only appears in the logs when we see the mixed up responses. (We also see the inverse of this InvalidCastException).

I'm struggling to reproduce locally.  I've written a sample app that uses a loop to open connections to 70 SQL Server Instances, and run some queries.  When I introduce threading, everything seems to work as expected.

At this stage, I'd welcome any help/suggestions in trying to diagnose/reproduce the issue.  

We've had this on 4.2-alpha, and a weekly from mid July.

Thanks,
Chris
Comment 1 Andres G. Aragoneses 2015-09-09 16:16:02 UTC
Does it work with MS.NET?


> We've had this on 4.2-alpha, and a weekly from mid July.

Has this bug not been present in a previous version of Mono?
Comment 2 Chris Mcvittie 2015-09-09 16:27:27 UTC
It seems to run OK on MS.NET, we ran production under IIS for just over a week, and didn't encounter this problem.  

(Running IIS presented its own other problems though (e.g. fussier header checking, didn't work well with our deployment tools) that makes us not want to use it long term)

I can't say for certain about previous mono versions.  We ran for around 18 months on mono-2.11.8, but restarted quite regularly due to regularly deployments, and also had a lower throughput.  (Restarting seems to mask/sort the issue.) We had a couple of "odd" behaviours over that period, that a restart sorted, but they didn't manifest themselves so catastrophically - but I don't have any evidence to suggest it was the same thing.
Comment 3 Sebastian Mayr 2015-11-20 04:47:27 UTC
I'm not sure I'm experiencing the same bug, but it has the same symptoms, in that resultsets of different DataReaders just get mixed up... I think. It's 100% reproducible in my case though, which makes me think it's not a threading issue. Since some of our uses are wrapped in DataSets though, I'm having a hard time stepping through the whole thing and getting to the bottom of how it's caused.

However, it seems like this is a regression in 4.2 with the import of System.Data from referencesource - I built b491863 locally (which I think is the last tag before referencesource got imported) and that works fine.

The interesting thing is that simple code like this:

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand())
using (SqlDataAdapter da = new SqlDataAdapter())
        {
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandText = @"$query";
            da.SelectCommand = cmd;
            DataSet ds = new DataSet();
            da.Fill(ds);
        }

Throws the following in da.Fill():

System.ArgumentException: Invalid cast from 'Int32' to 'DateTime'.Couldn't store <37> in dt Column. Expected type is DateTime. ---> System.InvalidCastException: Invalid cast from 'Int32' to 'DateTime'.
at System.Int32.System.IConvertible.ToDateTime (IFormatProvider provider) [0x00020] in :0
at System.Data.Common.DateTimeStorage.Set (Int32 record, System.Object value) [0x0003b] in :0 
  at System.Data.DataColumn.set_Item (Int32 record, System.Object value) [0x00000] in :0 
  --- End of inner exception stack trace ---
  at System.Data.DataColumn.set_Item (Int32 record, System.Object value) [0x00027] in :0 
  at System.Data.DataTable.NewRecordFromArray (System.Object[] value) [0x00042] in :0 

The errors seem totally random (which I guess is to be expected when resultsets get mixed up), but are reproducible (for me).

Wasn't sure if that's the same bug or I should open another one, so hoping this helps in this case too :S

Any idea how I can debug this further, or should I just stay at 4.0.5 until (hopefully) corefx SqlClient is imported?
Comment 4 Sebastian Mayr 2015-12-08 15:46:15 UTC
Just tested my test case with a git build of the mono 4.3.1 branch and it seems like my bug got fixed there.

Note You need to log in before you can comment on or make changes to this bug.