Bug 16730 - ConcurrentDictionary is not marked as Serializable
Summary: ConcurrentDictionary is not marked as Serializable
Alias: None
Product: Class Libraries
Classification: Mono
Component: mscorlib ()
Version: 3.2.x
Hardware: PC Windows
: Normal normal
Target Milestone: Untriaged
Assignee: Marek Safar
Depends on:
Blocks: 16768
  Show dependency tree
Reported: 2013-12-11 08:40 UTC by Grigory (Playtika)
Modified: 2014-11-25 05:00 UTC (History)
4 users (show)

Is this bug a regression?: ---
Last known good build:

Patch to fix ConcurrentDictionary not being serialisable. (2.79 KB, patch)
2013-12-12 22:18 UTC, Jon Hanna

Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.

Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.

Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.

Please create a new report on GitHub or Developer Community with your current version information, steps to reproduce, and relevant error messages or log files if you are hitting an issue that looks similar to this resolved bug and you do not yet see a matching new report.

Related Links:

Description Grigory (Playtika) 2013-12-11 08:40:33 UTC
Type System.Collections.Concurrent.ConcurrentDictionary`2[System.String,System.String] is not marked as Serializable.
Comment 1 Marek Safar 2013-12-12 05:34:36 UTC
Fixed in master
Comment 2 Jon Hanna 2013-12-12 22:18:51 UTC
Created attachment 5653 [details]
Patch to fix ConcurrentDictionary not being serialisable.

The change made marks ConcurrentDictionary as Serializable, but its internal SplitOrderedList will throw it and someone currently getting the exception reported here will just be left with a SerializationException about that instead. This would be true again of the Node class within that.

Marking that Node class as Serializable would also be problematic, as the object-graph involved in the singly-linked structures would lead to recursion which would be slow and cause a large serialiserialisation stream, along with slowing our race to get to the end of the data to serialise while other threads are writing to it. (It would't be hard to have it never finish).

The attached patch has this class implement ISerializable, and then explicitly handle the serialisation. While it doesn't guarantee not to write wasted duplicate keys due to concurrent writes, such a case would still mean that the result was a reasonable view on the dictionary at a point during the serialisation, as observed by the serialising thread. Similarly it will give up on trying to serialise everything at a certain point, but that will also clearly mean we're covering all of a point in time as observed by the serialising thread.

I should hope it compares well to the MS approach, in that this approach does not lock serialisers into using the same snapshot until there's a point with no serialisation happening.

The one concern is that if there's another use of SplitOrderedList that doesn't produce the key by just casting the hashcode to uint, then this won't work correctly with that use, so some way of indicating how it should deal with that will be necessary.
Comment 3 Jon Hanna 2013-12-12 22:21:17 UTC
(Of course, even with the patch, if someone creates a ConcurrentDictionary with keys, values or equality-comparers that aren't serialisable, then it'll still throw, but that goes for any such collection).
Comment 4 Miguel de Icaza [MSFT] 2014-02-12 14:53:10 UTC

Could you review this patch and compare the serialized output in .NET vs us?

Is this an area where we can remain compatible?

There is also a related bug: https://bugzilla.xamarin.com/show_bug.cgi?id=16768
Comment 5 Jon Hanna 2014-02-14 06:03:29 UTC
AFAICT, this shouldn't cause a compatibility issue, because it works by altering SplitOrderedList, which doesn't even exist in in the .NET implementation. As seen from the outside, the two should be compatible, after this patch is applied. (*Should* being the operative word, as I'm not so arrogant as to assume I haven't made a mistake).

There is one difference in observed behaviour, but in that difference I would say .NET has a more negative effect upon concurrency. (I was in error in my description of it above, I was thinking of the Cliff Click's implementation of a concurrent dictionary in Java) I'll describe this below:

To reiterate some obvious stuff, just to be clear what I'm talking about, with a class like this we are by definition interested in handling racing threads, and hence we can't talk about "the state" of an object at the time an operation starts and finishes, as we can with single-thread uses when considering correctness, but rather that if a thread starts calls a method at time T0 and it returns at T1, then the state observed by that thread will be one that depends on all operations it could observe (most obviously those it did itself) before T0, and perhaps none, some, or all of the operations done by other threads between T0 and T1. It would further be an ideal (though not all approaches to concurrency insist upon it) if those changes were ordered (i.e. if another thread had carried out operations A, B & C in that time, then the observing thread should either see the effects of none, of A, of A & B or of all three, but not e.g. the effects of C without the effects of A & B). However, as said, not all concurrent approaches guarantee this.

With the approach taken in this patch, this should be the case: It starts creating a List<KeyValuePair<TKey, T> of all the current live values. Of the updates that happen concurrently, they may or may not be captured by this. This could lead to such a race happening that the list is never going to be filled, but this condition is detected, and at such a point the list does contain values corresponding to the state of the collection at some point between T0 and T1.

This list is then serialised, and upon deserialisation the fact that we may have stored more than one value for the same key is dealt with—we overwrite as we go, so the resulting deserialised dictionary contains the most recent value seen during the time of serialisation.

The .NET approach is simpler, but potentially fraught: I locks the entire collection, creates an array to be serialised, and then unlocks the collection. The one potentially fraught point is if some serialising code obtained a reference to this array twice, it could do so in a race with another serialisation. This should be unlikely (why would someone do that?) but isn't guarded against.

Still, it remains a viable approach instead of mine. It should beat the approach of my patch for pure speed of that one method and for serialisation size in a small number of cases, while my patch should beat it for impact upon concurrent operations all the time (because it has no impact on them, while the simpler approach blocks them all). The difference really goes down to what is obvious in different cases; because I've written a thread-safe dictionary that uses no locking, a similar approach was the obvious approach for me in the past—blocking concurrent writes being impossible—and I applied the same logic to this patch, while considering only a striped-locked concurrent dictionary, blocking all updates is the simpler thing to do.
Comment 6 Marek Safar 2014-11-25 05:00:19 UTC
Resolved in master