Bug 5363 - ManagedCollation problems with certain unicode characters
Alias: None
Product: Class Libraries
Classification: Mono
Component: mscorlib ()
Version: 2.10.x
Hardware: PC Linux
Importance: --- normal
Target Milestone: Future Release
Assignee: Bugzilla
Depends on:
Reported: 2012-05-27 19:47 UTC by Vidar
Modified: 2017-10-12 13:35 UTC

Is this bug a regression?: ---
Last known good build:

Short program demonstrating bug. (10.56 KB, application/x-gzip)
2012-05-27 19:47 UTC, Vidar
Entire output when program crashes (1.39 KB, text/plain)
2012-05-27 19:51 UTC, Vidar
Entire output when program runs as intended (4.18 KB, text/plain)
2012-05-27 19:55 UTC, Vidar

Description Vidar 2012-05-27 19:47:56 UTC
Created attachment 1971
Short program demonstrating bug.

Ubuntu 12.04 with mono/gmcs; also reproduced with older versions of Ubuntu and mono/gmcs, back to 8.10 and 1, respectively (see https://bugzilla.novell.com/show_bug.cgi?id=485888).

The program in the test case (see attachment) adds words from a text file as keys in a SortedList, with an integer as value. When running a foreach loop to print out the values from the list, a KeyNotFoundException is thrown.

Reproducible: Always

Steps to Reproduce:
1. Put both files from the attachment in the same directory
2. Compile the test program (gmcs test.cs)
3. Run it (mono test.exe)

Actual Results:  
The program prints some of the values in the SortedList, one per line, then crashes with a KeyNotFoundException.

Expected Results:  
The program should print all 1312 integer values (as counted by "wc -l") in the SortedList and exit without crashing.

The environment variable LANG is currently set to "en_DK.utf8". Setting it to "en_US.utf8" or "dk_DK.utf8" does not help; however, setting it to "nb_NO.utf8" or "nn_NO.utf8" allows the program to run without crashing.

Dispensing with the input file and entering only the key that causes the crash directly in the source code, e.g. words.Add("fåe", 5), also works.

(Using Windows, the program runs without issues in VS2008, newer versions have not been tried.)
Comment 1 Vidar 2012-05-27 19:51:20 UTC
Created attachment 1974
Entire output when program crashes
Comment 2 Vidar 2012-05-27 19:55:07 UTC
Created attachment 1975
Entire output when program runs as intended
Comment 3 Zoltan Varga 2012-05-29 07:53:45 UTC
This seems like a string collation problem:

using System;
using System.Collections.Generic;
using System.IO;

public class Tests {
	public static void Main (String[] args) {
		SortedList<String, int> words = new SortedList<String, int>();

		string s1 = "fær";
		string s2 = "fåe";

		// Compare in both directions; the two results should have opposite signs.
		Console.WriteLine (s1.CompareTo (s2));
		Console.WriteLine (s2.CompareTo (s1));
		words.Add (s1, 0);
		words.Add (s2, 0);
		string last_w = null;
		// Each key must compare strictly greater than the one before it.
		foreach (string w in words.Keys) {
			if (last_w != null) {
				Console.WriteLine (last_w.CompareTo (w));
				if (last_w.CompareTo (w) >= 0)
					throw new Exception (w);
			}
			last_w = w;
		}
	}
}

Notice that comparing s1 and s2 returns 1 in both directions, which cannot be right: if s1 sorts after s2, then s2 must sort before s1. This confuses SortedList.
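
The symmetry property described above can be checked on its own, without a SortedList. This is a minimal sketch rather than code from the attachment: the class name CollationCheck and the helper IsAntisymmetric are made up for illustration, and StringComparer.CurrentCulture is assumed to stand in for the culture-sensitive comparison that string.CompareTo performs.

```csharp
using System;
using System.Collections.Generic;

public static class CollationCheck {
	// True when cmp orders a and b consistently in both directions:
	// the sign of cmp(a, b) must be the negation of the sign of cmp(b, a).
	public static bool IsAntisymmetric (IComparer<string> cmp, string a, string b) {
		return Math.Sign (cmp.Compare (a, b)) == -Math.Sign (cmp.Compare (b, a));
	}

	public static void Main () {
		string s1 = "fær";
		string s2 = "fåe";

		// Culture-sensitive comparison; the result depends on the runtime and the locale.
		Console.WriteLine (IsAntisymmetric (StringComparer.CurrentCulture, s1, s2));
		// Ordinal comparison works on UTF-16 code units and ignores the locale.
		Console.WriteLine (IsAntisymmetric (StringComparer.Ordinal, s1, s2));
	}
}
```

A False on the first line would point at the collation rather than at SortedList itself.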
Comment 4 Rodrigo Kumpera 2013-01-11 16:52:18 UTC
SortedList requires you to provide monotonically comparable objects. If your collation makes string comparison not work this way, there's nothing to be done.
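
One way to sidestep the collation entirely, sketched here under the assumption that a locale-independent ordering is acceptable for the keys, is to pass an explicit comparer to the SortedList constructor. The class name OrdinalSortedList and the helper Build are hypothetical; note that ordinal order follows code unit values, so it will not match what a Danish or Norwegian reader expects from a sorted word list.

```csharp
using System;
using System.Collections.Generic;

public static class OrdinalSortedList {
	// Builds the list with an explicit ordinal comparer instead of the
	// culture-sensitive default, so key ordering does not depend on LANG.
	public static SortedList<string, int> Build (params string[] keys) {
		var words = new SortedList<string, int> (StringComparer.Ordinal);
		foreach (string key in keys)
			words.Add (key, 0);
		return words;
	}

	public static void Main () {
		// Print the keys in the order the list stores them.
		foreach (string w in Build ("fær", "fåe").Keys)
			Console.WriteLine (w);
	}
}
```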
Comment 5 Vidar 2013-01-14 02:29:00 UTC
I just now installed the latest versions of MS .NET, MonoDevelop and Mono on Windows 7. Using MonoDevelop and MS .NET, the test program supplied by Zoltan Varga above ran fine. After switching to the Mono runtime, the program crashed. Clearly something can be done, because the MS runtime produced the expected result.
Comment 6 Marek Safar 2017-10-12 13:35:48 UTC
This also works with .NET Core on macOS.