Bug 31900 - char.ToUpperInvariant fails to convert some characters to upper case and returns different values than in .NET
Alias: None
Product: Class Libraries
Classification: Mono
Component: System
Version: 4.0.0
Hardware: PC Linux
Importance: --- normal
Target Milestone: Untriaged
Assignee: Marek Safar
Depends on:
Reported: 2015-07-13 19:48 UTC by Don Cross
Modified: 2015-07-23 09:07 UTC

Is this bug a regression?: ---
Last known good build:

C# program that shows inconsistent char.ToUpperInvariant behavior (1.52 KB, text/x-csharp)
2015-07-13 19:48 UTC, Don Cross

Description Don Cross 2015-07-13 19:48:00 UTC
Created attachment 12008
C# program that shows inconsistent char.ToUpperInvariant behavior

The attached file Program.cs shows that System.Char.ToUpperInvariant() returns different values on Mono than on .NET on Windows for many non-ASCII characters.  Many characters that have an upper-case form, and are converted correctly on Windows, are left in lower case on Mono.  Examples:

On Windows:
0x0180 ƀ ==> 0x0243 Ƀ
0x01f9 ǹ ==> 0x01f8 Ǹ
0x0271 ɱ ==> 0x2c6e Ɱ

On Mono:
0x0180 ƀ ==> 0x0180 ƀ
0x01f9 ǹ ==> 0x01f9 ǹ
0x0271 ɱ ==> 0x0271 ɱ
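
For a quick check without the attachment, a minimal sketch along these lines prints the mapping for the three code points listed above.  The class name UpperInvariantCheck and the Describe helper are made up for illustration and are not taken from the attached Program.cs; the output format simply mirrors the examples.

```csharp
using System;

static class UpperInvariantCheck
{
    // Formats one mapping in the "0xXXXX c ==> 0xYYYY C" shape used in the examples.
    // Hypothetical helper, not part of the attachment.
    public static string Describe(char c)
    {
        char upper = char.ToUpperInvariant(c);
        return string.Format("0x{0:x4} {1} ==> 0x{2:x4} {3}", (int)c, c, (int)upper, upper);
    }

    static void Main()
    {
        // The three code points quoted in this report.
        foreach (char c in new[] { '\u0180', '\u01f9', '\u0271' })
            Console.WriteLine(Describe(c));
    }
}
```

The hex values are the reliable part of the output; whether the glyphs themselves display correctly depends on the console encoding.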

Possibly related bug:  https://bugzilla.xamarin.com/show_bug.cgi?id=17311
(Might explain why case-insensitive comparison is not working.)

To reproduce this issue, and to see the complete list of discrepancies, use this procedure:
- Compile the attached Program.cs on Mono using mcs.
- Run the program.  It will generate mono_collation.h
- Compile the same Program.cs on Windows using the Microsoft C# compiler (csc).
- Run the program.  It will generate dotnet_collation.h
- Copy both files to one machine and run: diff mono_collation.h dotnet_collation.h
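
If the attachment is not at hand, the comparison can be approximated with a sketch like the following.  This is a hypothetical stand-in, not the attached Program.cs: the class name UpperInvariantDump, the line format, and the choice to list only characters whose upper-case form differs are all assumptions.  Only the two output file names come from the procedure above.

```csharp
using System;
using System.IO;
using System.Text;

static class UpperInvariantDump
{
    // Writes one line per UTF-16 code unit whose invariant upper-case form
    // differs from the input, and returns how many lines were written.
    // The line format is an assumption, not the attachment's format.
    public static int Dump(TextWriter output)
    {
        int count = 0;
        for (int i = 0; i <= 0xFFFF; i++)
        {
            char c = (char)i;
            char upper = char.ToUpperInvariant(c);
            if (upper == c)
                continue;
            output.WriteLine("0x{0:x4} {1} ==> 0x{2:x4} {3}", i, c, (int)upper, upper);
            count++;
        }
        return count;
    }

    static void Main()
    {
        // Type.GetType("Mono.Runtime") is non-null only when running on Mono.
        bool onMono = Type.GetType("Mono.Runtime") != null;
        string path = onMono ? "mono_collation.h" : "dotnet_collation.h";
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            int n = Dump(writer);
            Console.WriteLine("{0} mappings written to {1}", n, path);
        }
    }
}
```

With this format, a character that Mono leaves unchanged but Windows converts shows up in the diff as a line present only in dotnet_collation.h.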

My system information:

don@spearmint:~/bugmono$ uname -a
Linux spearmint 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux
don@spearmint:~/bugmono$ mono --version
Mono JIT compiler version 4.0.2 (Stable Wed Jun 24 10:04:37 UTC 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  amd64
	Disabled:      none
	Misc:          softdebug 
	LLVM:          supported, not enabled.
	GC:            sgen
don@spearmint:~/bugmono$ mcs --version
Mono C# compiler version

Comment 1 Marek Safar 2015-07-23 09:07:28 UTC
Fixed in master