Bug 31900

Summary: char.ToUpperInvariant fails to convert some characters to upper case and returns different values than in .NET
Product: [Mono] Class Libraries Reporter: Don Cross <cosinekitty>
Component: System Assignee: Marek Safar <masafa>
Severity: normal CC: cosinekitty, masafa, mono-bugs+mono
Priority: ---    
Version: 4.0.0   
Target Milestone: Untriaged   
Hardware: PC   
OS: Linux   
Tags: Is this bug a regression?: ---
Last known good build:
Attachments: C# program that shows inconsistent char.ToUpperInvariant behavior

Description Don Cross 2015-07-13 19:48:00 UTC
Created attachment 12008 [details]
C# program that shows inconsistent char.ToUpperInvariant behavior

The attached file Program.cs shows that System.Char.ToUpperInvariant() returns different values on Mono than on .NET on Windows for many non-English characters.  Many characters that are correctly converted to upper case on Windows are left in lower case on Mono.  Examples:

On Windows:
0x0180 ƀ ==> 0x0243 Ƀ
0x01f9 ǹ ==> 0x01f8 Ǹ
0x0271 ɱ ==> 0x2c6e Ɱ

On Mono:
0x0180 ƀ ==> 0x0180 ƀ
0x01f9 ǹ ==> 0x01f9 ǹ
0x0271 ɱ ==> 0x0271 ɱ
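
The three examples above can also be checked in isolation. The following is a minimal sketch, not the attached Program.cs; the class name UpperCheck and the method CountMismatches are made up for illustration, and the expected values are the ones observed on Windows as listed above. It should print one line per character and exit with a non-zero status if any result differs from the Windows value.

```csharp
using System;

// Hypothetical helper, not part of the attached Program.cs.
public static class UpperCheck
{
    // Lower-case inputs taken from the examples in this report.
    static readonly char[] Inputs = { '\u0180', '\u01f9', '\u0271' };

    // Upper-case values reported for .NET on Windows.
    static readonly char[] Expected = { '\u0243', '\u01f8', '\u2c6e' };

    // Returns how many inputs produce a result different from the
    // Windows value when passed through char.ToUpperInvariant.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int i = 0; i < Inputs.Length; i++)
        {
            char actual = char.ToUpperInvariant(Inputs[i]);
            Console.WriteLine("0x{0:x4} ==> 0x{1:x4} (expected 0x{2:x4})",
                (int)Inputs[i], (int)actual, (int)Expected[i]);
            if (actual != Expected[i])
                mismatches++;
        }
        return mismatches;
    }

    public static int Main()
    {
        // Exit status 0 means every sample matched the Windows value.
        return CountMismatches() == 0 ? 0 : 1;
    }
}
```

On an affected Mono build, all three lines would be expected to show the input value unchanged.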

Possibly related bug:  https://bugzilla.xamarin.com/show_bug.cgi?id=17311
(Might explain why case-insensitive comparison is not working.)

To reproduce this issue and see the complete list of discrepancies, use this procedure:
- Compile the attached Program.cs on Mono using mcs.
- Run the program.  It will generate mono_collation.h
- Compile the same Program.cs on Windows using the Microsoft compiler.
- Run the program.  It will generate dotnet_collation.h
- diff mono_collation.h dotnet_collation.h

My system information:

don@spearmint:~/bugmono$ uname -a
Linux spearmint 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux
don@spearmint:~/bugmono$ mono --version
Mono JIT compiler version 4.0.2 (Stable Wed Jun 24 10:04:37 UTC 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  amd64
	Disabled:      none
	Misc:          softdebug 
	LLVM:          supported, not enabled.
	GC:            sgen
don@spearmint:~/bugmono$ mcs --version
Mono C# compiler version
Comment 1 Marek Safar 2015-07-23 09:07:28 UTC
Fixed in master