Bug 10443 - String.StartsWith fails when the string contains character 0
Status: NEW
Alias: None
Product: Class Libraries
Classification: Mono
Component: mscorlib
Version: master
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: Untriaged
Assignee: Atsushi Eno
URL:
Duplicates: 39951
Depends on:
Blocks:
 
Reported: 2013-02-19 11:23 UTC by Adrian Gallero
Modified: 2016-04-16 07:59 UTC
CC List: 4 users

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:



Description Adrian Gallero 2013-02-19 11:23:44 UTC
This fails in Mono 3.0.3; Mono 2.10 works fine. It fails on both Windows and OS X; I have not tested it on Linux.

Steps to reproduce:
Run the following application:


using System;

namespace teststart
{
	class MainClass
	{
		public static void Main (string[] args)
		{
			string char0 = ((Char)0).ToString();
			Console.WriteLine ("Test".StartsWith(char0));
		}
	}
}

Actual Results: True
Expected Results: False

Both Mono 2.10 and Visual Studio return False.

It looks like it is calling some C functions to do the job, and these functions fail to handle character 0. But in C#, character 0 is a valid character in a string.
Comment 1 Marek Safar 2013-02-20 05:01:08 UTC
This looks like a bug in Mono.Globalization.Unicode.MSCompatUnicodeTable.IsIgnorable, which always ignores \0 characters.
Comment 2 Adrian Gallero 2013-08-02 06:33:25 UTC
Hi,

Just checking whether there is any news on this bug. Mono 3 is now out of beta, and this bug causes many serious, subtle errors.
Comment 3 Atsushi Eno 2013-08-05 03:00:03 UTC
There are some changes that I am not even aware of. Whoever made the change should first take a look.
Comment 4 Atsushi Eno 2013-08-05 05:13:13 UTC
Actually, I assume the culprit is my change 8277f4a. It was made in response to a bug report (Novell #687444) that contradicts an earlier report (Novell #319530). There is no detailed analysis of \0 usage in either.

After all, is \0 IGNORABLE or NOT IGNORABLE?

WHY ON EARTH CAN .NET RETURN 1 for "MONO".CompareTo ("\0\0\0") and 0 for "MONO".CompareTo ("MONO\0\0\0"), while both "MONO".IndexOf ("\0\0\0") and "MONO".IndexOf ("MONO\0\0\0") return -1 !?

Can ANYONE explain this LOGICALLY?

Code is logic. Only logical explanation can bring appropriate fix.
Comment 5 Adrian Gallero 2013-08-05 05:56:59 UTC
Hi,

Thanks for the feedback. I wasn't aware that CompareTo behaved differently from the functions I am using (IndexOf, StartsWith, etc.).

Hey, we could even add "==" to the mix:
"MONO" == "MONO\0\0\0" = false
"MONO".CompareTo ("MONO\0\0\0") = 0

Now, it seems to be documented behavior. From the docs for CompareTo:
http://msdn.microsoft.com/en-us/library/84787k22.aspx

Notes to Callers
Character sets include ignorable characters. The Compare(String, String) method does not consider such characters when it performs a culture-sensitive comparison. For example, a culture-sensitive comparison of "animal" with "ani-mal" (using a soft hyphen, or U+00AD) indicates that the two strings are equivalent, as the following example shows.

There is no such remark in, for example, StartsWith, so I guess the behavior is just defined differently depending on the function.
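A minimal sketch of the distinction discussed above, assuming nothing beyond the standard System.String API. The class and helper names (NulComparisonDemo, OrdinalEquals, OrdinalCompareSign, OrdinalIndexOf) are hypothetical and exist only for illustration. The ordinal results are fixed by definition; the culture-sensitive results are printed without an expected value because they depend on the runtime's collation rules.

```csharp
using System;

static class NulComparisonDemo
{
    // Hypothetical helper: ordinal equality compares UTF-16 code units one by one.
    public static bool OrdinalEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // Hypothetical helper: sign of the ordinal comparison (-1, 0 or 1).
    public static int OrdinalCompareSign(string a, string b)
    {
        return Math.Sign(string.CompareOrdinal(a, b));
    }

    // Hypothetical helper: ordinal substring search.
    public static int OrdinalIndexOf(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    static void Main()
    {
        string padded = "MONO\0\0\0";

        // Ordinal operations treat U+0000 as an ordinary character.
        Console.WriteLine("MONO" == padded);                    // False
        Console.WriteLine(OrdinalEquals("MONO", padded));       // False
        Console.WriteLine(OrdinalCompareSign("MONO", padded));  // -1 (shorter string sorts first)
        Console.WriteLine(OrdinalIndexOf("MONO", "\0\0\0"));    // -1

        // Culture-sensitive operations may skip ignorable characters.
        // The output depends on the runtime and its collation data.
        Console.WriteLine("MONO".CompareTo(padded));
        Console.WriteLine(string.Compare("animal", "ani\u00ADmal", StringComparison.CurrentCulture));
    }
}
```

The operator == and the Ordinal overloads never consult collation tables, which is why they disagree with CompareTo in the cases quoted in this thread.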


In my case, after your insights, I've updated all the code to use StringComparison.Ordinal, which does seem to fix the case.
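As a sketch of that workaround, applied to the original test case: the class and helper names (OrdinalPrefixDemo, StartsWithOrdinal) are hypothetical, and only the standard String.StartsWith(String, StringComparison) overload is used.

```csharp
using System;

static class OrdinalPrefixDemo
{
    // Hypothetical helper: prefix test that compares UTF-16 code units only,
    // so no character (including U+0000) is treated as ignorable.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }

    static void Main()
    {
        string char0 = ((char)0).ToString();

        // Culture-sensitive overload: the result depends on the runtime's
        // collation rules (this is the call reported in this bug).
        Console.WriteLine("Test".StartsWith(char0));

        // Ordinal overload: 'T' is compared against '\0' directly.
        Console.WriteLine(StartsWithOrdinal("Test", char0));   // False
    }
}
```

The same StringComparison.Ordinal argument is accepted by EndsWith, IndexOf, LastIndexOf, Equals and Compare.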

So for me this is solved. But I can see it causing problems in general Mono usage, and I am not sure what to suggest to solve it. Maybe add a deprecation warning if the user doesn't specify a culture? (Note that while the culture-less overloads aren't deprecated in .NET, the docs strongly advise against using them, and it is hard to find them in a big codebase without the help of a compiler.)
Comment 6 Atsushi Eno 2013-08-05 06:32:50 UTC
This is about culture-sensitive string comparison (a.k.a. string collation), and we are aware of how it works. We do have a lot of tests that verify that Mono's string behavior matches .NET (mcs/class/corlib/Test/System.Globalization/CompareInfoTest).

However, string operations have been changing in .NET: definitely on Windows, and possibly in the .NET Framework itself. And the "culture-sensitive" behavior is not really explained anywhere. We can implement any *documented* behavior, but for anything else all we can do is implement something that is logically explained and logically expressible in code.
Comment 7 Marek Safar 2016-04-16 07:59:57 UTC
*** Bug 39951 has been marked as a duplicate of this bug. ***
