Bug 10692 - Buggy implementation of Decoder.Convert() method
Summary: Buggy implementation of Decoder.Convert() method
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: mscorlib
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: Untriaged
Assignee: marcos.henrich
URL:
Depends on:
Blocks:
 
Reported: 2013-02-26 12:28 UTC by Gerardo García Peña
Modified: 2016-02-15 17:49 UTC

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
a test case with the different behaviour seen in Mono and MS.NET (1.99 KB, application/x-zip-compressed)
2013-02-26 12:29 UTC, Gerardo García Peña



Description Gerardo García Peña 2013-02-26 12:28:10 UTC
The Mono implementation of the Decoder.Convert() method differs from the Microsoft .NET implementation and from its expected behaviour.

If we read the following UTF-8 byte stream with the Microsoft implementation:

  0x20 0xC3 0xA1 0xC3 0xA9

  We read
    - byte 0x20 and character ' '
    - bytes 0xC3 0xA1 and character 'á'
    - bytes 0xC3 0xA9 and character 'é'

But with the same code (or binary) using the Mono class libraries we get the following behaviour:

  We read
    - bytes 0x20 0xC3 and get the char ' ' (?)
    - bytes 0xA1 0xC3 and get the char 'á' (???)
    - byte 0xA9 and now we get 'é' (??WHAT??)

It seems that the Mono implementation of Convert is broken in two ways:

  - it does not honour the 'flush' flag, which is what makes it possible to decode the byte stream shown above in this way
  - and it consumes more bytes than needed when decoding a one-byte UTF-8 char.
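
For readers without the attachment, here is a minimal sketch of how the byte accounting can be observed. It is not the attached test case: the class and method names are hypothetical, and the choices of a one-char output buffer and flush = true are assumptions made so that each call can produce at most one character. According to the report, Microsoft .NET should then consume 1, 2 and 2 bytes on the three calls, while the affected Mono versions consumed 2, 2 and 1.

```csharp
using System;
using System.Text;

// Hypothetical sketch, not the attached test case.
static class DecoderConvertSketch
{
    // Decodes the input one character per Convert() call and logs how many
    // bytes each call reports as consumed.
    public static string DecodeOneCharPerCall(byte[] input)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        char[] oneChar = new char[1];   // room for a single char per call
        StringBuilder text = new StringBuilder();
        int offset = 0;

        while (offset < input.Length)
        {
            int bytesUsed, charsUsed;
            bool completed;

            // Offer all remaining bytes, but allow only one output char.
            // flush = true is an assumption: the whole input is available.
            decoder.Convert(input, offset, input.Length - offset,
                            oneChar, 0, oneChar.Length, true,
                            out bytesUsed, out charsUsed, out completed);

            Console.WriteLine("bytesUsed={0}, charsUsed={1}, completed={2}",
                              bytesUsed, charsUsed, completed);

            // Guard against looping forever if a call makes no progress.
            if (bytesUsed == 0 && charsUsed == 0)
                break;

            text.Append(oneChar, 0, charsUsed);
            offset += bytesUsed;
        }
        return text.ToString();
    }

    static void Main()
    {
        byte[] input = { 0x20, 0xC3, 0xA1, 0xC3, 0xA9 };
        DecodeOneCharPerCall(input);
    }
}
```

The decoded text should be the same on both runtimes; only the per-call bytesUsed values are expected to show the difference described above.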

I attach a simple test with the output of the Microsoft .NET execution (msnet) and the output of the same binary running under Mono. The binary is compiled with the 'mono-csc' compiler.
Comment 1 Gerardo García Peña 2013-02-26 12:29:34 UTC
Created attachment 3482
a test case with the different behaviour seen in Mono and MS.NET
Comment 2 marcos.henrich 2016-02-15 17:49:02 UTC
I ran the test case with Mono 4.2, and the output is now identical to the one generated by .NET.