Bug 45893 - I18N: EncoderFallbackBuffer.Fallback gets next character instead of the current one.
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: mscorlib
Version: master
Hardware: PC Windows
Importance: --- normal
Target Milestone: Untriaged
Assignee: Alexander Köplinger [MSFT]
 
Reported: 2016-10-25 16:30 UTC by Oleg
Modified: 2017-10-17 12:58 UTC

Tags: bugpool

Description Oleg 2016-10-25 16:30:44 UTC
The EncoderFallbackBuffer.Fallback method receives the next character instead of the current (unencodable) one.

Code to reproduce:
    private static void Reproduce()
    {
        var encoding = Encoding.GetEncoding(1251, new MyEncoderFallback(), DecoderFallback.ExceptionFallback);
        encoding.GetBytes("1ә2");
    }

    private class MyEncoderFallback : EncoderFallback
    {
        public override EncoderFallbackBuffer CreateFallbackBuffer()
        {
            return new MyEncoderFallbackBuffer();
        }

        public override int MaxCharCount => 1;
    }

    private class MyEncoderFallbackBuffer : EncoderFallbackBuffer
    {
        public override int Remaining => 0;

        // Bug: charUnknown is '2' here; the expected value is 'ә'.
        public override bool Fallback(char charUnknown, int index)
        {
            return true;
        }

        public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index)
        {
            return true;
        }

        public override char GetNextChar()
        {
            return '\0';
        }

        public override bool MovePrevious()
        {
            return false;
        }
    }

Fallback receives '2' instead of 'ә'.

For example, in CP1251.cs (other encodings are affected as well):
...
	public unsafe override int GetBytesImpl (char* chars, int charCount,
	                                         byte* bytes, int byteCount)
...
			ch = (int)(chars[charIndex]);
			charIndex++;
			charCount--;
...
						HandleFallback (ref buffer, chars, ref charIndex, ref charCount, bytes, ref byteIndex, ref byteCount);
...

charIndex has already been advanced past ch by the time HandleFallback is called, so the fallback sees the character after the one that failed to encode.

This works correctly with .NET on Windows.
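
The off-by-one can be shown in isolation with a minimal sketch. Everything here is hypothetical and for illustration only: FallbackIndexSketch, IsEncodable and CharSeenByFallback are made-up names, the "anything above ASCII is unencodable" rule is an assumption, and this is neither the Mono I18N code nor the change made in the fix. It only mimics the read-then-increment pattern quoted above.

```csharp
using System;

// Hypothetical stand-in for an encoder loop; not the actual Mono I18N code.
static class FallbackIndexSketch
{
    // Assumption for illustration: only ASCII characters are encodable.
    static bool IsEncodable(char c) => c < 0x80;

    // Returns the character a fallback would be handed for the first
    // unencodable character in 'chars', or '\0' if there is none.
    // useAdvancedIndex = true reads chars[charIndex] after the increment
    // (the behavior described in the report); false reads charIndex - 1.
    public static char CharSeenByFallback(string chars, bool useAdvancedIndex)
    {
        int charIndex = 0;
        int charCount = chars.Length;
        while (charCount > 0)
        {
            char ch = chars[charIndex];
            charIndex++;   // charIndex now points at the character after ch
            charCount--;
            if (!IsEncodable(ch))
            {
                int reported = useAdvancedIndex ? charIndex : charIndex - 1;
                // Guard: the advanced index runs off the end when ch is last.
                return reported < chars.Length ? chars[reported] : '\0';
            }
        }
        return '\0';
    }
}
```

With the input from the repro, FallbackIndexSketch.CharSeenByFallback("1ә2", true) yields '2', matching the reported behavior, while passing false yields 'ә'.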
Comment 1 Marek Safar 2017-09-08 23:25:18 UTC
Full repro:

using System;
using System.Text;

class X
{
	public static void Main ()
	{
		var encoding = Encoding.GetEncoding (1251, new MyEncoderFallback (), DecoderFallback.ExceptionFallback);
		encoding.GetBytes ("1ә2");
	}
}

class MyEncoderFallback : EncoderFallback
{
	public override EncoderFallbackBuffer CreateFallbackBuffer ()
	{
		return new MyEncoderFallbackBuffer ();
	}

	public override int MaxCharCount => 1;
}

class MyEncoderFallbackBuffer : EncoderFallbackBuffer
{
	public override int Remaining => 0;

	// Bug: charUnknown is '2' here; the expected value is 'ә'.
	public override bool Fallback (char charUnknown, int index)
	{
		Console.WriteLine (charUnknown);
		return true;
	}

	public override bool Fallback (char charUnknownHigh, char charUnknownLow, int index)
	{
		return true;
	}

	public override char GetNextChar ()
	{
		return '\0';
	}

	public override bool MovePrevious ()
	{
		return false;
	}
}
Comment 2 Alexander Köplinger [MSFT] 2017-10-16 20:41:23 UTC
This will be fixed with https://github.com/mono/mono/pull/5792, thanks!
Comment 3 Alexander Köplinger [MSFT] 2017-10-17 12:58:41 UTC
PR was merged.
