Bug 28321 (PC) - Encoding.GetString does not decode the text from Shift_JIS encoding correctly
Alias: PC
Product: Class Libraries
Classification: Mono
Component: mscorlib ()
Version: 3.12.0
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
Depends on:
Reported: 2015-03-23 05:14 UTC by Prashant Cholachagudda
Modified: 2018-03-13 15:02 UTC

Is this bug a regression?: ---
Last known good build:

Description Prashant Cholachagudda 2015-03-23 05:14:15 UTC
When we decode Shift_JIS encoded text with the GetString method, we get an incorrect value:

	// Round-trip the multiplication sign (U+00D7) through Shift_JIS.
	var sjisEnc = Encoding.GetEncoding("Shift_JIS");
	byte[] sjisText1 = sjisEnc.GetBytes("×");
	string text1 = sjisEnc.GetString(sjisText1);

The text value is returned as `?` instead of `x`.
Comment 1 Jodie Miu 2017-11-22 08:20:57 UTC
GetString should not be returning `x`. It should be returning `×`: the multiplication sign, not the letter `x`.
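
Since the two characters look alike in many fonts, a small sketch that prints their code points may help keep them apart. The `CodePoints` class and its `Describe` helper are hypothetical names used only for illustration; nothing here comes from the Mono sources.

```csharp
using System;

static class CodePoints
{
    // Formats a char as its Unicode code point, e.g. U+00D7.
    // Hypothetical helper for illustration only.
    public static string Describe(char c)
    {
        return string.Format("U+{0:X4}", (int)c);
    }

    static void Main()
    {
        Console.WriteLine(Describe('\u00D7')); // multiplication sign, the expected result
        Console.WriteLine(Describe('x'));      // Latin small letter x
        Console.WriteLine(Describe('?'));      // question mark, what GetString returned
    }
}
```

The escape `'\u00D7'` is used instead of a literal `×` so the result does not depend on how the source file itself is encoded.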

Note: all of these files are located in `mcs/class/I18N/CJK`.

I traced this bug to the GetChars method in CP932.cs, where the lookup in the convert.jisx0208ToUnicode table returns 0, which it should not. This table comes from the JISConvert class, which loads jis.table from the same directory. A comment at lines 36-42 of CP51932.cs states:

    "FIXME: Some characters such as 0xFF0B (wide "plus") are missing in that table".

I also looked at the table as printed by one of the tools in the tools subdirectory. Indeed, it was missing an entry for 0x817E, the Shift_JIS byte sequence for `×`.
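
To show where 0x817E falls in a JIS X 0208 based table, here is a minimal sketch of the usual Shift_JIS to JIS X 0208 arithmetic. It is not the code from CP932.cs, and the way Mono actually indexes jisx0208ToUnicode may differ. The `SjisToJis` class and its methods are hypothetical names, and the sketch assumes lead bytes only in the ranges 0x81-0x9F and 0xE0-0xEF.

```csharp
using System;

static class SjisToJis
{
    // Converts a two-byte Shift_JIS sequence to its zero-based JIS X 0208
    // position (row * 94 + cell). Returns -1 for bytes outside the
    // double-byte ranges assumed above. Hypothetical helper, not Mono code.
    public static int ToPointer(byte lead, byte trail)
    {
        bool leadOk = (lead >= 0x81 && lead <= 0x9F) || (lead >= 0xE0 && lead <= 0xEF);
        bool trailOk = trail >= 0x40 && trail <= 0xFC && trail != 0x7F;
        if (!leadOk || !trailOk)
            return -1;

        int leadOffset = lead <= 0x9F ? 0x81 : 0xC1;   // close the gap between 0x9F and 0xE0
        int trailOffset = trail < 0x7F ? 0x40 : 0x41;  // skip the unused trail byte 0x7F
        return (lead - leadOffset) * 188 + (trail - trailOffset);
    }

    // Converts the same sequence to the two-byte JIS X 0208 code.
    public static int ToJisCode(byte lead, byte trail)
    {
        int pointer = ToPointer(lead, trail);
        if (pointer < 0)
            return -1;
        int row = pointer / 94;   // zero-based row
        int cell = pointer % 94;  // zero-based cell
        return ((0x21 + row) << 8) | (0x21 + cell);
    }

    static void Main()
    {
        int pointer = ToPointer(0x81, 0x7E);
        // Prints: row 1, cell 63, JIS 0x215F
        Console.WriteLine("row {0}, cell {1}, JIS 0x{2:X4}",
            pointer / 94 + 1, pointer % 94 + 1, ToJisCode(0x81, 0x7E));
    }
}
```

Under these assumptions 0x817E maps to row 1, cell 63 (JIS 0x215F), the position of the multiplication sign in JIS X 0208, so a table with no entry there would plausibly produce the 0 described above.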
Comment 2 Marek Safar 2018-03-13 15:02:11 UTC
Fixed in master