Bug 28321 (PC) - Encoding.GetString does not decode the text from Shift_JIS encoding correctly
Summary: Encoding.GetString does not decode the text from Shift_JIS encoding correctly
Status: NEW
Alias: PC
Product: Class Libraries
Classification: Mono
Component: General
Version: 3.12.0
Hardware: PC Mac OS
Importance: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2015-03-23 05:14 UTC by Prashant Cholachagudda
Modified: 2017-11-22 08:20 UTC

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:


Description Prashant Cholachagudda 2015-03-23 05:14:15 UTC
When we decode Shift_JIS-encoded text with the GetString method, it returns an incorrect value:

	var sjisEnc = Encoding.GetEncoding("Shift_JIS");
	// Encode the multiplication sign (U+00D7), then decode the same bytes back.
	byte[] sjisText1 = sjisEnc.GetBytes("×");
	string text1 = sjisEnc.GetString(sjisText1);

The text value is returned as `?` instead of `x`.
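
To tell whether the encode step or the decode step goes wrong, it may help to print the intermediate bytes as well. The following is only a sketch: `EncodingRoundTrip` and `Describe` are hypothetical names, not part of any library, and it assumes a runtime where `Encoding.GetEncoding("Shift_JIS")` resolves (on .NET Core the code-pages provider would have to be registered first).

```csharp
using System;
using System.Text;

static class EncodingRoundTrip
{
    // Hypothetical helper: encodes `input` with `enc`, decodes the bytes again,
    // and reports the bytes, the decoded code units, and whether the text survived.
    public static string Describe(Encoding enc, string input)
    {
        byte[] bytes = enc.GetBytes(input);
        string decoded = enc.GetString(bytes);
        string codes = string.Join(" ", Array.ConvertAll(
            decoded.ToCharArray(), c => "U+" + ((int)c).ToString("X4")));
        return string.Format("bytes={0} decoded={1} roundTrip={2}",
            BitConverter.ToString(bytes), codes, decoded == input);
    }

    static void Main()
    {
        // U+00D7 is the multiplication sign from the report above.
        Console.WriteLine(Describe(Encoding.GetEncoding("Shift_JIS"), "\u00D7"));
    }
}
```

If the bytes come out as `81-7E` but the decoded value is `U+003F`, the fault is on the decoding side; if the bytes are already `3F`, the encoder replaced the character before GetString was ever called.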
Comment 1 Jodie Miu 2017-11-22 08:20:57 UTC
`x` is not what GetString should return. It should return `×`, the multiplication sign (U+00D7), not the letter `x`.

Note: all of these files are located in `mcs/class/I18N/CJK`.

I traced this bug down to the GetChars method in CP932.cs, where the lookup in the convert.jisx0208ToUnicode table returns 0, which it should not. The table comes from the JISConvert class, which loads the jis.table file in the same directory. There is a comment on lines 36-42 of CP51932.cs stating:

    "FIXME: Some characters such as 0xFF0B (wide "plus") are missing in that table".

I also looked at the table as printed by one of the tools in the tools subdirectory. Indeed, it was missing an entry for 0x817E, the Shift_JIS byte pair for `×`.
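
For anyone checking the table by hand, here is a rough sketch of the standard Shift_JIS to JIS X 0208 row/cell (kuten) arithmetic. This is not the code in CP932.cs; `SjisKuten` and `ToKuten` are names made up for illustration, and the row-major index printed at the end is only an assumption about how a 94x94 table might be laid out, not a statement about the jis.table format.

```csharp
using System;

static class SjisKuten
{
    // Converts a double-byte Shift_JIS code to a 1-based JIS X 0208 row (ku) and cell (ten).
    // Assumes lead is 0x81-0x9F or 0xE0-0xEF and trail is 0x40-0x7E or 0x80-0xFC.
    public static void ToKuten(byte lead, byte trail, out int ku, out int ten)
    {
        int l = lead <= 0x9F ? lead - 0x81 : lead - 0xC1;    // each lead byte covers two rows
        int t = trail <= 0x7E ? trail - 0x40 : trail - 0x41; // 0..187, skipping the unused 0x7F
        ku = 2 * l + (t < 94 ? 1 : 2);
        ten = (t % 94) + 1;
    }

    static void Main()
    {
        int ku, ten;
        ToKuten(0x81, 0x7E, out ku, out ten);
        Console.WriteLine("{0}-{1}", ku, ten);           // prints "1-63"
        // Assumed row-major position in a 94x94 table (illustrative only).
        Console.WriteLine((ku - 1) * 94 + (ten - 1));    // prints "62"
    }
}
```

Row 1, cell 63 of JIS X 0208 is the multiplication sign, so that is the entry that should map to U+00D7.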
