Bug 7359 - Clipboard.GetText() does not properly retrieve certain Unicode strings
Alias: None
Product: Class Libraries
Classification: Mono
Component: Windows.Forms
Version: 2.10.x
Hardware: Other Linux
Importance: Lowest normal
Target Milestone: Untriaged
Assignee: Bugzilla
Depends on:
Reported: 2012-09-20 16:21 UTC by expebition
Modified: 2017-02-10 11:09 UTC
CC List: 3 users

Tags: mono-community
Is this bug a regression?: ---
Last known good build:

Description expebition 2012-09-20 16:21:25 UTC
Description of Problem:
There are two categories of problems here. First, GetText() cannot read Unicode characters with a code point between 127 and 255, e.g. "abc©" becomes "". Second, GetText() un-escapes literal \u sequences, e.g. "\u2714✔" becomes "✔✔".

Steps to reproduce the problem:
1. Copy problematic text ("abc©" or "\u2714✔") from another application such as Gedit or Firefox
2. Paste it into an application running under Mono on Linux

Actual Results:
See description

Expected Results:
Contents of the clipboard

How often does this happen? 
Every time

Additional Information:
TranslatePropertyToClipboard() never receives a property of type UTF16_STRING or UTF8_STRING. XA_STRING receives \u escapes that are ambiguous: UnescapeUnicodeFromAnsi() cannot tell an escaped character apart from a literal \u sequence in the original text. OEMTEXT receives ISO-2022-escaped UTF-8 strings, which would need to be decoded, and that format cannot be requested via GetText() anyway.
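
The \u ambiguity can be illustrated with a small Python sketch. This is not Mono's code: escape_to_ascii() and unescape_from_ascii() are hypothetical stand-ins, and the sketch assumes the text crosses an ASCII-only channel with non-ASCII characters written as \uXXXX. Under that assumption, a literal "\u2714" typed by the user and an escaped "✔" look identical on the wire, so the receiver cannot recover the original.

```python
import re


def escape_to_ascii(text: str) -> str:
    """Hypothetical sender side: write every non-ASCII character as \\uXXXX."""
    return "".join(ch if ord(ch) < 128 else "\\u%04x" % ord(ch) for ch in text)


def unescape_from_ascii(data: str) -> str:
    """Hypothetical receiver side: turn every \\uXXXX sequence back into a character."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),
        data,
    )


# Six literal ASCII characters (backslash, u, 2, 7, 1, 4) followed by a real check mark.
original = "\\u2714\u2714"

wire = escape_to_ascii(original)        # literal and escaped forms are now identical
round_trip = unescape_from_ascii(wire)  # both are decoded, so the literal text is lost

print(repr(original), "->", repr(wire), "->", repr(round_trip))
```

The round trip turns the literal sequence into a second check mark, which matches the "✔✔" result reported above.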

Slightly related:  https://bugzilla.novell.com/show_bug.cgi?id=596402

Both issues can be fixed with a simple patch:

--- a/mcs/class/Managed.Windows.Forms/System.Windows.Forms/XplatUIX11.cs	Thu Sep 20 10:03:05 2012 -0400
+++ b/mcs/class/Managed.Windows.Forms/System.Windows.Forms/XplatUIX11.cs	Thu Sep 20 16:22:45 2012 +0000
@@ -2730,7 +2730,7 @@
 			//else if (format == "PenData" ) return 10;
 			//else if (format == "RiffAudio" ) return 11;
 			//else if (format == "WaveAudio" ) return 12;
-			else if (format == "UnicodeText" ) return UTF16_STRING.ToInt32();
+			else if (format == "UnicodeText" ) return UTF8_STRING.ToInt32();
 			//else if (format == "EnhancedMetafile" ) return 14;
 			//else if (format == "FileDrop" ) return 15;
 			//else if (format == "Locale" ) return 16;
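
Why requesting UTF-8 avoids both symptoms can be sketched in Python, again only as an illustration and not as Mono's implementation. receive_utf8() is a hypothetical helper, and the sketch assumes the clipboard owner supplies the text as UTF-8 bytes. In that case no escaping layer is involved, so non-ASCII characters and literal \u sequences both survive unchanged.

```python
def receive_utf8(payload: bytes) -> str:
    """Hypothetical receiver: decode a clipboard payload supplied as UTF-8."""
    return payload.decode("utf-8")


# The two strings from the report plus a short Arabic sample.
samples = ["abc©", "\\u2714✔", "مرحبا"]

# Encode each sample as the sender would, then decode it as the receiver would.
results = {text: receive_utf8(text.encode("utf-8")) for text in samples}

for sent, received in results.items():
    print(repr(sent), "->", repr(received), "ok" if sent == received else "MISMATCH")
```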
Comment 1 eb1 2017-02-09 16:37:11 UTC
I can't reproduce the exact problems described with Mono 3.x, and pasting text from gedit works. However, I get a similar problem when pasting text in Arabic script from LibreOffice.

The proposed patch fixes it for Arabic as well.
Comment 2 Alexander Köplinger [MSFT] 2017-02-10 11:09:14 UTC
Fixed by https://github.com/mono/mono/pull/4358, thank you!