Bug 60964 - Linux build fails when python is python3 and using non-8-bit locale
Alias: None
Product: Runtime
Classification: Mono
Component: Build ()
Version: master
Hardware: PC Linux
Importance: --- normal
Target Milestone: ---
Assignee: Zoltan Varga
Depends on:
Reported: 2017-12-04 01:26 UTC by Anssi Mäkinen
Modified: 2017-12-14 09:33 UTC

Is this bug a regression?: ---
Last known good build:

Description Anssi Mäkinen 2017-12-04 01:26:36 UTC
Breaking commit https://github.com/mono/mono/commit/c8bf8c287a5b39709b353aa138d6a991d7d2d663

When using a UTF-8 locale on a distro where python is a symlink to python3, genmdesc.py generates incorrect output for dyn_call. This causes the build to crash when building ilasm.exe. I get a large number of
    Instruction metadata for ... inconsistent
followed by
    wrong maximal instruction length of instruction move (expected 0, got 3)
    * Assertion: should not be reached at mini-amd64.c:6497
and a SIGABRT in native code.

The original Perl version of genmdesc, as well as genmdesc.py run with python2, generates the following line:
    "\x0" "ii\x0" "\xc0" "c"        /* dyn_call */

genmdesc.py with python3 and LANG=en_US.UTF-8 generates the following line:
    "\x0" "ii\x0" "Àc"      /* dyn_call */
where À is encoded as the two bytes 0xc3 0x80, and the build fails later as explained above.

genmdesc.py with python3 and LANG=C throws
    UnicodeEncodeError: 'ascii' codec can't encode character '\xc0' in position 0: ordinal not in range(128)
on line 162, which aborts the build.
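
Both failures seem to come down to how Python 3 encodes the code point U+00C0 when writing text. The following is a minimal sketch that reproduces the two results with explicit codecs instead of changing LANG; the variable names are illustrative and are not taken from genmdesc.py.

```python
# U+00C0 is the character that ends up in the dyn_call line.
ch = "\xc0"

# The single byte that the python2 / Perl output carries at this position.
expected = bytes([0xC0])

# UTF-8 locale: the one code point is written as two bytes.
utf8_bytes = ch.encode("utf-8")

# C locale: the ASCII codec cannot represent the character at all.
try:
    ch.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = str(exc)

print(utf8_bytes)              # two bytes: 0xc3 0x80
print(utf8_bytes == expected)  # False, so the generated table is wrong
print(ascii_error)             # same message as in the traceback above
```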

genmdesc.py with python3 and LANG=en_US.ISO-8859-1 generates the following line:
    "\x0" "ii\x0" "�c"      /* dyn_call */
where the invalid character is the single byte 0xc0, which allows the build to succeed.
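
One way to make the output independent of the locale would be to emit only ASCII and spell every non-alphanumeric byte as a hex escape, as the python2 and Perl output does. The sketch below is an assumption about how that could look, not the actual genmdesc.py code or the eventual fix; `c_string_literals` is a hypothetical helper name. It closes the string literal after each escape, because a C hex escape would otherwise absorb a following hex digit such as the trailing "c".

```python
def c_string_literals(data: bytes) -> str:
    """Format bytes as adjacent C string literals using only ASCII.

    Hypothetical helper for illustration; not part of genmdesc.py.
    """
    parts = []    # finished literals, including their quotes
    current = ""  # contents of the literal being built
    for b in data:
        if b < 0x80 and chr(b).isalnum():
            # Plain ASCII letters and digits are copied through unchanged.
            current += chr(b)
        else:
            # Everything else becomes a hex escape, and the literal is
            # closed so the escape cannot swallow the next character.
            current += "\\x%x" % b
            parts.append('"%s"' % current)
            current = ""
    if current:
        parts.append('"%s"' % current)
    return " ".join(parts)


# The dyn_call bytes as they appear in the python2 output above.
line = c_string_literals(b"\x00ii\x00\xc0c")
print(line)
```

Because the result contains only ASCII characters, writing it to a file gives the same bytes under UTF-8, ISO-8859-1 and the C locale.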

Comment 1 Zoltan Varga 2017-12-13 22:31:45 UTC
Will be fixed by: