Bug 19296 - Process.Start does not accept windows style argument strings
Summary: Process.Start does not accept windows style argument strings
Status: NEW
Alias: None
Product: Runtime
Classification: Mono
Component: io-layer (show other bugs)
Version: unspecified
Hardware: Other Linux
: --- normal
Target Milestone: ---
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2014-04-25 06:52 UTC by Tim Cuthbertson
Modified: 2017-03-26 16:07 UTC (History)
3 users (show)

See Also:
Tags:
Is this bug a regression?: ---
Last known good build:


Attachments

Description Tim Cuthbertson 2014-04-25 06:52:52 UTC
I recently had to figure out how you're supposed to quote an array of strings in order to pass them as another program's arguments using Process.Start(ProcessStartInfo). The result utility is here:

https://github.com/gfxmonk/csharp-quote-argv

I wrote this for Windows, and it works correctly for every input I threw at it (even a morning of randomly generated ASCII data). At least when the invoked binary follows the MS C runtime parsing rules (which is basically everything on windows).

On mono running linux, the same code does completely different stuff - e.g it throws an exception if an arguments contains a single quote, and backslashes are treated very differently.

To reproduce, on a Linux box:

$ git clone https://github.com/gfxmonk/csharp-quote-argv.git
$ make
$ ./test.py "123'"
Unhandled Exception: System.ComponentModel.Win32Exception: ApplicationName='python', CommandLine='-c "import sys; print repr(sys.argv[1:])" 123'', CurrentDirectory=''
  at System.Diagnostics.Process.Start_noshell (System.Diagnostics.ProcessStartInfo startInfo, System.Diagnostics.Process process) [0x00000] in <filename unknown>:0 
  at System.Diagnostics.Process.Start_common (System.Diagnostics.ProcessStartInfo startInfo, System.Diagnostics.Process process) [0x00000] in <filename unknown>:0 
  at System.Diagnostics.Process.Start (System.Diagnostics.ProcessStartInfo startInfo) [0x00000] in <filename unknown>:0 
  at TestMain.Main (System.String[] initialArgs) [0x00000] in <filename unknown>:0 



(you can also `make fuzz` to throw random ASCII data at it until it fails).


I poked into process.c, and it seems to be calling g_shell_parse_argv() with the arguments string to get an array to pass to execv*():
https://github.com/mono/mono/blob/master/mono/io-layer/processes.c#L856

So on linux, this presumably accepts only a UNIX shell style string. But if the same code is to ever run on both windows and linux, it seems the only viable choice would be to instead use CommandLineToArgvW to extract an array just like windows would:

http://msdn.microsoft.com/en-us/library/windows/desktop/bb776391%28v=vs.85%29.aspx

Changing this could definitely break existing code relying on the current behaviour, but the alternative is forever making people detect and implement two different escaping schemes if they hope to run this code cross-platform.
Comment 1 Mikayla Hutchinson [MSFT] 2014-04-25 09:31:11 UTC
FWIW, the only process argument escaping that I found to work reliably across Mono and .NET was \\ and \", but that's actually all you need. See e.g. https://github.com/mono/monodevelop/blob/master/main/src/core/MonoDevelop.Core/MonoDevelop.Core.Execution/ProcessArgumentBuilder.cs
Comment 2 Tim Cuthbertson 2014-04-25 20:40:31 UTC
That code isn't correct on Windows, though. If you always escape backslashes, you end up with twice as many backslashes as you should. Using the above linked code, it fails with anything containing a backslash:

>python test.py "foo\bar"

['foo\\bar']
['foo\\\\bar']
       ^

The first line is the orignal args that I'm going to send, the second is the arguments as printed from running `python -c import sys; print repr(sys.argv[1:) <quoted-arguments>`.

If you use this scheme but instead escape backslashes only where windows requires it, you end up dropping backslashes on linux whenever you have multiple backslashes in a row. Which works in more cases, but is still wrong.
Comment 3 Mikayla Hutchinson [MSFT] 2014-04-29 18:02:23 UTC
It worked fine for me: https://gist.github.com/11413092
Comment 4 Mikayla Hutchinson [MSFT] 2014-04-29 18:22:10 UTC
Oh, actually, you're right. Strange, I could have sworn that worked on Windows before.

I guess a workaround would be code that branches based on whether it's running on Mono or .NET.
Comment 5 Mikayla Hutchinson [MSFT] 2014-04-29 18:26:18 UTC
Something like this: https://gist.github.com/mhutch/edb01e3333b7480381c4
Comment 6 Tim Cuthbertson 2014-04-29 18:57:26 UTC
> I guess a workaround would be code that branches based on whether it's running
on Mono or .NET.

Right, yes that would work. And I may add that to my code, but first I'd like to find out authoritatively whether this is _intended_ behaviour (and therefore going to stay that way), or just what mono happens to do because nobody thought too hard about it (and therefore should be fixed).

Barely anyone seems to get the quoting right on windows, so the chances of folks remembering to include similar-but-subtly-different-and-probably-untested code for mono seems vanishingly small. So it seems in mono's best interests to do what windows does here (even if the quoting scheme is somewhat nuts).
Comment 7 Mikayla Hutchinson [MSFT] 2014-04-29 19:42:07 UTC
AFAIK this was originally implemented using g_shell_parse_argv which performs Unix-style parsing of arguments (though Mono now uses its own slightly different eglib implementation instead of glib).

Windows-like argument parsing would be preferable for compatibility, but this cannot easily be changed since it would break programs that depend on the existing behaviour.
Comment 8 Mikayla Hutchinson [MSFT] 2014-04-29 19:43:07 UTC
TBH I think half the problem is that very few people understood/understand how Windows does it...
Comment 9 matthi.d 2017-03-26 15:21:22 UTC
I encountered the same problem and therefore found this bug.

While it would obviously break some programs its really a problem as it is right now. And mono always had the philosophy that different behavior is a bug.

See http://www.mono-project.com/docs/faq/technical/

> If the behavior of an API call is different, will you fix it?

> Yes, we will. But we will need your assistance for this. 
> If you find a bug in the Mono implementation, please fill a bug report in http://bugzilla.xamarin.com. 
> Do not assume we know about the problem, we might not, and using the bug tracking system helps us organize the development process.


The escaping is actually documented now in the .netcore code: https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.Process/src/System/Diagnostics/Process.Unix.cs#L443-L522
And on MSDN: https://msdn.microsoft.com/en-us/library/17w5ykft.aspx

My testcase is:

    // we want to execute /bin/bash -c "echo \"\\$(ARCH)\""
    // to output "$(ARCH)"
    // Convert via https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.Process/src/System/Diagnostics/Process.Unix.cs#L443-L522
    // Documentation: https://msdn.microsoft.com/en-us/library/17w5ykft.aspx
    let s = ProcessStartInfo(bashWindows, """ "-c" "echo \"\\$(ARCH)\"" """)
    s.RedirectStandardOutput <- true
    s.UseShellExecute <- false
    let p = Process.Start(s)
    printfn "Output: %s" (p.StandardOutput.ReadToEnd())
    p.WaitForExit()

This prints '$(ARCH)' on windows but '/bin/bash: ARCH: command not found' on Unix.

Another problem is: Where is the current behavior documented? For me it doesn't make any sense as it is right now.
Comment 10 matthi.d 2017-03-26 16:07:08 UTC
For anyone stumbling over this and searching a workaround:

1. Use the MSDN documented methods for parsing and generating command lines.
2. Only on Mono/Unix: Apply .Replace("\\$", "\\\\$").Replace("\\`", "\\\\`") to your command line immediately before passing to ProcessStartInfo or Process.Start

Note You need to log in before you can comment on or make changes to this bug.