Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows:Have native CommandLine in Process #998

Merged
merged 1 commit into from
Feb 7, 2019

Conversation

lowenna
Copy link
Contributor

@lowenna lowenna commented Jan 23, 2019

Signed-off-by: John Howard jhoward@microsoft.com

Windows natively doesn't launch processes through an array of arguments. Instead, it uses a command line single string, which the startup code (eg https://docs.microsoft.com/en-us/previous-versions/ms880421(v=msdn.10)) interprets into an argv set. Golang does identical parsing of the command-line on Windows in commandLineToArgv: https://github.com/golang/go/blob/ff7b245a31394b700a252fd547cf16ad0ad838b6/src/os/exec_windows.go#L100-L174

See the MSDN CreateProcess documentation for the actual API into Windows: https://docs.microsoft.com/en-us/windows/desktop/api/processthreadsapi/nf-processthreadsapi-createprocessa

Using args alone on Windows has possible loss of fidelity causing some commands to not operate as expected. (This has long been the case on Windows in docker for example). A contrived (but real) example in a test case in the moby repo where fidelity is lost is the following command-line:

cmd /S /C mkdir "c:/foo"

When parsed into a JSON array (such as the args), this becomes 5 elements as follows:

  • cmd
  • /S
  • /C
  • mkdir
  • c:/foo

Note specifically that the lost information are the double-quotes around c:/foo. When using the required contatenation/space separation, and argument escaping required on Windows (https://github.com/golang/sys/blob/c4afb3effaa53fd9a06ca61262dc7ce8df4c081b/windows/exec_windows.go#L9-L18), this becomes the following command line:

cmd /S /C mkdir c:/foo

When the double-quotes are missing, mkdir here fails, but with the double-quotes succeeds as expected, as shown in the following screenshot.

image

The addition of a full commandLine in Process for use on Windows alleviates issues where fidelity can be lost.

If commandLine is omitted, the Windows runtime will fall-back to existing behaviour of contatenation/escaping described previously.

@crosbymichael PTAL

@dmcgowan & @jterry75 FYI.

@jterry75
Copy link
Contributor

@thaJeztah
Copy link
Member

When parsed into a JSON array (such as the args), this becomes 5 elements as follows:

Would the solution be to not ignore quotes when converting to JSON? i.e.

["cmd", "/S", "/C", "mkdir", "\"c:/foo\""]

@lowenna
Copy link
Contributor Author

lowenna commented Jan 23, 2019

Would the solution be to not ignore quotes when converting to JSON? i.e.

@thaJeztah No, that still won't work (argv escaping)

@lowenna
Copy link
Contributor Author

lowenna commented Jan 23, 2019

@jterry75 Updated

@lowenna lowenna force-pushed the jjh/commandline branch 3 times, most recently from eb5e1fb to 6a6ef8e Compare January 23, 2019 23:18
@lowenna
Copy link
Contributor Author

lowenna commented Jan 23, 2019

Fixed CI issues. Ready for review.

@crosbymichael
Copy link
Member

crosbymichael commented Jan 24, 2019

LGTM

Approved with PullApprove

@jterry75
Copy link
Contributor

@jhowardmsft - Might be worth adding a few unit tests to https://github.com/opencontainers/runtime-tools once this merges with the different Windows Cmdline vs Args and to update the default generator to Cmdline and not Args

@@ -50,13 +50,15 @@
"process": {
"type": "object",
"required": [
"cwd",
"args"
"cwd"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crosbymichael - Is there a way to do required: oneOf{ "args", "commandLine" } or something here?

Copy link
Contributor

@jterry75 jterry75 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lowenna
Copy link
Contributor Author

lowenna commented Jan 25, 2019

Hold fire merging this until I have verified the whole end-to-end scenario

@lowenna
Copy link
Contributor Author

lowenna commented Feb 1, 2019

OK, end-to-end has been validated. This is good to go if another maintainer can review please. Thanks.

@lowenna
Copy link
Contributor Author

lowenna commented Feb 1, 2019

@tianon Any chance you could look at this quickly?

@tianon
Copy link
Member

tianon commented Feb 2, 2019

This really makes no sense to me -- don't Windows applications still have argc and argv? Don't they have explicit argument lists?

In rough Go pseudocode, I would imagine something like RUN mkdir "foo" to turn into exec.Command(`cmd`, `/S`, `/C`, `mkdir "foo"`) (using backtick-style strings to illustrate simpler), which should pass the entire mkdir "foo" string as-is to cmd.exe for further quote parsing, right? Why doesn't Windows have a way to execute a process directly without Go having to do insane reverse quoting just to get it to do the "right" thing and launch a process with a given arbitrary argv?

@lowenna
Copy link
Contributor Author

lowenna commented Feb 2, 2019

@tianon I guess it doesn't when having a Linux background ;) If you look at the CreateProcess win32 API (which is what actually gets called from HCS), it operates entirely through a commandline which is subsequently parsed by the C/C++/go/ startup runtime into a set of argv parameters that are universally understood. It's just unfortunate history as to the way Windows works. You can see some of this 'insanity' in the golang startup code at https://github.com/golang/go/blob/ff7b245a31394b700a252fd547cf16ad0ad838b6/src/os/exec_windows.go#L100 where it translates the actual command line provided from CreateProcess into an argv array.

@lowenna
Copy link
Contributor Author

lowenna commented Feb 2, 2019

Why doesn't Windows have a way to execute a process directly without

Don't shoot the messenger. I'm just saying what we have to work with here 😇

@tianon
Copy link
Member

tianon commented Feb 2, 2019 via email

@lowenna
Copy link
Contributor Author

lowenna commented Feb 2, 2019

I forgot. Sorry 😇

Unfortunately legacy design quirk is reality/how Windows still works today. If I had a time machine things might be different.....

image

Perhaps if you see this commit moby/moby@3d1c46d (in flux, although capacitor free, it's still being iterated on, but look at moby/moby#38541 for latest), it will start to make more sense. It's absolutely impossible to get moby to use containerd for its runtime, while preserving existing dockerfile compatibility, without moving to what is natural on Windows for launching process which is a full command line, not a set of argv which need escaping prior to building a command line. Consistency and standardisation are all well and good when there is parity between platforms. This is not the case though here.

@lowenna
Copy link
Contributor Author

lowenna commented Feb 2, 2019

@tianon See my message on slack too.

@lowenna
Copy link
Contributor Author

lowenna commented Feb 2, 2019

Perhaps a real example where fidelity is lost might help.

package main

import (
	"fmt"
	"strings"

	"golang.org/x/sys/windows"
)

// Demonstrate why converting a command line into an array of
// args, then trying to rebuild the command line loses fidelity.
// In other words, showing why Windows needs a full command line
// without any other conversion.

func main() {
	cl := `cmd /s /c mkdir "c:/foo"`
	
	fmt.Printf("Original: '%s'\n", cl)

	args := commandLineToArgv(cl)
	for i, v := range args {
		fmt.Printf(" - %d: %s\n", i, v)
	}

	convertedBackToCommandLine := ""
	for _, v := range args {
		convertedBackToCommandLine += windows.EscapeArg(v) + " "
	}
	convertedBackToCommandLine = strings.TrimSpace(convertedBackToCommandLine)
	fmt.Printf("converted back: '%s'\n", convertedBackToCommandLine)

	if convertedBackToCommandLine != cl {
		fmt.Println("**** FIDELITY HAS BEEN LOST ****")
	}
	
}

// Following are copied directly from golang in the startup code.
// Same as for C/C++/Other languages runtime startup to convert
// command line to argv.

// appendBSBytes appends n '\\' bytes to b and returns the resulting slice.
func appendBSBytes(b []byte, n int) []byte {
	for ; n > 0; n-- {
		b = append(b, '\\')
	}
	return b
}

// readNextArg splits command line string cmd into next
// argument and command line remainder.
func readNextArg(cmd string) (arg []byte, rest string) {
	var b []byte
	var inquote bool
	var nslash int
	for ; len(cmd) > 0; cmd = cmd[1:] {
		c := cmd[0]
		switch c {
		case ' ', '\t':
			if !inquote {
				return appendBSBytes(b, nslash), cmd[1:]
			}
		case '"':
			b = appendBSBytes(b, nslash/2)
			if nslash%2 == 0 {
				// use "Prior to 2008" rule from
				// http://daviddeley.com/autohotkey/parameters/parameters.htm
				// section 5.2 to deal with double double quotes
				if inquote && len(cmd) > 1 && cmd[1] == '"' {
					b = append(b, c)
					cmd = cmd[1:]
				}
				inquote = !inquote
			} else {
				b = append(b, c)
			}
			nslash = 0
			continue
		case '\\':
			nslash++
			continue
		}
		b = appendBSBytes(b, nslash)
		nslash = 0
		b = append(b, c)
	}
	return appendBSBytes(b, nslash), ""
}

// commandLineToArgv splits a command line into individual argument
// strings, following the Windows conventions documented
// at http://daviddeley.com/autohotkey/parameters/parameters.htm#WINARGV
func commandLineToArgv(cmd string) []string {
	var args []string
	for len(cmd) > 0 {
		if cmd[0] == ' ' || cmd[0] == '\t' {
			cmd = cmd[1:]
			continue
		}
		var arg []byte
		arg, cmd = readNextArg(cmd)
		args = append(args, string(arg))
	}
	return args
}

Running this:

S E:\docker\test\cltoargv> go build; .\cltoargv.exe
Original: 'cmd /s /c mkdir "c:/foo"'
 - 0: cmd
 - 1: /s
 - 2: /c
 - 3: mkdir
 - 4: c:/foo
converted back: 'cmd /s /c mkdir c:/foo'
**** FIDELITY HAS BEEN LOST ****
PS E:\docker\test\cltoargv>

And showing how the double quotes are required in the real world:

image

@tianon
Copy link
Member

tianon commented Feb 6, 2019

Ah, so if I'm understanding correctly, cmd.exe is a Windows application which doesn't use int main(...), but instead uses WinMain (https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-winmain), and parses lpCmdLine directly? (that makes a lot more sense and makes me more positive to this change)

@tianon
Copy link
Member

tianon commented Feb 6, 2019

LGTM

Approved with PullApprove

@crosbymichael
Copy link
Member

@opencontainers/runtime-spec-maintainers Any other comments or questions before I merge this?

@vbatts
Copy link
Member

vbatts commented Feb 7, 2019

I get why the argv array can be lossy, it's a shame we don't have a generic fix rather than a platform specific thing. But so it goes

@lowenna
Copy link
Contributor Author

lowenna commented Feb 7, 2019

I said this offline to @tianon, but for completeness, adding it here too:

You essentially have it right, but there's a tiny bit of clarification. Indeed what you say is the case for a GUI app using standard Windows entrypoints. However, for all command line apps using the standard C or C++ runtime, even go programs, they too get invoked by Windows as a command line. In fact, ALL programs on Windows gets invoked by a command line. Not by an array of arguments.

So you would still have a C-style int main(int argc, char* argv) from a coders perspective writing an application on Windows. But the C/C++/Go/ startup code will do additional processing prior to the applications main function being called. That is to convert from a command line to the argv style.

In go, that's the commandLineToArgv function which is called in the init() function of the os package https://github.com/golang/go/blob/ff7b245a31394b700a252fd547cf16ad0ad838b6/src/os/exec_windows.go#L100, so that in the actual programs func main() it can use os.Args to query the golang equivalent of argv[].

In the Microsoft C++ startup code, it does exactly the same processing as documented here prior to invoking main itself: https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017. The processing it describes is documented https://msdn.microsoft.com/en-us/library/windows/desktop/ms683156(v=vs.85).aspx and https://docs.microsoft.com/en-us/windows/desktop/api/shellapi/nf-shellapi-commandlinetoargvw

@thaJeztah
Copy link
Member

I see a lot of background is captured in the discussion here on GitHub; should a summary be created and added as commit message?

@lowenna
Copy link
Contributor Author

lowenna commented Feb 7, 2019

I see a lot of background is captured in the discussion here on GitHub; should a summary be created and added as commit message?

Sure, I'll add that.

Signed-off-by: John Howard <jhoward@microsoft.com>

This adds a new field `CommandLine` in the `Process` structure for use
on Windows. A Windows runtime will check this first and use it as-is when
launching WCOW processes in a container. If omitted, the Windows runtime
will fall back to the previous behaviour of escaping (eg Windows.EscapeArg
in golang) each of the `args` array elements, and space-concatenating them.

The reason for this change is to avoid loss of fidelity for launching
processes. One such example is the following:

`cmd /S /C mkdir "c:/foo"`

When parsed into a JSON array such as `Args`, it becomes 5 elements

 - cmd
 - /S
 - /C
 - mkdir
 - c:/foo

Here, note the lost information being the double-quotes around `c:/foo`.
When using the required contenation, space separation required on Windows,
(https://github.com/golang/sys/blob/c4afb3effaa53fd9a06ca61262dc7ce8df4c081b/windows/exec_windows.go#L9-L18),
this becomes the following command line:

`cmd /S /C mkdir c:/foo`

When the double-quotes are missing, mkdir would fail, but with the
double-quotes, it succeeds as expected:

```
C:\>cmd /s /c mkdir c:/foo
The syntax of the command is incorrect.

C:\>cmd /s /c mkdir "c:/foo"

C:\>
```

The addition of a full `commandLine` in Process for use on Windows alleviates
issues where fidelity can be lost.

Some further background:

For historical reasons, Windows only has native support for launching
processes using a command line. It does not support an argument array as
per Linux. See the `CreateProcess` API documentation on MSDN.

What happens under the covers is that prior to invoking a programs
main, the language runtime will convert the command line to a set of argv[]
suach as in the C-style `int main(int argc, char* argv), or the golang
`os.Args` prior to the programs `main` being invoked, using Windows-
specific rules.

In go, that's the `commandLineToArgv` function which is called in the init()
function of the os package https://github.com/golang/go/blob/ff7b245a31394b700a252fd547cf16ad0ad838b6/src/os/exec_windows.go#L100,

In the Microsoft C++ startup code, it does exactly the same processing
as documented here prior to invoking main itself:
https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017.
The processing it describes is documented at
https://msdn.microsoft.com/en-us/library/windows/desktop/ms683156(v=vs.85).aspx and
https://docs.microsoft.com/en-us/windows/desktop/api/shellapi/nf-shellapi-commandlinetoargvw

Some related links which provide a lot more information about the
very specific (and unique to Windows) command line escaping rules,
and handling of them are below:

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
https://stackoverflow.com/questions/31838469/how-do-i-convert-argv-to-lpcommandline-parameter-of-createprocess
https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017
@lowenna
Copy link
Contributor Author

lowenna commented Feb 7, 2019

@thaJeztah @tianon @crosbymichael

Have updated the commit message to capture the discussion above. Unfortunately, that means that the approvals have been lost due to an updated commit ID.

@vbatts
Copy link
Member

vbatts commented Feb 7, 2019

LGTM

Approved with PullApprove

@caniszczyk
Copy link
Contributor

caniszczyk commented Feb 7, 2019

LGTM

(test to see if this works now)

Approved with PullApprove

@caniszczyk
Copy link
Contributor

ok, we are unblocked

@crosbymichael crosbymichael merged commit 29686db into opencontainers:master Feb 7, 2019
@thaJeztah
Copy link
Member

Thanks for the 😍 commit message, @jhowardmsft

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants