Document the need for additional \-escaping of double quotes (") when calling external programs #2361

Closed
6 tasks done
mklement0 opened this issue Apr 26, 2018 · 14 comments · Fixed by #8053
Labels
area-native-cmds Area - native command support Pri2 Priority - Medium up-for-grabs Tag - issue is open for any contributor to resolve

Comments

@mklement0
Contributor

mklement0 commented Apr 26, 2018

Related: #5152; once PowerShell/PowerShell#1995 gets fixed, this topic will become obsolete.

An additional layer of escaping of " chars. is unexpectedly needed when calling external programs from PowerShell to make sure that the external program sees " embedded in "..." as \", which is what PowerShell's own CLI expects too.

This is currently not documented, as far as I can tell, neither in about_Parsing nor in about_Quoting_Rules nor in about_Special_Characters.

Update: As an alternative to the manual \-escaping detailed below, you can use the PSv3+ ie helper function from the Native module (in PSv5+, install with Install-Module Native from the PowerShell Gallery), which internally compensates for all broken behavior and allows passing arguments as expected; to use it, simply prepend ie to your invocations; e.g.:
ie pwsh -noprofile -command ' "hi" ' works as expected.


Calling an external program from PowerShell (the external program just happens to be another instance of PowerShell (pwsh) in the following examples):

In order to pass string literal "hi" as a command to an external program from PowerShell, PowerShell's own escaping of embedded " chars. must unexpectedly be supplemented with \-escaping (the following commands are equivalent):

# !! The \-escaping of the " chars. should NOT be necessary; without it, these commands would fail.
PS> pwsh -noprofile -command " \""hi\"" " 
PS> pwsh -noprofile -command " \`"hi\`" "
PS> pwsh -noprofile -command ' \"hi\" '  
hi

As an aside: If you're really calling another PowerShell instance from inside PowerShell, using a script block ({ ... }) to pass the command avoids all quoting headaches and additionally enables support for typed results (not just strings), albeit with the same limitations on type fidelity as with background jobs / remoting.

Note: There are also a number of bugs:

  • Empty-string arguments are quietly removed from the invocation.

  • Values with embedded " are passed incorrectly: in addition to not automatically escaping them (the problem shown above), the situationally necessary enclosure in "..." behind the scenes is not reliably triggered, such as with 3" of snow.

  • In Windows PowerShell only, values with spaces that end in a \ char. are passed incorrectly.

For details and workarounds, see PowerShell/PowerShell#1995 (comment)
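
For comparison, here is a minimal sketch (in Python rather than PowerShell) of how the three cases above look when an argument array is converted to a Windows command line following the Microsoft C runtime conventions. It uses subprocess.list2cmdline, the helper that Python's standard library applies on Windows; the sample path is made up for illustration, and nothing here reproduces PowerShell's own code.

```python
import subprocess

# Arguments that trip up PowerShell's command-line construction:
# an empty string, an embedded double quote, and a value with spaces
# that ends in a backslash (hypothetical sample path).
args = ["", '3" of snow', "C:\\dir with space\\"]

# list2cmdline applies the Microsoft C runtime quoting rules to the list it is given.
quoted = [subprocess.list2cmdline([arg]) for arg in args]

for arg, text in zip(args, quoted):
    print(repr(arg), "->", text)
```

The empty string is kept as "", the embedded " becomes \", and the trailing \ is doubled so that it does not escape the closing quote.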


Calling the PowerShell CLI from another shell - cmd.exe or bash:

Note: The difficulties discussed next stem primarily from cmd.exe limitations and from how commands are invoked via a single command-line string on Windows.
Calling from Bash / POSIX-like shells works robustly and as expected.


cmd.exe:

On Windows, PowerShell Core now properly recognizes "" as an escaped ", which enables robust escaping, given that "" is also recognized by cmd.exe itself:

# PowerShell *Core* only, on Windows only: use "", which works robustly.
C:\> pwsh.exe -noprofile -command " ""hi   &   dry"" " 
hi  &  dry  # OK

Sadly, "" doesn't work in Windows PowerShell, where \-escaping for PowerShell's sake is also required, which causes problems:

Using \"" (sic) doesn't require escaping of cmd.exe metachars., but doesn't preserve whitespace as-is:

# No extra escaping needed, but whitespace is normalized.
C:\> powershell.exe -noprofile -command " \""hi   &   dry\"" " 
hi & dry   # !! Two spaces were collapsed into one.

Only using \" preserves whitespace as-is, but it additionally requires individual ^-escaping of the following cmd.exe metacharacters inside \"...\" runs: & | < > ^

# Whitespace is faithfully preserved, but cmd.exe metachars. must be ^-escaped
C:\> powershell.exe -noprofile -command " \"hi   ^&   dry\" " 
hi  &  dry  # OK

bash:

Use \" inside "...." strings, and no escaping at all inside '...' strings:

$ pwsh -noprofile -command " \"hi\" " # \-escaping needed for Bash's own sake (nesting ")
$ pwsh -noprofile -command ' "hi" ' # !! NO escaping of " needed

Note that on Unix-like platforms PowerShell's own command-line parsing does not come into play (arguments are invariably passed as an array of literal tokens), and the above commands solely use Bash [non]-escaping to pass an argument with literal contents  "hi" .

Thus:

  • \" is Bash's native way to escape " inside a "..." string.
  • Inside '...', " needs no escaping
    • Caveat: if you tried \" inside '...', the \ chars. would be passed as literals, and the PowerShell command would break.

This asymmetry with how things work on Windows is unfortunate, but unavoidable.
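
The Bash rules listed above can be checked without launching pwsh. The following is a small sketch that uses Python's shlex.split, which tokenizes a string according to POSIX shell quoting rules, as a stand-in for Bash; the variable names are illustrative only.

```python
import shlex

# The same command typed three ways at a Bash prompt.
double_quoted = r'''pwsh -noprofile -command " \"hi\" "'''
single_quoted = r"""pwsh -noprofile -command ' "hi" '"""
over_escaped = r"""pwsh -noprofile -command ' \"hi\" '"""

# The last token is the argument that pwsh would receive after -command.
arg_double = shlex.split(double_quoted)[-1]
arg_single = shlex.split(single_quoted)[-1]
arg_over = shlex.split(over_escaped)[-1]

print(repr(arg_double))
print(repr(arg_single))
print(repr(arg_over))
```

The first two forms produce the identical argument with literal double quotes, while the third passes the \ chars. through as literals, which is the caveat noted above.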

Note that calling from PowerShell on Unix-like platforms still requires the extra \-escaping:

PS> bash -c 'echo "one two" '  # !! only prints 'one', because the " are *ignored*
PS> bash -c 'echo \"one two\" '  # OK, but the \-escaping shouldn't be necessary.

While this requirement at least makes for consistent behavior across platforms from the PowerShell side, it is certainly unexpected - and an artificial requirement - for anyone familiar with Unix shell scripting.


The bottom line is:

  • In an ideal world, PowerShell would transparently handle any additional, platform-specific escaping needs for interfacing with external programs behind the scenes, so that all that users need to take care of - on any supported platform - is to follow PowerShell's escaping rules.

  • Sadly, this is not an option if backward compatibility must be preserved.

Version(s) of document impacted

  • Impacts 6.1 document
  • Impacts 6.0 document
  • Impacts 5.1 document
  • Impacts 5.0 document
  • Impacts 4.0 document
  • Impacts 3.0 document
@BrucePay

@mklement0 If you invoke powershell from cmd.exe (or bash or whatever) the double-quoting issues come from the calling language not powershell. There is always a problem when you invoke one language from another because both languages use quoting. Try invoking awk from bash for example. The one scenario where powershell is involved as the calling language is powershell invoking powershell and in that scenario, we avoid the quoting nightmare by supporting scriptblock notation:

powershell.exe {
    Write-Host "This is a message"
}

(though I'm willing to bet that is not documented)

@mklement0
Contributor Author

mklement0 commented Apr 27, 2018

@BrucePay

the double-quoting issues come from the calling language not powershell.

That is also true, but - on Windows - not only.

Yes, you first need to make the calling language happy so as not to cause a syntax error there.

But, due to the anarchy that is Windows argument passing, it is ultimately up to the called program to parse the command line from a single string.

As I've stated, that problem generally doesn't arise on Unix-like platforms, where programs are invoked with an array of literals - no command-line parsing needed (but see below).

PowerShell is actively involved in transforming the command line, which is a conceptual necessity on Windows, and an unfortunate technical one on Unix-like platforms, due to current limitations in CoreFX that will be addressed in .NET Core 2.1 (see below).

The way the transformation is implemented is unfortunate:

For instance:

echoArgs.exe 'Nat "King" Cole'

results in the following command line:

"C:\path\to\EchoArgs.exe" "Nat "King" Cole"

While the '...' were translated into "..." - a sensible transformation, given that most utilities on Windows do not recognize '...' as string delimiters - the embedded " were not escaped, resulting in a broken command line.

To put it differently:

PowerShell should never have put the burden of \-escaping " chars. on the user; echoArgs.exe 'Nat "King" Cole' should be translated into echoArgs.exe "Nat \"King\" Cole" behind the scenes.
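
To make the difference concrete, here is a rough sketch of how a callee would tokenize the two command-line fragments. It uses Python's shlex.split, which follows POSIX shell rules, as a stand-in for the target program's parser; that is an assumption, but for these simple inputs the result agrees with the \" convention described here.

```python
import shlex

# The argument portion of the command line, without and with \-escaping.
broken = '"Nat "King" Cole"'        # embedded " left unescaped
escaped = r'"Nat \"King\" Cole"'    # embedded " escaped as \"

parsed_broken = shlex.split(broken)
parsed_escaped = shlex.split(escaped)

print(parsed_broken)
print(parsed_escaped)
```

In the unescaped form the inner quotes are consumed as syntax and disappear from the argument; only the escaped form delivers Nat "King" Cole intact.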

While - as stated - it is ultimately up to the target program to parse the command line, enclosing arguments in "..." with embedded " chars. escaped as \" is the most widely used convention and - with the notable exception of batch files - works with external utilities of both Windows and Unix heritage.

Even PowerShell itself honors this convention when it is called from the outside: \" is treated as an escaped ", even though PowerShell-internally `, not \, is the escape character.

On Unix-like platforms, PowerShell should only ever have passed the arguments resulting after its parsing and expansion as an array of literals to the target program - the problem of escaping just for the command line would never even arise.

Unfortunately, prior to the addition of ProcessStartInfo.ArgumentList in .NET Core 2.1, it couldn't actually do that, and the way it manually constructs the command line has the same flaws as on Windows.
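
As an illustration of the "array of literals" approach, here is a minimal sketch using Python's subprocess module, which accepts an argument list much as .ArgumentList does. The child process is just a Python one-liner that echoes its first argument; it is chosen only so that the example is self-contained.

```python
import subprocess
import sys

# Child program: echoes its first argument verbatim.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# The argument is handed over as a list element, so the caller does no
# command-line escaping of its own.
result = subprocess.run(
    child + ['Nat "King" Cole'],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.rstrip("\r\n")
print(received)
```

Because the caller never writes a command-line string by hand, the embedded double quotes arrive unchanged with no \-escaping by the user.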

Yes, the behavior is consistent, but it is now broken on all supported platforms.
And it sounds like that brokenness is here to stay, for fear of breaking existing scripts.

The pain of that will be felt more strongly on Unix-like platforms, where calling external utilities is much more common than on Windows.
(And the pain will also be felt when calling into PowerShell, due to how -Command parsing works, but that's a separate story.)

It is one of the real pain points that motivated me to start the conversation about "PowerShell vZeroTechnicalDebt"

In short: The need for the user to \-escape " chars. when calling external programs is far from obvious and therefore needs documenting, which is why I opened this issue.


Indeed, use of a script block - available from within PowerShell only and only when calling PowerShell - is the best way to avoid quoting headaches in that scenario.

Its use is documented in Get-Help about_PowerShell.exe / powershell -?, but (a) not with respect to quoting and (b) it's a good idea to also mention it in the help topics mentioned above.

@bergmeister
Contributor

I should add that there are also scenarios where it is better to replace a double quote with 3 double quotes, as with the ProcessStartInfo.Arguments property in .NET. That requirement ultimately comes from the Windows API that is called at the end (so it is not really PowerShell's fault). The documentation on it says:

A single argument that includes spaces must be surrounded by quotation marks, but those quotation marks are not carried through to the target application. To include quotation marks in the final parsed argument, triple-escape each mark.

@mklement0
Contributor Author

mklement0 commented May 26, 2018

Thanks, @bergmeister, good to know, but that's a slightly different scenario in that the intent there is to pass embedded, literal double quotes - in addition to satisfying syntactic quoting requirements.

ProcessStartInfo.Arguments is itself not plagued by the need for extra escaping the way PowerShell is; that is, if - taking the escaping needs of the calling language into account - you manage to pass a "..." string as part of a string's content, it is passed through to the callee (or, in the case of Unix-like platforms, tokenized as expected; note that .NET Core 2.1 will give you the option to alternatively pass arguments as an array of literals via the new .ArgumentList property.)

The triple-" technique surrounds an embedded "..." string - whose double quotes have syntactic function from the perspective of the callee - with "" on either end, with each of the latter representing an embedded, literal " to the callee.

Unfortunately, however, the triple-" technique doesn't work on Unix-like platforms.

What works on both Windows and Unix is to use \""..."\" rather than """...""".
That is, the embedded literal quotes must be escaped as \" on either end; e.g.:

$psi = [System.Diagnostics.ProcessStartInfo]::new()
$psi.Arguments = 'one \""two in literal quotes"\"'

@sdwheeler
Contributor

@SteveL-MSFT is this worth trying to document?

@SteveL-MSFT
Contributor

@sdwheeler good question. I think we are ok with just adding an example where we execute a native command with one that requires literal quotes.

@mklement0
Contributor Author

mklement0 commented Jun 18, 2018

@SteveL-MSFT:

Given how much of a pain point quoting for external programs is (including calling another instance of PowerShell), please consider a dedicated about_* topic that covers it; e.g. about_Native_Quoting_Rules (though I wonder if there's a better word than "native") - or at least add a dedicated section to about_Quoting_Rules.

As stated, with CLI-savvy Unix users starting to use PowerShell who expect friction-less calls to external utilities, the pain will increase - see PowerShell/PowerShell#6935 for a recent example.

What's needed is a systematic explanation of how quoting does not work as it should with external programs, and how to work around that - not just an example or two.

In addition to what I've discussed above, such a topic could cover additional pitfalls:

@SteveL-MSFT
Contributor

@mklement0 Would you be willing to submit a draft of this about topic?

@mklement0
Contributor Author

Thanks, @SteveL-MSFT, but I prefer that someone else take this on.

@2BitSalute

I would also like to see an example of passing in JSON (from a literal string or from a file) to an external program. It's a common enough pattern that it would be good to document for those looking for answers to why what they do in, e.g., bash, doesn't work in PowerShell.

@mklement0
Contributor Author

mklement0 commented Oct 16, 2018

Good suggestion, @2BitSalute; in fact it just came up on SO (which may be how you found this post).

In short, you currently must manually \-escape all embedded " instances in a JSON string in order to pass it as an argument to an external program - which is both unexpected and a nuisance.
(Sending the string via the pipeline (stdin) is not affected.)

# Sample JSON
PS> $json = '{ "hello": "world" }'

# Pass to Bash for echoing the argument.
PS> bash -c 'echo $1' - $json
{ hello: world }   # !! Double quotes were stripped.

# Cumbersome workaround: \-escape embedded " chars.
PS> bash -c 'echo $1' - ($json -replace '"', '\"')
{ "hello": "world" }  # OK

On a side note, the Bash command should really be 'echo "$1"' for as-is echoing, but those " would get stripped too - with no obvious breakage.

So, in summary, what should work as follows:

PS> bash -c 'echo "$1"' - $json

currently has to be:

PS> bash -c 'echo \"$1\"' - ($json -replace '"', '\"')

Backward compatibility is the only reason to hang on to these obscure requirements.
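
The stripping of the double quotes and the effect of the workaround can be modeled in a few lines. The sketch below is a simplified model, not PowerShell's actual implementation: naive_command_line_arg is a hypothetical function that wraps a value in "..." when it contains whitespace without escaping embedded ", and shlex.split (POSIX rules) stands in for the callee's parser.

```python
import json
import shlex


def naive_command_line_arg(value: str) -> str:
    """Hypothetical model: add enclosing "..." if needed, escape nothing."""
    return f'"{value}"' if any(ch.isspace() for ch in value) else value


def received_by_callee(arg_text: str) -> str:
    """Stand-in for the target program's parsing of a single argument."""
    return shlex.split(arg_text)[0]


payload = '{ "hello": "world" }'

# Passed as-is: the embedded quotes are consumed as syntax.
stripped = received_by_callee(naive_command_line_arg(payload))

# Same idea as ($json -replace '"', '\"') in the workaround above.
worked_around = received_by_callee(
    naive_command_line_arg(payload.replace('"', r'\"'))
)

print(stripped)
print(worked_around)
print(json.loads(worked_around))
```

Under this model the unescaped value arrives as { hello: world }, matching the output shown above, while the \-escaped value arrives as valid JSON.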

@aminya

aminya commented Feb 16, 2020

How can I write a command that works on both Windows and Linux? This is crucial for GitHub Actions workflow files.

Edit:
On GitHub Actions, using shell: pwsh (PowerShell 6) and running scripts like the following works on both Windows and Linux (using ' around the command and escaping " using \):

julia -e 'using Pkg; Pkg.add(PackageSpec(url = \"https://github.com/aminya/SnoopCompile.jl\", rev = \"multios\"));'

So the only difference with bash is the need for explicit escaping of " using \

@mklement0
Contributor Author

An update for those not following the discussion in PowerShell/PowerShell#1995:

PowerShell/PowerShell#1995 (comment) states that no action will be taken to fix the underlying problem in v7.1

In the meantime, as an alternative to the currently required workarounds for passing arguments properly to external programs from PowerShell - which would break if and when the underlying issues get fixed - you can use the PSv3+ ie helper function from the Native module (in PSv5+, install with Install-Module Native from the PowerShell Gallery), which internally compensates for all broken behavior and allows passing arguments as expected; to use it, simply prepend ie to your invocations; e.g.:
ie pwsh -noprofile -command ' "hi" ' works as expected.

mcdonaldseanp added a commit to mcdonaldseanp/bolt that referenced this issue Feb 16, 2021
This commit updates the PuppetBolt powershell module to correctly escape hash
parameters for linux platforms. Conversation in this thread:
MicrosoftDocs/PowerShell-Docs#2361 describe the
circumstances of why interpolation breaks.

The fix is to include backslash escape chars for all quotes in hash parameters
if the powershell function is running on linux
@Luiz-Monad

escape
sweet lord of escapes !
