Update Windows exe search order (phase 2) #358

Closed
ChrisDenton opened this issue Mar 25, 2024 · 20 comments
Labels
api-change-proposal (A proposal to add or alter unstable APIs in the standard libraries), I-libs-api-nominated (Indicates that an issue has been nominated for discussion during a team meeting), T-libs-api

Comments

@ChrisDenton
Member

ChrisDenton commented Mar 25, 2024

Proposal

Problem statement

On Windows, the Command program search uses slightly different rules depending on whether the environment is inherited or not. To cut a long story short, this is for historical reasons. Way back in pre-1.0 times the intent was to follow our Unix code in searching the child's PATH. However, after several refactorings this became very buggy and only really worked if Command::env was used. In other cases it (unintentionally) fell back to the standard CreateProcess rules.

A while back it was decided to remove the current directory from the Windows Command search path, which I did. At the time I was a bit worried it would affect people, but as it turned out it didn't appear to have much of an impact. Or at least I've not heard of anyone having serious issues with it.

I did, however, preserve some of the buggy env behaviour because I was worried about making too many changes at once. I do think it needs fixing somehow.

Motivating examples or use cases

Assuming that there is an app called hello.exe alongside the current application and also a different hello.exe in PATH:

// Spawns `hello.exe` in the application's directory
Command::new("hello").spawn()
// Spawns `hello.exe` from PATH
Command::new("hello").env("a", "b").spawn()

Background

Windows CreateProcess search order

When using CreateProcess and not setting lpApplicationName, it will attempt to discover the application in the following places (and in this order):

  1. the parent process' directory
  2. the current directory for the parent process
  3. the system directories
  4. the parent process' PATH

This is the order (or similar) used by most Windows applications and runtimes.

Rust's Unix search order

  1. the child process' PATH

Note: Rather than using execvpe, Rust sets the environment after forking and then uses execvp. See https://github.com/rust-lang/rust/blob/ed49386d3aa3a445a9889707fd405df01723eced/library/std/src/sys/pal/unix/process/process_unix.rs#L395
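A minimal sketch of how one might observe this on a Unix system. The helper name `status_with_child_path` is made up for illustration, and the sketch assumes a POSIX-like system where the program is found through PATH rather than given as an explicit path.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};

/// Hypothetical helper: run `program` with the child's PATH replaced by
/// `child_path` and return the result of spawning and waiting for it.
fn status_with_child_path(program: &str, child_path: &str) -> io::Result<ExitStatus> {
    Command::new(program)
        // Only the child's environment is changed; the parent's PATH is untouched.
        .env("PATH", child_path)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
}
```

On a typical Linux system, calling this with `"sh"` and a PATH that names only a directory that does not exist is expected to return an error, even though the parent's PATH contains `sh`, because the lookup happens against the child's environment.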

Rust's Windows search order

  1. the child process' PATH but only if the child environment is not inherited.
  2. the parent process' directory
  3. the system directories
  4. the parent process' PATH

Obviously this leads to some inconsistencies depending on whether Command::env is used or not.
The original intent was to do only step 1, so this search order is somewhat accidental.
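The order listed above can be modelled as a small pure function. This is only a sketch of the list as written at the time of the proposal, not the standard library's implementation; `SearchInputs` and `current_windows_order` are hypothetical names.

```rust
use std::path::PathBuf;

/// Hypothetical inputs to the search; none of these names come from std.
struct SearchInputs {
    app_dir: PathBuf,          // directory containing the parent's executable
    system_dirs: Vec<PathBuf>, // the Windows system directories
    parent_path: Vec<PathBuf>, // entries of the parent's PATH
    child_path: Vec<PathBuf>,  // entries of the child's PATH
}

/// Directories in the order listed above. The child's PATH comes first,
/// but only when the child environment is not inherited.
fn current_windows_order(inputs: &SearchInputs, env_inherited: bool) -> Vec<PathBuf> {
    let mut dirs = Vec::new();
    if !env_inherited {
        dirs.extend(inputs.child_path.iter().cloned());
    }
    dirs.push(inputs.app_dir.clone());
    dirs.extend(inputs.system_dirs.iter().cloned());
    dirs.extend(inputs.parent_path.iter().cloned());
    dirs
}
```

With the same inputs, flipping `env_inherited` changes which directory is tried first, which is the inconsistency behind the `hello.exe` example.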

Solution sketch

There is a tension here between being consistent cross-platform and being consistent with non-Rust applications on Windows. We'd also prefer not to break existing applications.

Trying to keep everyone happy is difficult, if not impossible, but I think we can do better than we currently do. With that in mind, I would like the search order to be consistently:

  1. the parent process' directory
  2. the system directories
  3. the child process' PATH but only if the child environment is not inherited.
  4. the parent process' PATH

This is more or less the same as now except that the parent process' directory and system directories are consistently searched first.

I'd love to only check either the parent's or child's PATH, not both, but I worry that would be too breaking.

Alternatives

  • Keep the status quo
  • Be more consistent with other Windows applications and don't search the child PATH at all.
  • Be more consistent with other Rust platforms and don't search the parent PATH.
  • Be consistent with neither and only search the parent's PATH (i.e. not the application or system directories).
  • Have a new API for resolving applications, that allows at least some degree of control on how the search is performed. Though that would still need the default behaviour figured out.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@ChrisDenton ChrisDenton added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Mar 25, 2024
@RalfJung
Member

RalfJung commented May 28, 2024

For the status quo you wrote

the current application's parent directory

but then for the proposal this becomes

the application's directory

That sounds like it's not the same thing? However, you're not discussing changing that part of the behavior either, so maybe this is meant to refer to the same thing in both cases? I am confused.


GitHub actions has (WSL) bash.exe in C:\Windows\System32 (the system directory). However, some people want to use (git) bash.exe in C:\Program Files\Git\bin. Currently there is no way to override it (unless you know the env hack, which isn't great).

Oh, that may explain some extremely painful and confusing behavior I experienced on GHA recently. Having the system dirs override PATH is certainly surprising and seems to be inconsistent both with Unix behavior and with the normal Windows behavior, so this is definitely something that should be changed.

There are competing demands in fixing this to be more predictable. Some people want this to be as close to the "normal" Windows behaviour as possible, others strongly favour cross-platform consistency.

It would be good to discuss where on the spectrum of "work like Unix" vs "work like normal Windows programs" this proposal lies.

@ChrisDenton
Member Author

That sounds like it's not the same thing? However, you're not discussing changing that part of the behavior either, so maybe this is meant to refer to the same thing in both cases? I am confused.

I've reworded to hopefully avoid confusion.

It would be good to discuss where on the spectrum of "work like Unix" vs "work like normal Windows programs" this proposal lies.

I've done a small edit to clarify that comment a bit and I'll attempt to expand upon the point later when I have time to write something more succinct. But here's the long version:

  • Unix essentially creates a new process, sets the new environment variables and then searches PATH (and only PATH) for the executable to load.
  • Windows shells (i.e. cmd and powershell) use the currently set PATH to find executables. This is effectively the same as Linux, except that cmd by default will also run applications in the current directory (though this is configurable and powershell does not do this).
  • The Windows API CreateProcess has its own way of finding an executable if it's not given explicitly (scroll down that page to "1. The directory from..."). Most languages use this because they're relatively thin wrappers around this API.

I'll repost the CreateProcess behaviour here. To be clear the current directory and environment variables are that of the parent process:

  1. The directory from which the application loaded.
  2. The current directory for the parent process.
  3. The 32-bit Windows system directory.
  4. The 16-bit Windows system directory.
  5. The Windows directory.
  6. The directories that are listed in the PATH environment variable.

The Unix behaviour is:

  1. The directories that are listed in the child process' PATH environment variable

If Command::env is not used then Rust currently uses a modified version of the CreateProcess behaviour. It removes 2. for security reasons and it also removes the 16-bit system directory. So it's:

  1. The directory from which the application loaded.
  2. The 32-bit Windows system directory.
  3. The Windows directory.
  4. The directories that are listed in the parent process' PATH environment variable.

If Command::env is used then it tries to mimic the Unix behaviour, falling back to the above if the application is not found:

  1. The directories that are listed in the child process' PATH environment variable
  2. The directory from which the application loaded.
  3. The 32-bit Windows system directory.
  4. The Windows directory.
  5. The directories that are listed in the parent process' PATH environment variable.

The suggested change simplifies this to always doing this:

  1. The directory from which the application loaded.
  2. The directories that are listed in the child process' PATH environment variable.
  3. The directories that are listed in the parent process' PATH environment variable.

Due to unifying the two behaviours, this leads to being slightly less Unix-like in the case where Command::env is used but slightly more Unix-like otherwise. To be honest, I would also prefer to pick only one out of 2. and 3. but I suspect that'd break someone.

@RalfJung
Member

RalfJung commented May 28, 2024

Thanks for the clarifications!

The directory from which the application loaded.

Is that what you called "the parent process' directory" above? I thought "the parent process' directory" referred to the current working directory, but "from which the application loaded" sounds more like the directory containing current_exe()?

The directories that are listed in the PATH environment variable.

Oh wow, so this is last normally on Windows? Seems like Windows users could be quite surprised then about Rust giving PATH a lot higher priority. Or maybe it's less "higher priority" and more "skipping the system directories".

@pitaj

pitaj commented May 28, 2024

directory from which the application loaded

This is the directory in which the parent process executable resides, correct? Not the current working directory.

I think the simplification you recommend sounds good. I presume we're allowed to deduplicate (2) and (3) when applicable?

@ChrisDenton
Member Author

Thanks for the clarifications!

The directory from which the application loaded.

Is that what you called "the parent process' directory" above? I thought "the parent process' directory" referred to the current working directory, but "from which the application loaded" sounds more like the directory containing current_exe()?

Yes, that's right. By "current directory" I meant the current working directory and by "application's directory" I meant the directory containing the application as given by current_exe().
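To make the two terms concrete, here is a small sketch using only `std::env`. The function names `application_dir` and `working_dir` are illustrative and not part of any API.

```rust
use std::env;
use std::io;
use std::path::PathBuf;

/// The "application's directory": the folder containing the running executable.
fn application_dir() -> io::Result<PathBuf> {
    let exe = env::current_exe()?;
    exe.parent()
        .map(|dir| dir.to_path_buf())
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "executable has no parent directory"))
}

/// The "current directory": the process's current working directory.
fn working_dir() -> io::Result<PathBuf> {
    env::current_dir()
}
```

The two values often differ, for example when a program installed in one folder is launched from a shell sitting in another.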

The directories that are listed in the PATH environment variable.

Oh wow, so this is last normally on Windows? Seems like Windows users could be quite surprised then about Rust giving PATH a lot higher priority. Or maybe it's less "higher priority" and more "skipping the system directories".

The system directories are in the PATH and usually before the user's paths. So from a practical perspective it makes little difference unless the user has modified their PATH to remove them.

And as noted, shells don't do that so there's already inconsistency.

@ChrisDenton
Member Author

directory from which the application loaded

This is the directory in which the parent process executable resides, correct? Not the current working directory.

I think the simplification you recommend sounds good. I presume we're allowed to deduplicate (2) and (3) when applicable?

Sure but that's a libs optimization, rather than a libs-api question.
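For illustration, such a deduplication might look like the sketch below. It assumes exact path equality is good enough, which is a simplification since Windows paths are usually compared case-insensitively; `merged_path_dirs` is a hypothetical name.

```rust
use std::collections::HashSet;
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Child PATH entries first, then parent PATH entries that were not already seen.
/// The relative order within each PATH is preserved.
fn merged_path_dirs(child_path: &OsStr, parent_path: &OsStr) -> Vec<PathBuf> {
    let mut seen = HashSet::new();
    env::split_paths(child_path)
        .chain(env::split_paths(parent_path))
        // `insert` returns false for a directory that is already in the set.
        .filter(|dir| seen.insert(dir.clone()))
        .collect()
}
```

Because the first occurrence wins, the child's ordering takes priority and the parent's PATH only contributes directories the child did not list.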

@RalfJung
Member

And as noted, shells don't do that so there's already inconsistency.

Yeah seems to be quite the mess.

From a Unix perspective, the most surprising part to me is "The directory from which the application loaded". I guess that's just a sufficiently common pattern on Windows that we have to support it?

@ChrisDenton
Member Author

Yes, I believe so. Windows applications are generally packaged in their own directory rather than being in a soup of other applications. So, at least traditionally, giving high priority and trust to the application's load directory is expected.

@joshtriplett
Member

Note that on UNIX, execvp and company do not search PATH for the new child, they only search the caller's PATH.

@Amanieu Amanieu added I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. and removed I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. labels Feb 11, 2025
@joshtriplett
Member

Discussed in this week's meeting. We do want to drop searching the child's PATH, and only search the parent's PATH.

We had some discussions about whether to drop the application executable's directory. We'd like to understand whether there are any concerns about e.g. running applications directly in download directories. But it sounded like that should be considered separately with a close eye on compatibility, and with some understanding of what Windows already does in other contexts.

@ChrisDenton
Member Author

@joshtriplett We do currently have a test that ensures the child's PATH is searched on all platforms. Which links to this issue: rust-lang/rust#15149

Removing that would mean that Windows deviates from other supported platforms.

@joshtriplett
Member

Fascinating. I don't think we were aware in the meeting that we search the child PATH on UNIX. Renominating for discussion.

@joshtriplett joshtriplett added the I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. label Feb 18, 2025
@ChrisDenton
Member Author

ChrisDenton commented Feb 19, 2025

Ok, I refreshed my memory. I did look into it a year ago, so I had notes. The standard library manually sets the environment after forking and then uses execvp. See https://github.com/rust-lang/rust/blob/ed49386d3aa3a445a9889707fd405df01723eced/library/std/src/sys/pal/unix/process/process_unix.rs#L395. EDIT: I've updated the ACP.

That is surprising but very much intentional. It would have made things simpler for me if we didn't do that but here we are😆.

@ChrisDenton
Member Author

Actually I opened a separate issue about documenting the behaviour of Command on all platforms: rust-lang/rust#137286. We should at least document how it works on Unix as apparently it is a bit surprising.

@pitaj

pitaj commented Feb 19, 2025

Should we provide an option for setting the search order?

use std::path::Path;

enum ExecutableSearchOrder<'a> {
    System,
    Consistent,
    // Borrowed slices need an explicit lifetime to compile.
    Custom(&'a [&'a Path]),
}

@ChrisDenton
Member Author

Investigating that was an option I listed in "Alternatives" but:

Though that would still need the default behaviour figured out.

@ChrisDenton
Member Author

This was discussed in the libs-api meeting but there wasn't a lot of progress made. However, the discussion did make me think a better approach might be this:

  • if command.env("PATH", "...") is used then
    • search the child's PATH only
  • else
    • use the existing search order

This is easy enough to explain in documentation and it kinda matches the Unix behaviour in spirit if you squint a bit (i.e. does the native thing if you don't set PATH, otherwise it searches the given PATH).
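The rule above could be sketched as follows. This only models the decision and is not how the standard library is implemented; `SearchPlan`, `explicit_path` and `plan_search` are hypothetical names. The sketch treats the variable name case-insensitively and treats a removed PATH the same as an unset one, both of which are assumptions.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Which search the sketch would perform; names are illustrative only.
#[derive(Debug, PartialEq)]
enum SearchPlan<'a> {
    /// PATH was set explicitly on the command: search only that value.
    ChildPathOnly(&'a OsStr),
    /// PATH was not set on the command: keep the existing search order.
    ExistingOrder,
}

/// The PATH value explicitly set on `cmd`, if any.
fn explicit_path(cmd: &Command) -> Option<&OsStr> {
    cmd.get_envs()
        .find(|(key, _)| key.eq_ignore_ascii_case("PATH"))
        // A removed variable shows up with a value of `None`.
        .and_then(|(_, value)| value)
}

fn plan_search(cmd: &Command) -> SearchPlan<'_> {
    match explicit_path(cmd) {
        Some(path) => SearchPlan::ChildPathOnly(path),
        None => SearchPlan::ExistingOrder,
    }
}
```

Under this model, setting an unrelated variable such as `env("a", "b")` leaves the plan unchanged, and only an explicit PATH switches the search.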

It also should have a very limited effect. If someone really needs the old behaviour they can just append the application directory (and system directories if necessary) to the PATH they pass to Command.
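A sketch of that workaround, assuming the caller already knows which directory to add (for instance the parent of `std::env::current_exe()`). The helper name `path_with_extra_dir` is made up for this example.

```rust
use std::env::{self, JoinPathsError};
use std::ffi::{OsStr, OsString};
use std::path::Path;

/// Append `extra_dir` to a PATH-style value using the platform's separator.
fn path_with_extra_dir(base_path: &OsStr, extra_dir: &Path) -> Result<OsString, JoinPathsError> {
    let dirs = env::split_paths(base_path).chain(std::iter::once(extra_dir.to_path_buf()));
    // `join_paths` fails if a directory contains the separator character.
    env::join_paths(dirs)
}
```

The result can then be passed along as `Command::new("hello").env("PATH", new_path)`, so the application directory is still searched, just after the directories the caller listed first.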

@joshtriplett
Member

@ChrisDenton That would still break cases like "I added one utility directory to the PATH, and then ran a program that's found in the directory of my binary".

@ChrisDenton
Member Author

rust-lang/rust#137673 fixed the bug with setting any environment variable on Command changing the search order.

@Amanieu
Member

Amanieu commented Apr 1, 2025

In the @rust-lang/libs-api meeting we decided to document the current behavior for now. rust-lang/rust#137673 solved the main issue that people were having, but we may want to re-visit this if people are still not happy with the new behavior.

@Amanieu Amanieu closed this as completed Apr 1, 2025