Fix: Introduce an interface to expose the current `Command` captured env var logic

# Proposal

## Context

The `Command` struct allows developers to execute system commands (such as `bundle install` through a Rust interface). It includes modifying (set) and reading (get): env, current working directory, args, and program name.

When you execute a command, it will resolve both "explicit" environment variables (set via `env`, `envs`, `env_remove`) and implicit "inherited" environment variables (modifiable by `env_clear`) via this logic https://github.com/rust-lang/rust/blob/3a5c8e91f094bb1cb1346651fe3512f0b603d826/library/std/src/sys_common/process.rs#L36-L51 and then pass the result to the operating system. I will refer to this code as environment variable "capture" logic.

There are two problems with this process, detailed in the next section.

When a `Command` runs, the environment variables it uses are affected by:

- Explicit mappings set (via `Command::envs` and `Command::env`)
- Implicit mappings, inherited via the parent process
- Explicitly denied-listed single inherited mapping (via `Command::remove_env`) 
- Implicit deny-list all inherited mappings (via `Command::env_clear`)

To reliably determine the exact environment variables a child process used when it ran, we must have those four pieces of information. There is a public API to determine the explicit mappings `Command::get_envs`. This API also tells us what values were explicitly removed via `Command::remove_env`. However, it does not include information about inherited environment variables. From the [docs](https://doc.rust-lang.org/std/process/struct.Command.html#method.get_envs): 

> When `Command::env_clear` executes, it sets an [internal state of `clear=true`](https://github.com/rust-lang/rust/blob/03b01c5bec658081605ab078ad3fbcdb6b30f6c2/library/std/src/sys_common/process.rs#L78-L81). With this flag, the child process is told to not inherit any environment variable mapping from the parent. 

That means we've got four possible states an environment variable can be in:

- A) env_clear NOT called, explicit mapping set
- B) env_clear called, explicit mapping set
- C) env_clear NOT called, explicit mapping NOT set
- D) env_clear called, explicit mapping NOT set

Without knowing if `env_clear` was called, states A & B and B & D are indistinguishable. This lack of information prevents us from determining the exact environment variables a `Command` used when it ran.

## Problem statement(s)

- Problem 1: Cannot observe effects of `Command::env_clear`
- Problem 2: The "capture" logic has no exposed single source of truth

Problem 1 is a capability problem (I can't work around this). Problem 2 is a usability and stability problem.

### Problem 1: Cannot observe effects of `Command::env_clear`

As documented in [my merged PR](https://github.com/rust-lang/rust/pull/109272), calling `env_clear` will cause the "capture" logic to skip inheriting from the parent process. While a developer can set this value, they cannot programmatically observe if it has been set (though they can manually observe it via "alternative" debug formatting `{:#?}`).

The end goals of programmatically observing that effect is listed below in "Motivation".

This problem prevents me from reproducing the capture logic in a library where I cannot observe if `env_clear` has been called. Even if this problem is solved, a related problem is detailed below.

### Problem 2: The "capture" logic has no exposed single source of truth

Even if we could directly observe the effects of calling `env_clear`, every developer who needed this information would need to reproduce this "capture" logic https://github.com/rust-lang/rust/blob/3a5c8e91f094bb1cb1346651fe3512f0b603d826/library/std/src/sys_common/process.rs#L36-L51 and hope that it does not change or that their implementation does not diverge.

If the outcome of this logic is exposed, then it will:

- Solve problem 1
- Prevent divergent reimplementation of "capture" logic

## Out of scope

- Env var [TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)

Like other system resources (such as files on disk), environment variables may be modified between the time of check and use. This TOCTOU problem is out of the scope of my motivation for opening this issue. If you have that problem, please open a separate issue.

## Motivation, use-cases

I want to use a `Command` struct as the single point of truth for what will run in the future or what already ran, including environment variables.

### Motivation: Context

I maintain the [Ruby buildpack, written in Rust](https://github.com/heroku/buildpacks-ruby). The job of a buildpack is to take in a user's application, perform work, and then output an OCI image. A core part of this work is running system commands. Such as `bundle install` and `rake assets:precompile`. Other buildpacks target other ecosystems: nodejs, python, PHP, go, and Java.

As part of providing this experience, I need to:

- To advertise what command is being run before it runs.
- Give a user detailed information when a command unexpectedly fails after it runs.

To do that with Rust, I want to:

- Programmatically displaying a command's environment variables before it executes for an end-user.
- Programmatically adding system information after a failed command execution for an end-user (e.g., [which_problem](https://docs.rs/which_problem/latest/which_problem/))

I cannot accomplish these while using `& mut Command` as a single point of truth today. See below for more details. 

### Motivation: Programmatically displaying a command's environment variables before it executes for an end-user.

Commands like `rake assets:precompile` depend on environment variables for configuration (like `RAILS_ENV=production`). It's important that I can display the command before it executes because:

- If it freezes, users need to be able to try to reproduce the freeze
- If it fails, users will try to reproduce the failure

To that end, I want my user to see an output in their build log like:

```
Running: $ RAILS_ENV=production bin/rake assets:precompile
```

While I can provide this type of output for my users today, I cannot generalize it into a library for other buildpacks while only consuming `&mut Command` without introducing another point of truth.

Here's an example function:

```rust
// This code is invalid as it cannot determine the actual env state.
//
// Either we can assume `env_clear` was called, and only explicit
// values should be used, or assume it wasn't called
// and then implement the env var inheritance logic manually and
// hope there's no mismatch between our logic and Rust's.
//
pub fn command_name_with_env(
    cmd: &mut Command,
    keys: impl IntoIterator<Item = impl Into<OsString>>,
) -> String {
    let env = cmd
        .get_envs()
        .filter_map(|(k, v)| v.map(|value| (k.to_os_string(), value.to_os_string())))
        .collect::<Vec<(OsString, OsString)>>();

    display_with_keys(cmd, env, keys)
}
```

You could call it like:

```rust
command_name_with_env(&command, ["RAILS_ENV"])
```

### Motivation: Programmatically adding system information after a failed command execution for an end-user (e.g. [which_problem](https://docs.rs/which_problem/latest/which_problem/))

I want to be able to debug `PATH` issues on a `Command`. The developer may accidentally call `env_clear` and fail to realize that it also removes `PATH`, or they might think they added something to the `PATH` but had a typo.

To help debug those cases, I wrote a library that performs a `which` lookup and adds debugging information [which_problem crate](https://github.com/schneems/which_problem). I want to use it to debug when a command has failed. 

Something like this:

```rust
use std::ffi::OsString;
use std::process::Command;
use which_problem::Which;

fn main() {
    println!("Hello, world!");

    let mut command = Command::new("bundle");
    command.arg("install");
    let output = command.output();

    output
        .map_err(|error| {
            eprintln!("Executing command '{command:?}' failed.\nError: {error}");

            match which_problem_from_command(&command).diagnose() {
                Ok(details) => println!("Diagnostic info: {details}"),
                Err(error) => println!("Warning: Internal which_problem error: {error}"),
            }
            error
        })
        .unwrap();
}

// This is not valid as it cannot determine the actual env state.
//
// Either we can assume `env_clear` was called, and only explicit
// values should be used, or we have to assume it wasn't called
// and then implement the env var inheritance logic manually and
// hope there's not a mismatch between our logic and Rust's
//
fn which_problem_from_command(command: &Command) -> which_problem::Which {
    let cwd = command.get_current_dir().map(|path| path.to_path_buf());
    let program = command.get_program().to_owned();
    let path_env = command
        .get_envs()
        .find_map(|(key, maybe_value)| maybe_value.filter(|_| key == &OsString::from("PATH")))
        .map(|value| value.to_owned());

    Which {
        cwd,
        program,
        path_env,
        ..Default::default()
    }
}
```

However, the code must assume something about the state of `Command` as it cannot be directly observed. Giving the wrong path information to the person debugging will actively harm the debugging experience.


## Solution sketches

A quick note on naming things: I've stubbed in names for methods, but I'm not tied to any of them.

Internally the combination of explicit and implicit env vars are retrieved via a `capture` or `captured` method. These are described as the env vars that will be captured (or were previously) by the command. 

I've also considered the following: 

- `canonical` (like used for paths). 
- `resolved`
- `finalized`

They're used interchangeably below.

### [sketch] Expose captured env logic via a new `captured_envs` method on `Command`

> Update: A [sample implementation of this is here](https://github.com/rust-lang/rust/pull/110327)

We could add a method that exposes that  information directly:

```rust
let command = Command::new("ls");

let envs: HashMap<OsString, OsString> = command.captured_envs().collect();
```

This sketch would solve problems 1 and 2. I'm open to naming suggestions.

### [sketch] Expose captured env logic via a pure function that takes `Command` and `VarOs` or an iterator

For example:

```rust
/// General idea, not proposing this exact code, I didn't compile this
///
pub fn capture_envs(command: &Command, inherited: impl IntoIter<(OsString, OsString)>) -> CapturedEnvs {
        let mut result = BTreeMap::<EnvKey, OsString>::new();
        if !command.is_env_clear {
            for (k, v) in inherited {
                result.insert(k.into(), v);
            }
        }
        for (k, maybe_v) in &command.vars {
            if let &Some(ref v) = maybe_v {
                result.insert(k.clone(), v.clone());
            } else {
                result.remove(k);
            }
        }
        CapturedEnvs { result.into_iter() }
    }
```

The benefit of this approach is that it's now a pure function and does not rely on `env::vars_os()` call to the current environment. For this to work, we would also have to expose `Command.is_env_clear` internal state.

This sketch would solve problems 1 and 2. I'm open to naming suggestions.

### [sketch] Expose the clear boolean state

If I had a flag on whether or not clear had been called on the `Command` struct, I would have enough information to resolve environment variables manually. Something like:

```rust
command.is_env_clear();
command.get_env_clear();
```

Internally the state is preserved on `CommandEnvs`, so another option could be to mirror the internal state and expose it on the iterator:

```rust
command.get_envs().is_clear()
```

This sketch would solve problem 1 but not 2. The end programmer must still reimplement the environment variable capture logic, which may diverge. I'm open to naming suggestions.

Even if we choose another solution, we may want to consider exposing this information anyway, as all other explicitly set states on `Command` can be queried except for this one.

### [sketch] Expose captured env logic via method chain method on `Command`

As an alternative to the above `Command::captured_envs` method, we could make the information chainable:

```rust
let command = Command::new("ls");

let envs: HashMap<OsString, OsString> = command.get_envs().capture();
```

Like the above chained `command.get_envs().is_clear()`, this maps somewhat to the internal representation.

Benefits: The methods of `Command` are not changed.
Downsides: More OOP than functional style. It removes the ability to use `Command::captured_envs` in a chain, like calling `map` on an option.

This sketch solves problems 1 and 2. I'm open to naming suggestions.

### [sketch] Something else

Any ideas?

## [sketch] Deprecate `get_envs` and introduce a new interface. 

Deprecate `get_envs` and introduce other APIs. Motivation is provided below. An example could be:

```rust
command.get_removed_envs()
command.get_explicit_envs()
command.get_captured_envs()
```

While this idea is a stretch, I wanted to document that it was possible (in brainstorming, quantity leads to quality) and give some additional context on the existing interface.

When I first saw `get_envs`, I assumed it was referring to getting the state that would be set when it was run rather than the explicit state that must be resolved. 

I was not paying close attention when I first encountered `get_envs`. I  guessed what I thought it meant (and was REALLY wrong).

In the case of a `None` value, it doesn't mean:

> "None was given, fallback to a parent's value." 

It means: 

> "Someone explicitly told me to unset this singular env var and not fallback." 

That behavior is the opposite of what I assumed.

While it's ultimately my mistake to own, I want to explore how we could improve that experience and provide a new interface for the captured/canonical/resolved envs.

I introduced a documentation PR and added some wording in the header, specifically "explicitly," which I hope will catch people's attention. https://github.com/rust-lang/rust/pull/109272, though there's still behavior inconsistency. Of the `get` methods, these have no ambiguity:

- get_args
- get_program

While these have some:

- get_envs
- get_current_dir

I've talked at length about `get_envs`. The other, `get_current_dir`, returns either `Some` if an explicit value was set or `None` if it wasn't. I'm not personally surprised by what happens when a `None` is seen, but I'm not all users. It is confusing that `None` for `get_current_dir` and `None` for `get_envs` means opposite things.

## Links and related work

- PR to update Command docs https://github.com/rust-lang/rust/pull/109272/files
- Because the majority of my work is open source, you can somewhat follow along with my mental state and how I ultimately found this issue, and also why I think it's important to be able to pull out canonical data from `&mut Command`.
	- Initial OO approach using a surrogate struct for `Command`  (written when I knew much less about Rust, so the design and implementation aren't the greatest) https://github.com/heroku/buildpacks-ruby/blob/a63337c764a816122e7ccaea50005cfa85482e90/commons/src/env_command.rs
	- Functional grab-bag approach https://github.com/heroku/buildpacks-ruby/pull/120/commits/cca6c29520cd8fe5b6ec3897c93bfb8f01be67d2. This is closer to a more idiomatic interface, but still not quite what I want.
	-  Work that lead me to discover this `Command` issue: A mix of structs to hold state transitions but relying on `&mut Command` to hold canonical data: https://github.com/heroku/buildpacks-ruby/pull/120/commits/cad093a10656f4a9d2b8adf1dd36b3c5e79fa345 (this commit message may also help you to understand my design goals and needs when working with `Command`)
    - Reddit [question on /r/learnrust](https://www.reddit.com/r/learnrust/comments/11sgsqt/detect_if_clear_env_has_been_called_on_a_command/)

## What happens now?

This issue is part of the libs-api team [API change proposal process]. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

[API change proposal process]: https://std-dev-guide.rust-lang.org/feature-lifecycle/api-change-proposals.html


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix: Introduce an interface to expose the current `Command` captured env var logic #194

Proposal

Context

Problem statement(s)

Problem 1: Cannot observe effects of `Command::env_clear`

Problem 2: The "capture" logic has no exposed single source of truth

Out of scope

Motivation, use-cases

Motivation: Context

Motivation: Programmatically displaying a command's environment variables before it executes for an end-user.

Motivation: Programmatically adding system information after a failed command execution for an end-user (e.g. which_problem)

Solution sketches

[sketch] Expose captured env logic via a new `captured_envs` method on `Command`

[sketch] Expose captured env logic via a pure function that takes `Command` and `VarOs` or an iterator

[sketch] Expose the clear boolean state

[sketch] Expose captured env logic via method chain method on `Command`

[sketch] Something else

[sketch] Deprecate `get_envs` and introduce a new interface.

Links and related work

What happens now?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Fix: Introduce an interface to expose the current Command captured env var logic #194

Description

Proposal

Context

Problem statement(s)

Problem 1: Cannot observe effects of Command::env_clear

Problem 2: The "capture" logic has no exposed single source of truth

Out of scope

Motivation, use-cases

Motivation: Context

Motivation: Programmatically displaying a command's environment variables before it executes for an end-user.

Motivation: Programmatically adding system information after a failed command execution for an end-user (e.g. which_problem)

Solution sketches

[sketch] Expose captured env logic via a new captured_envs method on Command

[sketch] Expose captured env logic via a pure function that takes Command and VarOs or an iterator

[sketch] Expose the clear boolean state

[sketch] Expose captured env logic via method chain method on Command

[sketch] Something else

[sketch] Deprecate get_envs and introduce a new interface.

Links and related work

What happens now?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Fix: Introduce an interface to expose the current `Command` captured env var logic #194

Problem 1: Cannot observe effects of `Command::env_clear`

[sketch] Expose captured env logic via a new `captured_envs` method on `Command`

[sketch] Expose captured env logic via a pure function that takes `Command` and `VarOs` or an iterator

[sketch] Expose captured env logic via method chain method on `Command`

[sketch] Deprecate `get_envs` and introduce a new interface.