Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--nocapture doesn't follow common CLI conventions, making it a stumbling block to people debugging failures #133073

Open
epage opened this issue Nov 15, 2024 · 18 comments · May be fixed by #139224
Labels
A-libtest Area: `#[test]` / the `test` library C-bug Category: This is a bug. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-testing-devex Relevant to the testing devex team (testing DX), which will review and decide on the PR/issue.

Comments

@epage
Copy link
Contributor

epage commented Nov 15, 2024

By convention, users would expect to type in --no-capture. The fact that the argument is --nocapture trips people up, especially as they have to wait for their test to compile before they see the failure. Without spelling suggestions, they need to then consult the help to then remember its without the middle -. Unless someone is doing this all the time to build up muscle memory to counteract intuition, this will trip people up each time.

See also

@epage epage added A-libtest Area: `#[test]` / the `test` library C-bug Category: This is a bug. labels Nov 15, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 15, 2024
@epage
Copy link
Contributor Author

epage commented Nov 15, 2024

I propose we audit libtest's CLI for

  • consistency with conventions
  • self-consistecy
  • consistency with the rest of Rust programs
  • consistency with other test runners

And propose we adjust CLI parameters to meet user expectations. By "adust" I mean that we make new parameters that are aliases for the existing parameters, and deprecate the existing parameters by hiding them in help and providing a deprecation message.

As T-testing-devex is considering first-class custom test harnesses which will be expected to follow libtest's "API", the sooner we nail this down, the more likely they won't need to deal with supporting the deprecated API.

#24451 (comment)

I would personally feel it's a little too late to make a change like this,

If we had made this change in 2015, the community migration cost would have been small. Its larger now but the longer we wait, the larger it becomes. I'm more worried about existing users having problems with this and the large number of potential new users.

@epage
Copy link
Contributor Author

epage commented Nov 15, 2024

Usage: /home/epage/src/personal/cargo/target/debug/deps/cargo-ef836591b671bcbc [OPTIONS] [FILTERS...]

Options:
        --include-ignored
                        Run ignored and not ignored tests
        --ignored       Run only ignored tests
        --force-run-in-process
                        Forces tests to run in-process when panic=abort
        --exclude-should-panic
                        Excludes tests marked as should_panic
        --test          Run tests and not benchmarks
        --bench         Run benchmarks instead of tests
        --list          List all tests and benchmarks
    -h, --help          Display this message
        --logfile PATH  Write logs to the specified file
        --nocapture     don't capture stdout/stderr of each task, allow
                        printing directly
        --test-threads n_threads
                        Number of threads used for running tests in parallel
        --skip FILTER   Skip tests whose names contain FILTER (this flag can
                        be used multiple times)
    -q, --quiet         Display one character per test instead of one line.
                        Alias to --format=terse
        --exact         Exactly match filters rather than by substring
        --color auto|always|never
                        Configure coloring of output:
                        auto = colorize if stdout is a tty and tests are run
                        on serially (default);
                        always = always colorize output;
                        never = never colorize output;
        --format pretty|terse|json|junit
                        Configure formatting of output:
                        pretty = Print verbose output;
                        terse = Display one character per test;
                        json = Output a json document;
                        junit = Output a JUnit document
        --show-output   Show captured stdout of successful tests
    -Z unstable-options Enable nightly-only flags:
                        unstable-options = Allow use of experimental features
        --report-time   Show execution time of each test.
                        Threshold values for colorized output can be
                        configured via
                        `RUST_TEST_TIME_UNIT`, `RUST_TEST_TIME_INTEGRATION`
                        and
                        `RUST_TEST_TIME_DOCTEST` environment variables.
                        Expected format of environment variable is
                        `VARIABLE=WARN_TIME,CRITICAL_TIME`.
                        Durations must be specified in milliseconds, e.g.
                        `500,2000` means that the warn time
                        is 0.5 seconds, and the critical time is 2 seconds.
                        Not available for --format=terse
        --ensure-time   Treat excess of the test execution time limit as
                        error.
                        Threshold values for this option can be configured via
                        `RUST_TEST_TIME_UNIT`, `RUST_TEST_TIME_INTEGRATION`
                        and
                        `RUST_TEST_TIME_DOCTEST` environment variables.
                        Expected format of environment variable is
                        `VARIABLE=WARN_TIME,CRITICAL_TIME`.
                        `CRITICAL_TIME` here means the limit that should not
                        be exceeded by test.
        --shuffle       Run tests in random order
        --shuffle-seed SEED
                        Run tests in random order; seed the random number
                        generator with SEED

For casing, the two that are questionable are

For consistency with the rest of Rust tools

  • --test-threads: Cargo commands refer to this as -j --jobs

For consistency with pytest

  • --no-capture is instead --capture=no with a short flag of -s
    • Other values fd|sys|no|tee-sys

For inspiration from cargo-nextest

  • --ignored, --include-ignored: --run-ignored <default|only|all>
  • --test-threads: -j --thread-threads

If people have other test runners they want to compare, you are welcome to add them! I'm scoping this to the existing, stable features of libtest.

@jieyouxu jieyouxu added T-testing-devex Relevant to the testing devex team (testing DX), which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Nov 15, 2024
@jieyouxu

This comment has been minimized.

@epage
Copy link
Contributor Author

epage commented Nov 15, 2024

My hope is that we can close out the discussion quickly enough that we don't need to split it up. We'll likely need to do an FCP anyways for --nocapture, so might as well look at it holistically.

@jieyouxu

This comment has been minimized.

@epage
Copy link
Contributor Author

epage commented Nov 15, 2024

Hmm, I see why you brought up a tracking issue. My focus on this is purely in naming and not in improving the semantics. My expectation is that we'll be moving filtering into Cargo and re-evaluate the interface then. Moving more logic into Cargo will be unblocked by the json format being stabilized (so Cargo communicates through tests through json) and this will reduce the burden of what custom test harnesses need to implement.

@jieyouxu

This comment has been minimized.

@sminez
Copy link

sminez commented Nov 15, 2024

Is this something that would be possible to pick up once the details are finalised? I'd be interested in taking a look if it was.

@epage
Copy link
Contributor Author

epage commented Nov 15, 2024

From Zulip

Once we have a decision, sure

@epage
Copy link
Contributor Author

epage commented Nov 19, 2024

We talked about this in today's T-testing-devex meeting

  • --nocapture: we were on board with changing this to --no-capture through deprecation/addition
  • --test-threads: we are deferring this to see how things evolve for thread control within the compiler
  • We didn't see anything else worth changing at this time with the assumption that most of the UX responsibility will be shifting to tools calling libtest, like cargo test.

Let's see if this works

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Nov 19, 2024

Team member @epage has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Nov 19, 2024
@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Dec 16, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Dec 16, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Dec 26, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Dec 26, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@djc
Copy link
Contributor

djc commented Dec 27, 2024

@epage thanks for pushing to clean this up!!

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Jan 9, 2025
@epage
Copy link
Contributor Author

epage commented Jan 27, 2025

Something we had overlooked was that RUST_TEST_NOCAPTURE exists which checks for a non-zero value to disable capturing (including empty).

Options

  • Defer this
  • Add RUST_TEST_NO_CAPTURE
  • Switch to positive logic with RUST_TEST_CAPTURE
    • Maybe re-evaluate the true/false condition
  • Deprecate it

Thoughts?

@djc
Copy link
Contributor

djc commented Jan 28, 2025

Adding RUST_TEST_NO_CAPTURE would make sense to me. Switching to positive logic (while I usually prefer it) feels like a mismatch with the CLI, so maybe it doesn't make sense here.

@weihanglo
Copy link
Member

It seems reasonable to do the same deprecation to env vars.

@epage
Copy link
Contributor Author

epage commented Feb 3, 2025

Thinking more on this, I'm leaning towards "defer".

--no-capture is a nice quality of life improvement that I'd rather not block on "bigger questions". For me, the big question I'm starting to wonder about is if we should deprecate the env variables. My suspicion is that they are a low-effort config system that is meant to make up for a lacking in the common test runner (cargo test). In that case, we should instead focus on the common test runner. These could also cause problems as Cargo tries to coordinate the behavior of test binaries. If cargo test is trying to read json output and the user bypasses cargo to tell the test to report other content, that could become problematic.

epage added a commit to epage/rust that referenced this issue Apr 1, 2025
This improves consistency with commonly expected CLI conventions,
avoiding a common stutter people make when running tests (trying what
they expect and then having to check the docs to then user whats
accepted).

An alternative could have been to take a value, like `--capture <value>` (e.g. `pytest` does this).
Overall, we're shifting focus for features to custom test harnesses (see rust-lang#134283).
Most of `pytest`s modes will likely be irrelevant in that situation.
As for the rest, its too early to tell which, if any, may be relevant,
so we're sticking with this small, quality of life improvement.

By deprecating `--nocapture`, we intend that custom test harnesses do
not need to support it for reasons outside of their own compatibility
requirements, much like the deprecation in rust-lang#134283

I'm punting for now on the naming of `RUST_TEST_NOCAPTURE`.
I feel like T-testing-devex should do a wider look at environment
variables role in lib`test` before evaluating whether to
- Deprecate it in favor of the user passing CLI flags or the test runner
  providing its own config
- Deprecate in favor of `RUST_TEST_NO_CAPTURE`
- Deprecate in favor of `RUST_TEST_CAPTURE`

Other CLI flags were evaluated for casing consistency:
- `--logfile` has the same problem but was deprecated in rust-lang#134283

Fixes rust-lang#133073
epage added a commit to epage/rust that referenced this issue Apr 10, 2025
This improves consistency with commonly expected CLI conventions,
avoiding a common stutter people make when running tests (trying what
they expect and then having to check the docs to then user whats
accepted).

An alternative could have been to take a value, like `--capture <value>` (e.g. `pytest` does this).
Overall, we're shifting focus for features to custom test harnesses (see rust-lang#134283).
Most of `pytest`s modes will likely be irrelevant in that situation.
As for the rest, its too early to tell which, if any, may be relevant,
so we're sticking with this small, quality of life improvement.

By deprecating `--nocapture`, we intend that custom test harnesses do
not need to support it for reasons outside of their own compatibility
requirements, much like the deprecation in rust-lang#134283

I'm punting for now on the naming of `RUST_TEST_NOCAPTURE`.
I feel like T-testing-devex should do a wider look at environment
variables role in lib`test` before evaluating whether to
- Deprecate it in favor of the user passing CLI flags or the test runner
  providing its own config
- Deprecate in favor of `RUST_TEST_NO_CAPTURE`
- Deprecate in favor of `RUST_TEST_CAPTURE`

Other CLI flags were evaluated for casing consistency:
- `--logfile` has the same problem but was deprecated in rust-lang#134283

Fixes rust-lang#133073
epage added a commit to epage/rust that referenced this issue Apr 10, 2025
This improves consistency with commonly expected CLI conventions,
avoiding a common stutter people make when running tests (trying what
they expect and then having to check the docs to then user whats
accepted).

An alternative could have been to take a value, like `--capture <value>` (e.g. `pytest` does this).
Overall, we're shifting focus for features to custom test harnesses (see rust-lang#134283).
Most of `pytest`s modes will likely be irrelevant in that situation.
As for the rest, its too early to tell which, if any, may be relevant,
so we're sticking with this small, quality of life improvement.

By deprecating `--nocapture`, we intend that custom test harnesses do
not need to support it for reasons outside of their own compatibility
requirements, much like the deprecation in rust-lang#134283

I'm punting for now on the naming of `RUST_TEST_NOCAPTURE`.
I feel like T-testing-devex should do a wider look at environment
variables role in lib`test` before evaluating whether to
- Deprecate it in favor of the user passing CLI flags or the test runner
  providing its own config
- Deprecate in favor of `RUST_TEST_NO_CAPTURE`
- Deprecate in favor of `RUST_TEST_CAPTURE`

Other CLI flags were evaluated for casing consistency:
- `--logfile` has the same problem but was deprecated in rust-lang#134283

Fixes rust-lang#133073
epage added a commit to epage/rust that referenced this issue Apr 11, 2025
This improves consistency with commonly expected CLI conventions,
avoiding a common stutter people make when running tests (trying what
they expect and then having to check the docs to then user whats
accepted).

An alternative could have been to take a value, like `--capture <value>` (e.g. `pytest` does this).
Overall, we're shifting focus for features to custom test harnesses (see rust-lang#134283).
Most of `pytest`s modes will likely be irrelevant in that situation.
As for the rest, its too early to tell which, if any, may be relevant,
so we're sticking with this small, quality of life improvement.

By deprecating `--nocapture`, we intend that custom test harnesses do
not need to support it for reasons outside of their own compatibility
requirements, much like the deprecation in rust-lang#134283

I'm punting for now on the naming of `RUST_TEST_NOCAPTURE`.
I feel like T-testing-devex should do a wider look at environment
variables role in lib`test` before evaluating whether to
- Deprecate it in favor of the user passing CLI flags or the test runner
  providing its own config
- Deprecate in favor of `RUST_TEST_NO_CAPTURE`
- Deprecate in favor of `RUST_TEST_CAPTURE`

Other CLI flags were evaluated for casing consistency:
- `--logfile` has the same problem but was deprecated in rust-lang#134283

Fixes rust-lang#133073
epage added a commit to epage/rust that referenced this issue Apr 11, 2025
This improves consistency with commonly expected CLI conventions,
avoiding a common stutter people make when running tests (trying what
they expect and then having to check the docs to then user whats
accepted).

An alternative could have been to take a value, like `--capture <value>` (e.g. `pytest` does this).
Overall, we're shifting focus for features to custom test harnesses (see rust-lang#134283).
Most of `pytest`s modes will likely be irrelevant in that situation.
As for the rest, its too early to tell which, if any, may be relevant,
so we're sticking with this small, quality of life improvement.

By deprecating `--nocapture`, we intend that custom test harnesses do
not need to support it for reasons outside of their own compatibility
requirements, much like the deprecation in rust-lang#134283

I'm punting for now on the naming of `RUST_TEST_NOCAPTURE`.
I feel like T-testing-devex should do a wider look at environment
variables role in lib`test` before evaluating whether to
- Deprecate it in favor of the user passing CLI flags or the test runner
  providing its own config
- Deprecate in favor of `RUST_TEST_NO_CAPTURE`
- Deprecate in favor of `RUST_TEST_CAPTURE`

Other CLI flags were evaluated for casing consistency:
- `--logfile` has the same problem but was deprecated in rust-lang#134283

Fixes rust-lang#133073
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-libtest Area: `#[test]` / the `test` library C-bug Category: This is a bug. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-testing-devex Relevant to the testing devex team (testing DX), which will review and decide on the PR/issue.
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

8 participants