Include executed tests in the build metrics (and use a custom test display impl) #108659
Conversation
Force-pushed from f56ced5 to ad9a444
Behind the scenes Clippy uses compiletest-rs, which doesn't support the --json flag we added to Rust's compiletest.
I noticed this removes the output from Cargo and places it at the end. For example, it now shows:
instead of:
Because it delays writing stderr until the end. That can make it difficult to see which test suite is running at any one time. It also removes any sense of progress, as you just see it hang while Cargo is building things (no progress bar, no "Compiling" messages, etc.). Is it at all possible to process both stderr and stdout at the same time? Another concern is that delaying reading stderr could potentially deadlock. It's not too hard to generate enough data on stderr to fill the pipe buffer (particularly on platforms with smaller buffer sizes). For example, running … If you run with something like …
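To illustrate the deadlock concern, here is a minimal sketch (not the bootstrap code) that drains stderr on a separate thread while stdout is read line by line. If stderr were only read after the child exits, a child producing lots of stderr could fill the pipe buffer and block while the parent is still waiting on stdout.

```rust
use std::io::{BufRead, BufReader, Read};
use std::process::{Command, Stdio};
use std::thread;

fn main() -> std::io::Result<()> {
    let mut child = Command::new("cargo")
        .arg("test")
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Drain stderr concurrently so the child never blocks on a full pipe.
    let mut stderr = child.stderr.take().unwrap();
    let stderr_thread = thread::spawn(move || {
        let mut buf = String::new();
        stderr.read_to_string(&mut buf).ok();
        buf
    });

    // Meanwhile, process stdout line by line as it arrives.
    for line in BufReader::new(child.stdout.take().unwrap()).lines() {
        println!("{}", line?);
    }

    // Only now emit the captured stderr, after stdout is fully consumed.
    let captured_stderr = stderr_thread.join().unwrap();
    eprint!("{captured_stderr}");
    child.wait()?;
    Ok(())
}
```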
I made the change to show the standard error only at the end because without it the message compiletest displays when something fails gets interleaved with the failure message. I agree with you that that doesn't work; I'll try a different approach.
Right, forgot about …
Force-pushed from 0faacca to 4958272
So, to reiterate the problem with stderr: compiletest emits a message to stderr at the end of a failing run, and that message would always be interleaved. The solutions that came to mind were:
I went with option 4; let me know if you want a different approach to be used, or if there are approaches I didn't consider.
An alternative to option 4 is to redirect stdout and stderr to the same pipe or file.
Wouldn't that have the same problem as option 2 (losing colors)?
Right. Unless you used a pseudo-terminal, but that is not easily portable to Windows.
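For reference, a hedged sketch of the "same file" alternative discussed above, using only `std::process`: both streams share one file handle, so ordering is preserved, but the child no longer sees a terminal and will typically drop colored output, which is exactly the drawback raised in this thread.

```rust
use std::fs::File;
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    // One file, cloned so stdout and stderr write to the same destination.
    let log = File::create("test-output.log")?;
    let status = Command::new("cargo")
        .arg("test")
        .stdout(Stdio::from(log.try_clone()?))
        .stderr(Stdio::from(log))
        .status()?;
    println!("cargo test exited with {status}");
    Ok(())
}
```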
We only lose colors on the "     Running tests/ppc.rs (obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/ppc-8d20debf1e8d2487)" line, right? Or on both that and the test output? I feel like that line not being colored is... fine, but then I typically don't find colors very useful anyway...
@Mark-Simulacrum we would lose:
Overall it looks like this should work.
It seems unfortunate to need to write such a large amount of code for rendering. I'm wondering if an alternate solution would be to extend `--logfile` to contain the information you want. I realize that would be a more difficult change to make, but perhaps something to pursue later.

To support that, I think there would need to be some mechanism to indicate the format for the logfile. I'm not sure if that is just the combination of `--format` and `--logfile`.

Another thing that would need to be addressed is having some sort of templated filename. Right now, `--logfile` will overwrite the file. That is a problem for running `cargo test` on something that runs multiple tests (like multiple integration tests, or something with doctests). Somehow one would need to be able to map the file to the kind of test being run. (Or have `--logfile` extend the file, and have extra information about what is being run.)
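For context, this is roughly what the flags being discussed look like when a compiled test binary is invoked directly; a hedged sketch in which the binary path is made up, and JSON output still requires `-Zunstable-options` on a nightly toolchain.

```rust
use std::process::Command;

fn main() -> std::io::Result<()> {
    // Hypothetical test binary path; libtest writes a log to mytests.json
    // in the format selected with --format.
    let status = Command::new("target/debug/deps/mytests-0123456789abcdef")
        .args(["-Zunstable-options", "--format", "json", "--logfile", "mytests.json"])
        .status()?;
    println!("test binary exited with {status}");
    Ok(())
}
```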
Force-pushed from c89f68d to e62c37a
☀️ Test successful - checks-actions
Finished benchmarking commit (6667682): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.

Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
…_output, r=pietroalbini

Bugfix: avoid panic on invalid json output from libtest

rust-lang#108659 introduces a custom test display implementation. It does so by using libtest to output json. The stdout is read and parsed; the code trims the line read and checks whether it starts with a `{` and ends with a `}`. If so, it concludes that it must be a json encoded `Message`. Unfortunately, this does not work in all cases:

- This assumes that tests running with `--nocapture` will never start and end lines with `{` and `}` characters.
- Output is generated by issuing multiple `write_message` [statements](https://github.com/rust-lang/rust/blob/master/library/test/src/formatters/json.rs#L33-L60), where only the last one issues a `\n`. This likely results in a race condition, as we see multiple json outputs on the same line when running tests for the `x86_64-fortanix-unknown-sgx` target:

```
10:21:04      Running tests/run-time-detect.rs (build/x86_64-unknown-linux-gnu/stage1-std/x86_64-fortanix-unknown-sgx/release/deps/run_time_detect-8c66026bd4b1871a)
10:21:04
10:21:04 running 1 tests
10:21:04 test x86_all ... ok
10:21:04      Running tests/thread.rs (build/x86_64-unknown-linux-gnu/stage1-std/x86_64-fortanix-unknown-sgx/release/deps/thread-ed5456a7d80a6193)
10:21:04 thread 'main' panicked at 'failed to parse libtest json output; error: trailing characters at line 1 column 135, line: "{ \"type\": \"suite\", \"event\": \"ok\", \"passed\": 1, \"failed\": 0, \"ignored\": 0, \"measured\": 0, \"filtered_out\": 0, \"exec_time\": 0.000725911 }{ \"type\": \"suite\", \"event\": \"started\", \"test_count\": 1 }\n"', render_tests.rs:108:25
```

This PR implements a partial fix by being much more conservative about what it asserts is a valid json encoded `Message`. This prevents panics, but still does not resolve the race condition. A discussion is needed about where this race condition comes from exactly and how it can best be avoided.

cc: `@jethrogb`, `@pietroalbini`
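To make that "more conservative" check concrete, here is a hedged sketch (not the actual `render_tests.rs` code, and assuming the `serde_json` crate is available): a line is only treated as a libtest message if it both looks like JSON and actually deserializes; everything else is forwarded as ordinary test output.

```rust
use serde_json::Value;

fn handle_line(line: &str) {
    let trimmed = line.trim();
    if trimmed.starts_with('{') && trimmed.ends_with('}') {
        if let Ok(message) = serde_json::from_str::<Value>(trimmed) {
            // A well-formed libtest message: handle it as structured data.
            println!("parsed libtest message: {message}");
            return;
        }
    }
    // Not a complete JSON message (e.g. --nocapture output, or two messages
    // accidentally sharing a line): pass it through untouched.
    println!("{line}");
}

fn main() {
    handle_line(r#"{ "type": "suite", "event": "started", "test_count": 1 }"#);
    handle_line("test x86_all ... ok");
}
```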
…huss

Validate `ignore` and `only` compiletest directive, and add human-readable ignore reasons

This PR adds strict validation for the `ignore` and `only` compiletest directives, failing if an unknown value is provided to them. Doing so uncovered 79 tests in `tests/ui` that had invalid directives, so this PR also fixes them.

Finally, this PR adds human-readable ignore reasons when tests are ignored due to `ignore` or `only` directives, like *"only executed when the architecture is aarch64"* or *"ignored when the operative system is windows"*. This was the original reason why I started working on this PR and rust-lang#108659, as we need both of them for Ferrocene.

The PR is a draft because the code is extremely inefficient: it calls `rustc --print=cfg --target $target` for every rustc target (to gather the list of allowed ignore values), which on my system takes between 4s and 5s, and performs a lot of allocations of constant values. I'll fix both of them in the coming days.

r? `@ehuss`
fix running Miri tests

This partially reverts rust-lang#108659 to fix rust-lang#110102: the Miri test runner does not support any flags; they are interpreted as filters instead, which leads to no tests being run. I have not checked whether any of the other test runners have trouble with these flags.

Cc `@pietroalbini` `@Mark-Simulacrum` `@jyn514`
…ynchronization, r=pietroalbini

Ensure test library issues json string line-by-line

rust-lang#108659 introduces a custom test display implementation. It does so by using libtest to output json. The stdout is read line by line and parsed. The code trims the line read and checks whether it starts with a `{` and ends with a `}`.

Unfortunately, there is a race condition in how json data is written to stdout. The `write_message` function calls `self.out.write_all` repeatedly to write a buffer that contains (partial) json data, or a new line. There is no lock around the `self.out.write_all` calls. Similarly, the `write_message` function itself is called with only partial json data. As these functions are called from concurrent threads, this may result in json data from different messages ending up on the same stdout line.

This PR avoids this by buffering the complete json data before issuing a single `self.out.write_all`. (rust-lang#109484 implemented a partial fix for this issue; it only avoids that failed json parsing would result in a panic.)

cc: `@jethrogb`, `@pietroalbini`
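The idea behind that fix can be sketched as follows (a simplified illustration, not the actual libtest patch): assemble the full JSON line, including the trailing newline, into one buffer and emit it with a single `write_all`, so that concurrent writers cannot interleave partial messages.

```rust
use std::io::Write;

// Build the whole JSON line in one buffer and write it atomically with
// respect to other single-write callers on the same stream.
fn write_message(out: &mut dyn Write, json: &str) -> std::io::Result<()> {
    let mut buf = String::with_capacity(json.len() + 1);
    buf.push_str(json);
    buf.push('\n');
    out.write_all(buf.as_bytes())
}

fn main() -> std::io::Result<()> {
    let mut out = std::io::stdout().lock();
    write_message(&mut out, r#"{ "type": "suite", "event": "started", "test_count": 1 }"#)
}
```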
The main goal of this PR is to include all tests executed in CI inside the build metrics JSON files. I need this for Ferrocene, and @Mark-Simulacrum expressed a desire to have this as well, to ensure all tests are executed at least once somewhere in CI.

Unfortunately, implementing this required rewriting inside of bootstrap all of the code that renders the test output to the console. libtest supports outputting JSON instead of raw text, which we can indeed use to populate the build metrics. Doing that suppresses the console output though, and unlike rustc and Cargo, the console output is not included as a JSON field.
Because of that, this PR had to reimplement both the "pretty" format (one test per line, with `rust.verbose-tests = true`) and the "terse" format (the wall of dots, with `rust.verbose-tests = false`). The current implementation should have the exact same output as libtest, except for the benchmark output. libtest's benchmark output is broken in the "terse" format, so since that's our default I slightly improved how it's rendered.

Also, to bring parity with libtest I had to introduce support for coloring output from bootstrap, using the same dependencies `annotate-snippets` uses. It's now possible to use `builder.color_for_stdout(Color::Red, "text")` and `builder.color_for_stderr(Color::Green, "text")` across all of bootstrap, automatically respecting the `--color` flag and whether the stream is a terminal or not.

I recommend reviewing the PR commit-by-commit.
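As a rough illustration of what `builder.color_for_stdout` does conceptually, here is a hedged sketch using only the standard library; the real implementation lives in bootstrap, also honours the `--color` flag, and uses the same dependencies as `annotate-snippets`.

```rust
use std::io::{stdout, IsTerminal};

// Hypothetical helper: wrap `text` in an ANSI color only when stdout is a
// terminal. `color_code` is a raw SGR code, e.g. "31" for red.
fn color_for_stdout(color_code: &str, text: &str) -> String {
    if stdout().is_terminal() {
        format!("\x1b[{color_code}m{text}\x1b[0m")
    } else {
        text.to_string()
    }
}

fn main() {
    println!("{}", color_for_stdout("31", "error text"));
}
```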
r? @Mark-Simulacrum