-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not deduplicate diagnostics in UI tests #67122
Conversation
@rust-lang/wg-diagnostics @rust-lang/compiler |
I'm only worried about adding more divergence between the testing stderr output and what users see. Would it be reasonable to make this a |
This seems like a great idea to me.
Imo the default should be to not de-duplicate since, at least in my view, the primary function of tests is to ensure correctness and not diagnostics quality. Where there are specific tests which look at diagnostics quality they can use a flag. |
@Centril that's fair, having the opposite default in tests, and have the tests opt-in to the deduplication would be fine. |
Hmm, I'm torn. I find it very useful to get an accurate picture of what the diagnostic output is for every test -- and I think it's been quite useful from time to time in catching unexpected regressions in diagnostic quality. However, most of those regressions would be caught with or without this de-duplication. And, as @petrochenkov pointed out, there may be other sorts of regressions (running things twice etc) that we would catch this way. (Though perhaps also by monitoring perf.) One question on the PR, does this mean we have to add corresponding In any case, I guess this is the key point:
I'm not sure if this is true, I can see an argument for it, but I can imagine there are some examples where it is very hard to suppress duplicates, particularly given the query infrastructure (i.e., the desire to compute separate things independently without unnecessary coordination). I guess that if it is a bug for this mechanism to kick in (which doesn't quite seem to be what we're suggesting) then this definitely makes sense. I think what we're saying is more like "it's ok to rely on this mechanism, but it's still useful to detect cases where it kicks in and make them visible"? Anway, I don't really have a strong opinion one way or the other about the change itself, so long as we're not calling all of these cases "bugs" (that part I am not yet convinced of). |
☔ The latest upstream changes (presumably #67268) made this pull request unmergeable. Please resolve the merge conflicts. |
Relevant Zulip discussion - https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/weekly.20meeting.202019-12-19.20.2354818/near/183854498. The conclusion seems to be to proceed with this PR as is, with I was going to add a separate option like |
On the other hand, tying this to |
Blocked on #67709. |
Introduce an option for disabling deduplication of diagnostics With the intent of using it in UI tests (rust-lang#67122). The option is boolean (`-Z deduplicate-diagnostics=yes/no`) and can be specified multiple times with later values overriding earlier values (`-Z deduplicate-diagnostics=no -Z deduplicate-diagnostics=yes` == `-Z deduplicate-diagnostics=yes`), so it can be set in a hierarchical way, e.g. UI testing infra may disable the deduplication by default with specific tests being able to enable it back.
-Z ui-testing
mode
Updated. |
Looks like the worst offenders are:
Some cases are legitimately different errors, just reported for the same span, usually due to macros. |
@petrochenkov the changes look good to me, but I'm a bit uneasy about silent regressions in the deduplication. Could you take a test for each of those three identified cases and make them deduplicate? (I think for one of them it is already the case.) That way the majority of the tests will not hide the duplication but we'll have a way to see if we inadvertently regress. r=me with that change. |
@bors r=estebank |
📌 Commit d426771 has been approved by |
Do not deduplicate diagnostics in UI tests Error reporting infrastructure deduplicates identical diagnostics with identical spans. While it's preferable to do this in "release"/"user-facing" mode, it sometimes brings [confusion](rust-lang#50682 (comment)) and hides details that may be important during development. Do we run some passes multiple times when we could do it once? How many times we run them exactly? Can this number be large? Can the multiplied error construction be expensive? Can speculative checks be made cheaper if they don't report errors? *Relying* on this mechanism to deduplicate some specific error never looks like a proper solution to me personally. In this PR I attempt to disable this deduplication by applying `-Z deduplicate-diagnostics=no` to UI tests.
Failed in #68044 (comment), @bors r- |
@bors r=estebank |
📌 Commit b82cd9f has been approved by |
Do not deduplicate diagnostics in UI tests Error reporting infrastructure deduplicates identical diagnostics with identical spans. While it's preferable to do this in "release"/"user-facing" mode, it sometimes brings [confusion](rust-lang#50682 (comment)) and hides details that may be important during development. Do we run some passes multiple times when we could do it once? How many times we run them exactly? Can this number be large? Can the multiplied error construction be expensive? Can speculative checks be made cheaper if they don't report errors? *Relying* on this mechanism to deduplicate some specific error never looks like a proper solution to me personally. In this PR I attempt to disable this deduplication by applying `-Z deduplicate-diagnostics=no` to UI tests.
Rollup of 10 pull requests Successful merges: - #66254 (Make Layout::new const) - #67122 (Do not deduplicate diagnostics in UI tests) - #67358 (Add HashSet::get_or_insert_owned) - #67725 (Simplify into_key_slice_mut) - #67935 (Relax the Sized bounds on Pin::map_unchecked(_mut)) - #67967 (Delay bug to prevent ICE in MIR borrowck) - #67975 (Export public scalar statics in wasm) - #68006 (Recognise riscv64 in compiletest) - #68040 (Cleanup) - #68054 (doc: add Null-unchecked version section to mut pointer as_mut method) Failed merges: - #67258 (Introduce `X..`, `..X`, and `..=X` range patterns) r? @ghost
Error reporting infrastructure deduplicates identical diagnostics with identical spans.
While it's preferable to do this in "release"/"user-facing" mode, it sometimes brings confusion and hides details that may be important during development.
Do we run some passes multiple times when we could do it once?
How many times we run them exactly? Can this number be large? Can the multiplied error construction be expensive? Can speculative checks be made cheaper if they don't report errors?
Relying on this mechanism to deduplicate some specific error never looks like a proper solution to me personally.
In this PR I attempt to disable this deduplication by applying
-Z deduplicate-diagnostics=no
to UI tests.