Fix normalization overflow ICEs in monomorphization #146096
@bors try @rust-timer queue
…ono1, r=<try> Fix normalization overflow ICEs in monomorphization
Finished benchmarking commit (fb35892): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary 9.4%, secondary 6.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
Results (primary 11.3%, secondary 16.9%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 467.236s -> 465.852s (-0.30%)
@@ -1,3 +1,4 @@
//@ build-fail
//@ known-bug: #105937
//@ known-bug: #105937
right? Same elsewhere.
I have to make the check a query in order to cache its result for incremental build.
☔ The latest upstream changes (presumably #145717) made this pull request unmergeable. Please resolve the merge conflicts.
// may be expensive.
fn has_normalization_error_in_mono<'tcx>(tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> bool {
    let body = tcx.instance_mir(instance.def);
    body.local_decls.iter().any(|local| {
Why are you instantiating types seen in local_decls instead of the entire body? I'm worried that an error will sneak by this check because it's hidden somewhere else in the MIR body.
I did check the whole body at first. Then I realized that instantiation is not cached and worried that the computational cost is too high if the body is huge, so I tried to scale down the check.
Ofc this is just speculation, I'm not sure about the real cost.
This is done in a query, which should provide the requisite level of caching. Switch this to check the whole body and I can submit another perf run before merging.
Just one question ^ then I think this is good
if tcx.has_normalization_error_in_mono(instance) {
    let def_id = instance.def_id();
    let def_span = tcx.def_span(def_id);
    let def_path_str = tcx.def_path_str(def_id);
    tcx.dcx().emit_fatal(RecursionLimit {
        span: starting_item.span,
        instance,
        def_span,
        def_path_str,
    });
}
Should we move error reporting to the query itself, and make the query return unit?
If we do so, the query would be dependent on starting_item.span, which may pollute the cache unnecessarily?
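The caching concern above can be sketched with a toy model. This is not rustc's query system: `ToyQueryCache`, `check_and_report`, the string-based instance names and the "more than three `<`" rule are all hypothetical stand-ins. The point is only that when the cached check is keyed on the instance alone, the caller can still attach its own span to the diagnostic without adding entries to the cache.

```rust
use std::collections::HashMap;

/// Toy stand-in for a cached query keyed only on the instance (hypothetical).
struct ToyQueryCache {
    results: HashMap<String, bool>,
    evaluations: usize,
}

impl ToyQueryCache {
    fn new() -> Self {
        ToyQueryCache { results: HashMap::new(), evaluations: 0 }
    }

    /// Returns whether `instance` has a (pretend) normalization error.
    fn has_normalization_error(&mut self, instance: &str) -> bool {
        if let Some(&cached) = self.results.get(instance) {
            return cached;
        }
        self.evaluations += 1;
        // Assumption for the toy: deep generic nesting stands in for overflow.
        let result = instance.matches('<').count() > 3;
        self.results.insert(instance.to_string(), result);
        result
    }
}

/// The caller owns the span and builds the message, so the span never
/// becomes part of the cache key.
fn check_and_report(
    cache: &mut ToyQueryCache,
    instance: &str,
    starting_span: u32,
) -> Option<String> {
    if cache.has_normalization_error(instance) {
        Some(format!(
            "span {starting_span}: reached the recursion limit while instantiating `{instance}`"
        ))
    } else {
        None
    }
}
```

Calling `check_and_report` twice for the same instance with different spans yields two different messages but evaluates the underlying check once.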
2e360ea to a454317
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
a454317 to ade5b67
Without the starting span, users can only see the type of the failed instance and have to manually search and check each call to the function to determine whether the call introduces recursion. One example from test:
fn main() {
    recurse(std::iter::empty::<()>())
}
fn recurse(nums: impl Iterator) {
    if true { return }
    recurse(nums.skip(42).peekable())
    //~^ ERROR: reached the recursion limit while instantiating
}
Users can only see the compiler fails to instantiate
And sometimes the recursive function is not the same one as the failed instance, as in this example.
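To see why the `recurse` example never reaches a fixed point, it may help to look at the type each call would be instantiated with. The sketch below unrolls the recursion by hand instead of recursing; `AdapterDepth`, `step` and `depth_of` are hypothetical helpers introduced only for illustration and are not part of the test or of rustc.

```rust
use std::iter::{Empty, Peekable, Skip};

/// Hypothetical helper trait: counts how many adapters wrap the base iterator.
trait AdapterDepth {
    const DEPTH: usize;
}

impl<T> AdapterDepth for Empty<T> {
    const DEPTH: usize = 0;
}

impl<I: AdapterDepth> AdapterDepth for Skip<I> {
    const DEPTH: usize = I::DEPTH + 1;
}

impl<I: AdapterDepth + Iterator> AdapterDepth for Peekable<I> {
    const DEPTH: usize = I::DEPTH + 1;
}

/// One step of the recursion from the test, without the recursive call:
/// the argument type `I` becomes `Peekable<Skip<I>>`.
fn step<I: Iterator>(nums: I) -> Peekable<Skip<I>> {
    nums.skip(42).peekable()
}

/// Reads the adapter depth off the static type of the argument.
fn depth_of<T: AdapterDepth>(_: &T) -> usize {
    T::DEPTH
}
```

Applying `step` to `std::iter::empty::<()>()` gives a type with two adapters, applying it again gives four, and so on, so every recursive call in the original example would need a new, strictly larger instantiation.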
Could you make
Seems like another spurious rustdoc-gui test failure.
Nice, that looks like what I was imagining. I've restarted your PR CI jobs in the UI, and in the meantime also @bors try @rust-timer queue
…ono1, r=<try> Fix normalization overflow ICEs in monomorphization
Finished benchmarking commit (4df8dd0): comparison URL.
Overall result: ❌ regressions - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary -1.0%, secondary 1.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
Results (primary 2.4%, secondary -1.1%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 473.161s -> 475.157s (0.42%)
@adwinwhite Can you squash the commits down? I would just mash all the commits into one, but if you prefer some kind of logical separation go for it. Then I'll approve this.
d681e84 to 08f16a9
Done squashing. I prefer to squash/rebase after the review is complete too. :-)
The single scary-looking regression is specific to opt incr-full on projection-caching, and this doesn't break projection caching so I don't think it's worth worrying about. Beyond that there are a few tiny regressions which I think are all justified by the fact that this fixes/papers over a systemic source of ICEs.
@rustbot label: +perf-regression-triaged
@bors r+
☀️ Test successful - checks-actions
What is this?
This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing 2300c2a (parent) -> 36e4f5d (this PR)
Test differences
284 test diffs
Stage 1
Stage 2
Additionally, 240 doctest diffs were found. These are ignored, as they are noisy.
Job group index
Test dashboard
Run
cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 36e4f5d1fe1d63953a5bf1758ce2b64172623e2e --output-dir test-dashboard
And then open
Job duration changes
How to interpret the job duration changes?
Job durations can vary a lot, based on the actual runner instance
Finished benchmarking commit (36e4f5d): comparison URL.
Overall result: ❌ regressions - please read the text below
Our benchmarks found a performance regression caused by this PR.
Next Steps:
@rustbot label: +perf-regression
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary -3.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
Results (secondary 7.5%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 472.507s -> 473.739s (0.26%)
}

let mut checker = NormalizationChecker { tcx, instance };
if body.visit_with(&mut checker).is_break() { Err(NormalizationErrorInMono) } else { Ok(()) }
The PR description says
I first tried to check the whole MIR body's normalization and references_error. (As elaborate_drop handles normalization failure by returning ty::Error.)
It turns out that checking all Locals seems sufficient.
but this seems to check all types the visitor meets, not just locals?
Even then I wonder if this is enough, e.g. the types of struct fields can require further normalization (that might then overflow) even after the struct type itself has been normalized.
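The difference between checking only locals and visiting the whole body, which comes up twice in this review, can be illustrated with a toy model built on `std::ops::ControlFlow`. Nothing here is rustc's API: `ToyTy`, `ToyBody`, `visit_all` and the two `check_*` functions are hypothetical, and the `NormalizationErrorInMono` struct merely reuses the name from the snippet above. The sketch only shows that a check restricted to one part of a body can miss an error stored elsewhere in it.

```rust
use std::ops::ControlFlow;

/// Same name as in the reviewed snippet, but a toy definition.
#[derive(Debug, PartialEq)]
struct NormalizationErrorInMono;

/// Toy type: either normalizes fine or overflows (hypothetical).
#[derive(Clone, Copy)]
enum ToyTy {
    Normalizes,
    Overflows,
}

/// Toy body: types of locals plus types that appear elsewhere in the body.
struct ToyBody {
    local_tys: Vec<ToyTy>,
    other_tys: Vec<ToyTy>,
}

/// Stops at the first type that fails to normalize.
fn visit_all(tys: &[ToyTy]) -> ControlFlow<()> {
    for &ty in tys {
        match ty {
            ToyTy::Normalizes => {}
            ToyTy::Overflows => return ControlFlow::Break(()),
        }
    }
    ControlFlow::Continue(())
}

/// Only looks at the types of locals.
fn check_locals_only(body: &ToyBody) -> Result<(), NormalizationErrorInMono> {
    if visit_all(&body.local_tys).is_break() { Err(NormalizationErrorInMono) } else { Ok(()) }
}

/// Looks at every type the toy body contains.
fn check_whole_body(body: &ToyBody) -> Result<(), NormalizationErrorInMono> {
    let broke = visit_all(&body.local_tys).is_break() || visit_all(&body.other_tys).is_break();
    if broke { Err(NormalizationErrorInMono) } else { Ok(()) }
}
```

With a body whose only overflowing type sits in `other_tys`, `check_locals_only` returns `Ok(())` while `check_whole_body` returns the error.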
Fixes #92004
Fixes #92470
Fixes #95134
Fixes #105275
Fixes #105937
Fixes #117696-2
Fixes #118590
Fixes #122823
Fixes #131342
Fixes #139659
Analysis:
The causes of these issues are similar. They contain generic recursive functions that can be instantiated with different args infinitely at the monomorphization stage.
Ideally this should be caught by the check_recursion_limit function. The reality is that normalization can reach the recursion limit earlier than monomorphization's check because they calculate depths in different ways.
Since normalization is called everywhere, ICEs appear in different locations.
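As a purely illustrative sketch of how two depth measures can disagree, consider the toy below. It does not reproduce how rustc computes either depth; the shared `LIMIT`, the two measures and the factor of two are all invented for the example. It only shows that if one check counts recursive calls and another counts something that grows faster per call, the second one reaches the same limit first.

```rust
/// Hypothetical limit shared by both toy checks.
const LIMIT: usize = 128;

/// Toy measure 1: number of recursive instantiations made so far.
fn call_depth(step: usize) -> usize {
    step
}

/// Toy measure 2: nesting depth of the instantiated type, assuming each
/// call wraps the argument type in two more layers.
fn type_nesting(step: usize) -> usize {
    2 * step
}

/// First recursion step at which `measure` exceeds `limit`.
fn first_step_over(limit: usize, measure: fn(usize) -> usize) -> usize {
    (0..).find(|&step| measure(step) > limit).unwrap()
}
```

Under these assumptions, `first_step_over(LIMIT, type_nesting)` is 65 while `first_step_over(LIMIT, call_depth)` is 129, so the faster-growing measure trips well before the call counter does.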
Fix:
If we abort on overflow with TypingMode::PostAnalysis in the trait solver, it would also catch these errors.
The main challenge is providing good diagnostics for them. So it's quite natural to put the check right before these normalizations happen.
I first tried to check the whole MIR body's normalization and references_error. (As elaborate_drop handles normalization failure by returning ty::Error.)
It turns out that checking all Locals seems sufficient.
These types are gonna be normalized anyway. So with cache, these checks shouldn't be expensive.
This fixes these ICEs for both the next and old solver, though I'm not sure the change I made to the old solver is proper. Its overflow handling looks convoluted, so I didn't try to fix it more "upstream".