-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Permit the MIR inliner to inline diverging functions #106428
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit e4ddc8a0f6cac4ea10db08d70f4d15cfd3a5ecaf with merge 696235b229d3d1166bd68a49240f1d7880b2ff2d... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (696235b229d3d1166bd68a49240f1d7880b2ff2d): comparison URL. Overall result: ❌✅ regressions and improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
Which codegen test is this? |
This function in particular: rust/src/test/codegen/unchecked_shifts.rs Line 19 in 5c18bc6
|
Do you happen to have the error log (or new output) for that one?
Can at least answer that one: This is a |
Ah, found the file: unchecked_shifts.txt
|
@nikic I think that I don't have a strong opinion on how the LLVM-level UB-ness for that is preserved, just that it's still there. If it ending up producing a But I'm slightly concerned that this means that everthing using Would it perhaps be worth a mir-transform test that EDIT: Oh, wait, out-of-bounds shift is Aside: reason number 2947643 that I want |
9fd6e31
to
0dd6e02
Compare
I've rebased this up since #106294 is now merged. By default, the |
I could have sworn we ensure that we emit one Turning off MIR inlining also causes the |
Don't block this PR on that one assume. Do whatever you need for that codegen test; I can always update the implementation of Just so long as you're convinced that this additional inlining doesn't regress people using |
0dd6e02
to
f8d0b47
Compare
I looked for examples of other codegen in the standard library which depends on I'm going to try hunting for examples in the ecosystem. |
This comment has been minimized.
This comment has been minimized.
_2 = move _1; // scope 0 at $DIR/unwrap_unchecked.rs:+1:5: +1:8 | ||
StorageLive(_3); // scope 0 at $DIR/unwrap_unchecked.rs:+1:9: +1:27 | ||
_4 = discriminant(_2); // scope 1 at $SRC_DIR/core/src/option.rs:LL:COL | ||
switchInt(move _4) -> [0: bb1, 1: bb3, otherwise: bb2]; // scope 1 at $SRC_DIR/core/src/option.rs:LL:COL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As all other targets are unreachable, we could InstCombine this pattern into:
_5 = Eq(_4, const 1_size);
Assume(_5);
goto -> bb3
Would this help LLVM ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My instinct is that we should keep the "otherwise unreachable" part so LLVM knows that _4
is definitely 1
, and can optimize backwards using that.
Thus perhaps something like
switchInt(move _4) -> [0: bb1, 1: bb3, otherwise: bb2]; // scope 1 at $SRC_DIR/core/src/option.rs:LL:COL | |
switchInt(move _4) -> [1: bb3, otherwise: bb2]; // scope 1 at $SRC_DIR/core/src/option.rs:LL:COL |
might be best?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that transformation is accomplished by DeduplicateBlocks, which I noted back here: #106428 (comment) though I'll admit I was a bit vague.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DeduplicateBlocks
can be a heavy hammer, since it hashes basic blocks. Detecting unreachable blocks is trivial, since we only need to test the terminator. That's why I suggested modifying InstCombine
. SimplifyCfg
could do the trick too.
About the final solution: either the assume or the reduced SwitchInt, I don't mind, as long as LLVM understands.
f8d0b47
to
4b1a179
Compare
@bors try @rust-timer queue |
☀️ Test successful - checks-actions |
Finished benchmarking commit (2420bd3): comparison URL. Overall result: ❌✅ regressions and improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. |
The job Click to see the possible cause of the failure (guessed by this bot)
|
…illot Deduplicate unreachable blocks, for real this time In rust-lang#106428 (in particular rust-lang@41eda69) we noticed that inlining `unreachable_unchecked` can produce duplicate unreachable blocks. So we improved two MIR optimizations: `SimplifyCfg` was given a simplify to deduplicate unreachable blocks, then `InstCombine` was given a combiner to deduplicate switch targets that point at the same block. The problem is that change doesn't actually work. Our current pass order is ``` SimplifyCfg (does nothing relevant to this situation) Inline (produces multiple unreachable blocks) InstCombine (doesn't do anything here, oops) SimplifyCfg (produces the duplicate SwitchTargets that InstCombine is looking for) ``` So in here, I have factored out the specific function from `InstCombine` and placed it inside the simplify that produces the case it is looking for. This should ensure that it runs in the scenario it was designed for. Fixes rust-lang#110551 r? `@cjgillot`
Deduplicate unreachable blocks, for real this time In rust-lang/rust#106428 (in particular rust-lang/rust@41eda69) we noticed that inlining `unreachable_unchecked` can produce duplicate unreachable blocks. So we improved two MIR optimizations: `SimplifyCfg` was given a simplify to deduplicate unreachable blocks, then `InstCombine` was given a combiner to deduplicate switch targets that point at the same block. The problem is that change doesn't actually work. Our current pass order is ``` SimplifyCfg (does nothing relevant to this situation) Inline (produces multiple unreachable blocks) InstCombine (doesn't do anything here, oops) SimplifyCfg (produces the duplicate SwitchTargets that InstCombine is looking for) ``` So in here, I have factored out the specific function from `InstCombine` and placed it inside the simplify that produces the case it is looking for. This should ensure that it runs in the scenario it was designed for. Fixes rust-lang/rust#110551 r? `@cjgillot`
Deduplicate unreachable blocks, for real this time In rust-lang/rust#106428 (in particular rust-lang/rust@41eda69) we noticed that inlining `unreachable_unchecked` can produce duplicate unreachable blocks. So we improved two MIR optimizations: `SimplifyCfg` was given a simplify to deduplicate unreachable blocks, then `InstCombine` was given a combiner to deduplicate switch targets that point at the same block. The problem is that change doesn't actually work. Our current pass order is ``` SimplifyCfg (does nothing relevant to this situation) Inline (produces multiple unreachable blocks) InstCombine (doesn't do anything here, oops) SimplifyCfg (produces the duplicate SwitchTargets that InstCombine is looking for) ``` So in here, I have factored out the specific function from `InstCombine` and placed it inside the simplify that produces the case it is looking for. This should ensure that it runs in the scenario it was designed for. Fixes rust-lang/rust#110551 r? `@cjgillot`
This heuristic prevents inlining of
hint::unreachable_unchecked
, which in turn makesOption/Result::unwrap_unchecked
a bad inlining candidate. I looked through the changes tocore
,alloc
,std
, andhashbrown
by hand and they all seem reasonable. Let's see how this looks in perf...Based on rustc-perf it looks like this regresses ctfe-stress, and the cachegrind diff indicates that this regression is in
InterpCx::statement
. I don't know how to do any deeper analysis because that function is enormous in the try toolchain, which has no debuginfo in it. And a local build produces significantly different codegen for that function, even with LTO.