-
Notifications
You must be signed in to change notification settings - Fork 13.8k
Improve the generic MIR in the default PartialOrd::le
and friends
#137904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
rustbot has assigned @workingjubilee. Use |
_12 = copy ((_11 as Some).0: std::cmp::Ordering); | ||
StorageLive(_13); | ||
_13 = discriminant(_12); | ||
_0 = Le(move _13, const 0_i8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using this implementation for is_le
means that we get an Le
in the MIR (and similarly is_gt
gives a Gt
, etc)
The previous matches!
implementation meant that this was
_14 = Eq(copy _13, const 1_i8);
_0 = Not(move _14);
which is much less obviously a <=
.
(That should also make these look like what you'd get in C++ for things like (a <=> b) <= 0
, https://en.cppreference.com/w/cpp/utility/compare/strong_ordering#Comparisons, and thus make LLVM more likely to continue getting them right since it's a code pattern that clang uses too:
https://github.com/llvm/llvm-project/blob/60486292b79885b7800b082754153202bef5b1f0/libcxx/include/__compare/is_eq.h#L26 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. Do we have a MIR transform that would normalize that...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because we don't have an InstCombine system.
GVN can't do it for two reasons:
_11
has two initializers, so it's ignored by allSsaLocals
-based transforms.- There's no
_n = Le(…, …)
in the program, and GVN can only replace things with locals or constants, not new never-seen-beforeRvalue
s.
EDIT: err, I replied about making a Le
, but you probably meant why we can't it can't be _0 = Ne(copy _13, const 1_i8)
. So only the latter reason applies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see.
4c41a72
to
fcb91b8
Compare
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
_13 = (copy _12,); | ||
_0 = <fn(std::cmp::Ordering) -> bool {std::cmp::Ordering::is_le} as FnOnce<(std::cmp::Ordering,)>>::call_once(std::cmp::Ordering::is_le, move _13) -> [return: bb4, unwind unreachable]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops.
_12 = copy ((_11 as Some).0: std::cmp::Ordering); | ||
StorageLive(_13); | ||
_13 = discriminant(_12); | ||
_0 = Le(move _13, const 0_i8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. Do we have a MIR transform that would normalize that...?
weird. |
@bors r- pls do not r+ a try build |
Wait until rust-lang-ci@5c944f0 is done |
Also probably wait until perf is done too? Doesn't seem like there's a good reason to merge something that attempts to fix perf without knowing if it actually fixes perf? |
@compiler-errors Oh, sorry, didn't see the try. |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (5c944f0): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary 0.5%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary 0.8%, secondary -3.5%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary -0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 773.437s -> 771.311s (-0.27%) |
hm. the regression specifically on cargo seems odd to me. |
This looks like it's basically undoing the changes from #137796 (comment), as expected. Cargo improved in that one, worsened in this one. Cranelift, nalgebra, and webrender all regressed in doc in that one, then improved in this one. But because this affects mir it affects inlining, so I think the cargo change is mostly in churn -- the codegen schedule is changed up, not just scaled by 1%. You can also see the effect of the better generic MIR for this in the bunch of debug cases that get smaller (by a very small amount) in binary size, without other perf impact to those cases. So unless you're opposed, @workingjubilee, I'm inclined to merge under #137904 (comment) |
No icount change in the rollup, as expected, but something was still a size change, so I'm curious @rust-timer build e87d9fc |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (e87d9fc): comparison URL. Overall result: ❌ regressions - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary -0.5%, secondary -3.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResults (primary 0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 765.289s -> 764.761s (-0.07%) |
It looks like I regressed this accidentally in #137197 due to #137901
So this PR does two things:
Tweaks the way we're callingThanks to some quick work from C-E this became no longer necessary.is_some_and
so that it optimizes in the generic MIR (rather than needing to optimize it in every monomorphization) -- the first commit adds a MIR test, so you can see the difference in the second commit.is_le
and friends to be slightly simpler, and parallel how clang does them.