Use try_fold and try_rfold in default implementations of fold and rfold #106463
r? @m-ou-se (rustbot has picked a reviewer for you, use r? to override)
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
Specialized code can sometimes do better in Additionally, having the default implementation go through a layer of indirection is likely going to produce more IR and make debug builds slower. Before starting a perf run to check, I suggest using NeverShortCircuit, which encodes the property that it never branches in the type.
Oh huh, I didn't realise that type existed. I'll use that instead.
I've wanted this for ages -- last I tried was #90886 -- but it was removed in the past for compiler perf reasons. Let's see what perf has to say this year:
⌛ Trying commit 23aeb351d1201823fa1943b428a7774610f58496 with merge b7a648a967f63ef1ec80554b526a613a0c30663a...
accum = f(accum, x);
}
accum
self.try_rfold(init, |accum, item| NeverShortCircuit(f(accum, item))).0
Dunno if it'd impact perf, but there used to be a separate method for wrapping a function in NeverShortCircuit:
rust/library/core/src/ops/try_trait.rs
Line 463 in 0b2f717
pub fn wrap_mut_2<A, B>(mut f: impl FnMut(A, B) -> T) -> impl FnMut(A, B) -> Self {
This is neat, although honestly I think that it's not super worth having this since the alternative is so short. I feel like the current code looks a bit clearer than:
self.try_rfold(init, NeverShortCircuit::wrap_mut_2(f)).0
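For readers unfamiliar with the helper being discussed, here is a minimal sketch of what such a wrapper does. The real NeverShortCircuit is private to core, so the type below is a local stand-in written for illustration only; the wrap_mut_2 signature is copied from the line quoted above, and its body is an assumption.

```rust
/// Local stand-in for the private `core` type; illustration only.
pub struct NeverShortCircuit<T>(pub T);

impl<T> NeverShortCircuit<T> {
    /// Turns a two-argument closure returning `T` into one returning
    /// `NeverShortCircuit<T>`. The body here is an assumption; only the
    /// signature comes from the thread.
    pub fn wrap_mut_2<A, B>(mut f: impl FnMut(A, B) -> T) -> impl FnMut(A, B) -> Self {
        move |a, b| NeverShortCircuit(f(a, b))
    }
}
```

Calling `NeverShortCircuit::wrap_mut_2(|acc: i32, x: i32| acc + x)` yields a closure whose result is unwrapped with `.0`, which is the same shape as the inline `|accum, item| NeverShortCircuit(f(accum, item))` closure in the diff.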
☀️ Try build successful - checks-actions
Finished benchmarking commit (b7a648a967f63ef1ec80554b526a613a0c30663a): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
I merged in the latest master and chose to move the function inside the closure like the function you linked, @scottmcm. I dunno if folks are willing to just lend the resources of the benchmark builder to mess around with this a bit more, but right now I feel like the main choices are:
Mostly @-ing you even though you're not on the libs team since you were enthusiastic about this, so feel free to defer to someone else if you think they should make the call instead.
Dunno if it matters, but it's perfectly reasonable to find out:
⌛ Trying commit a444e7073fee53a59e5aa7fafbb43ab656b7aa34 with merge ca8c3e728f0ecf32ede2208caa30a6cd8013abe0...
☀️ Try build successful - checks-actions
Finished benchmarking commit (ca8c3e728f0ecf32ede2208caa30a6cd8013abe0): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
So, it does seem to actually make a difference switching to
Okay, looking deeper into the actual benchmarks that regressed, the places I can find folds being used in those crates are mostly very simple ones: stuff like summing up an iterator, concatenating strings, My gut feeling is that we're not there yet with the current
a444e70 to 2514d02
Let's see if anything's changed:
⌛ Trying commit 2514d02 with merge 17f07ebd5a676f394244d27d433376451f14ab97...
☀️ Try build successful - checks-actions
Finished benchmarking commit (17f07ebd5a676f394244d27d433376451f14ab97): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Yeah, this is definitely getting worse. I'm going to just close this for now and I guess this can be a future improvement further down the line.
This means that specialised versions will only need to implement the try variant and will get a specialised non-try variant for free. That is, once the Try trait and co. are stable.
This can't add any extra branches due to the use of NeverShortCircuit, but the actual perf implications are still unclear.
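The idea in this description can be approximated on stable Rust. The sketch below is not the code from this PR: it uses Result<B, Infallible> as a stand-in for the private NeverShortCircuit type, since its error type has no values and so the fold can never exit early. The function names fold_via_try_fold and rfold_via_try_rfold are hypothetical.

```rust
use std::convert::Infallible;

/// Hypothetical free-function version of `fold`, expressed through `try_fold`.
/// `Result<B, Infallible>` stands in for `NeverShortCircuit<B>`.
pub fn fold_via_try_fold<I, B, F>(mut iter: I, init: B, mut f: F) -> B
where
    I: Iterator,
    F: FnMut(B, I::Item) -> B,
{
    let result: Result<B, Infallible> = iter.try_fold(init, |acc, item| Ok(f(acc, item)));
    match result {
        Ok(value) => value,
        // `Infallible` has no values, so this arm can never run.
        Err(never) => match never {},
    }
}

/// The same idea for `rfold`, going through `try_rfold`.
pub fn rfold_via_try_rfold<I, B, F>(mut iter: I, init: B, mut f: F) -> B
where
    I: DoubleEndedIterator,
    F: FnMut(B, I::Item) -> B,
{
    let result: Result<B, Infallible> = iter.try_rfold(init, |acc, item| Ok(f(acc, item)));
    match result {
        Ok(value) => value,
        Err(never) => match never {},
    }
}
```

With this shape, an iterator that provides a fast try_fold would pass that speed on to fold without a second hand-written loop. Whether the extra layer costs compile time is the question the perf runs in this thread were checking.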