Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use try_fold and try_rfold in default implementations of fold and rfold #106463

Closed
wants to merge 1 commit into from

Conversation

clarfonthey
Copy link
Contributor

@clarfonthey clarfonthey commented Jan 4, 2023

This means that specialised versions will only need to implement the try variant and will get a specialised non-try variant for free. That is, once the Try trait and co. are stable.

This can't add any extra branches due to the use of NeverShortCircuit, but the actual perf implications are still unclear.

@rustbot
Copy link
Collaborator

rustbot commented Jan 4, 2023

r? @m-ou-se

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 4, 2023
@rustbot
Copy link
Collaborator

rustbot commented Jan 4, 2023

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@the8472
Copy link
Member

the8472 commented Jan 5, 2023

This means that specialised versions will only need to implement the try variant

Specialized code can sometimes do better in fold than try_fold because it takes ownership of self and can thus forego updating fields in self and such things.

Additionally having the default implementation go through a layer of indirection is likely going to produce more IR and make debug builds slower.

Before starting a perf run to check I suggest using NeverShortCircuit which encodes the property that it never branches in the type.

@clarfonthey
Copy link
Contributor Author

Oh huh, I didn't realise that type existed. I'll use that instead.

@scottmcm
Copy link
Member

I've wanted this for ages -- last I tried was #90886 -- but it was removed in the past for compiler perf reasons. (try_fold still can't be implemented on stable, so it's not a huge problem yet, though personally I consider this change as something necessary before we stabilize implementing try_fold.) With something like rustc_must_implement_one_of we'll eventually be able to write just try_fold and have a full iterator.

Let's see what perf has to say this year:
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 14, 2023
@bors
Copy link
Contributor

bors commented Jan 14, 2023

⌛ Trying commit 23aeb351d1201823fa1943b428a7774610f58496 with merge b7a648a967f63ef1ec80554b526a613a0c30663a...

accum = f(accum, x);
}
accum
self.try_rfold(init, |accum, item| NeverShortCircuit(f(accum, item))).0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno if it'd impact perf, but there used to be a separate method for wrapping a function in NeverShortCircuit:

pub fn wrap_mut_2<A, B>(mut f: impl FnMut(A, B) -> T) -> impl FnMut(A, B) -> Self {

Copy link
Contributor Author

@clarfonthey clarfonthey Jan 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat, although honestly I think that it's not super worth having this since the alternative is so short. I feel like the current code looks a bit clearer than:

self.try_rfold(init, NeverShortCircuit.wrap_mut_2(f)).0

@bors
Copy link
Contributor

bors commented Jan 15, 2023

☀️ Try build successful - checks-actions
Build commit: b7a648a967f63ef1ec80554b526a613a0c30663a (b7a648a967f63ef1ec80554b526a613a0c30663a)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (b7a648a967f63ef1ec80554b526a613a0c30663a): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.2%, 1.2%] 20
Regressions ❌
(secondary)
1.3% [0.4%, 2.2%] 7
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.6% [0.2%, 1.2%] 20

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.8% [0.6%, 5.7%] 7
Regressions ❌
(secondary)
2.2% [2.2%, 2.2%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.8% [0.6%, 5.7%] 7

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.2% [1.2%, 1.2%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.2% [1.2%, 1.2%] 1

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jan 15, 2023
@clarfonthey
Copy link
Contributor Author

I merged in the latest master and chose to move the function inside the closure like the function you linked, @scottmcm. I dunno if folks are willing to just lend the resources of the benchmark builder to mess around with this a bit more, but right now I feel like the main choices are:

  1. Accept the perf penalty (I have a feeling many would object)
  2. Lend the resources to mess with the definition a bit. (I'm fine with messing with the code, but only if folks are comfortable with using the build resources. I also don't know of a good way to check the actual outputs the builds are comparing, but would be willing to investigate that too.)
  3. Just close this and wait until later.

Mostly @ ing you even though you're not on the libs team since you were enthusiastic about this, so feel free to defer to someone else if you think they should make the call instead.

@scottmcm
Copy link
Member

Dunno if it matters, but it's perfectly reasonable to find out:
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 18, 2023
@bors
Copy link
Contributor

bors commented Jan 18, 2023

⌛ Trying commit a444e7073fee53a59e5aa7fafbb43ab656b7aa34 with merge ca8c3e728f0ecf32ede2208caa30a6cd8013abe0...

@bors
Copy link
Contributor

bors commented Jan 18, 2023

☀️ Try build successful - checks-actions
Build commit: ca8c3e728f0ecf32ede2208caa30a6cd8013abe0 (ca8c3e728f0ecf32ede2208caa30a6cd8013abe0)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (ca8c3e728f0ecf32ede2208caa30a6cd8013abe0): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.2%, 1.2%] 20
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.3% [-1.3%, -1.3%] 1
All ❌✅ (primary) 0.6% [0.2%, 1.2%] 20

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.0% [0.6%, 9.4%] 6
Regressions ❌
(secondary)
4.8% [4.8%, 4.8%] 1
Improvements ✅
(primary)
-2.3% [-2.3%, -2.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.3% [-2.3%, 9.4%] 7

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.2% [1.0%, 1.5%] 6
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.2% [1.0%, 1.5%] 6

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 18, 2023
@clarfonthey
Copy link
Contributor Author

So, it does seem to actually make a difference switching to move, but it looks like the main regression is still there, which is in the syn crate. Might poke around a bit to see what exactly is causing that.

@clarfonthey
Copy link
Contributor Author

Okay, looking deeper into the actual benchmarks that regressed, the places I can find folds being used in those crates are mostly very simple ones-- stuff like summing up an iterator, concatenating strings, quote!(#x + #y), etc.

My gut feeling is that we're not there yet with the current Try work, at least enough to ensure there's no performance penalty here, but if someone more clever wants to step in and object to my findings, feel free.

@scottmcm
Copy link
Member

Let's see if anything's changed:
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 22, 2023
@bors
Copy link
Contributor

bors commented Mar 22, 2023

⌛ Trying commit 2514d02 with merge 17f07ebd5a676f394244d27d433376451f14ab97...

@bors
Copy link
Contributor

bors commented Mar 22, 2023

☀️ Try build successful - checks-actions
Build commit: 17f07ebd5a676f394244d27d433376451f14ab97 (17f07ebd5a676f394244d27d433376451f14ab97)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (17f07ebd5a676f394244d27d433376451f14ab97): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.4%, 1.4%] 13
Regressions ❌
(secondary)
0.8% [0.8%, 0.8%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.9% [0.4%, 1.4%] 13

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.4% [0.8%, 4.0%] 2
Regressions ❌
(secondary)
0.6% [0.4%, 0.8%] 2
Improvements ✅
(primary)
-3.0% [-3.7%, -2.3%] 2
Improvements ✅
(secondary)
-1.6% [-1.6%, -1.6%] 1
All ❌✅ (primary) -0.3% [-3.7%, 4.0%] 4

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.0% [0.9%, 1.1%] 3
Regressions ❌
(secondary)
0.5% [0.5%, 0.6%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.5% [-0.5%, -0.5%] 1
All ❌✅ (primary) 1.0% [0.9%, 1.1%] 3

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 22, 2023
@clarfonthey
Copy link
Contributor Author

Yeah, this is definitely getting worse. I'm going to just close this for now and I guess this can be a future improvement further down the line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants