Remove manual unrolling from slice::Iter(Mut)::try_fold #64600
@bors try @rust-timer queue |
Awaiting bors try build completion |
[DO NOT MERGE] Experiment with removing unrolling from slice::Iter::try_fold For context see #64572 (comment) r? @scottmcm
☀️ Try build successful - checks-azure |
Queued dd115ba with parent eceec57, future comparison URL. |
Finished benchmarking try commit dd115ba, comparison URL. |
This change gets roughly half the improvements that the commit in #64572 gets. |
I think that unrolling would eventually have to go and be removed from libcore; I was just hoping the compiler would catch up and be able to unroll loops with multiple exits itself. Unrolling should ideally belong to the compiler, so it can make the decision about when to duplicate code. I haven't revisited that, so for all I know LLVM could have learned this by now. [Edit: checked -- rustc nightly does not unroll such things by itself right now either. I wonder if this multiple exit improvement means that things are on the way..? No clue] This seems like a situation where it's easy to find both good and bad cases. Things like scans through bytes for parsing with a simple predicate benefit a lot from unrolling in all/find etc. |
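To make "unrolling a loop with multiple exits" concrete, here is a minimal sketch of an `all`-style byte scan written both ways. The function names `all_simple` and `all_unrolled` are hypothetical, and this is not the code the PR removes from libcore; it only illustrates the shape of such a loop, where every element is a possible early return.

```rust
/// Straightforward scan: one element and one early-exit check per iteration.
pub fn all_simple(bytes: &[u8], pred: impl Fn(u8) -> bool) -> bool {
    for &b in bytes {
        if !pred(b) {
            return false;
        }
    }
    true
}

/// Hand-unrolled by four: each trip through the main loop contains four
/// separate early exits, which is the "multiple exits" shape discussed above.
pub fn all_unrolled(bytes: &[u8], pred: impl Fn(u8) -> bool) -> bool {
    let mut chunks = bytes.chunks_exact(4);
    for chunk in &mut chunks {
        if !pred(chunk[0]) {
            return false;
        }
        if !pred(chunk[1]) {
            return false;
        }
        if !pred(chunk[2]) {
            return false;
        }
        if !pred(chunk[3]) {
            return false;
        }
    }
    // Fewer than four elements are left over; check them one at a time.
    for &b in chunks.remainder() {
        if !pred(b) {
            return false;
        }
    }
    true
}
```

Both functions return the same result for any input; whether the unrolled form is actually faster depends on the predicate and the target, which is the trade-off this thread is about.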
While this definitely helps sometimes (particularly for trivial closures), it's also a pessimization sometimes, so it's better to leave this to (hypothetical) future LLVM improvements instead of forcing this on everyone. I think it's better for the advice to be that sometimes you need to unroll manually than you sometimes need to not-unroll manually (like rust-lang#64545).
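For readers less familiar with the method being changed, here is a minimal sketch of what `try_fold` does from the caller's side: it folds until the closure reports failure, so each element is a potential early exit. The helper name `checked_sum` is hypothetical, and the closure is deliberately trivial, the kind of case where unrolling tended to help.

```rust
/// Sums a slice of `i8`, stopping at the first addition that would overflow.
pub fn checked_sum(values: &[i8]) -> Option<i8> {
    // `checked_add` returns `None` on overflow, which makes `try_fold`
    // stop iterating and return `None` immediately.
    values.iter().try_fold(0i8, |acc, &x| acc.checked_add(x))
}
```

With this sketch, `checked_sum(&[10, 20, 30])` completes the fold, while `checked_sum(&[100, 100, 1])` stops at the second element without visiting the third.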
38d8c8d to 92e91f7
Ok, it seems like the inclination is that we should do this so I've turned this into a "real" PR. I do think it's better for the advice to be that sometimes you need to unroll manually than you sometimes need to not-unroll manually, though I certainly wish LLVM were better at these cases. I'm not sure who should approve this -- does it need |
Couldn't you just remove the |
572de05 to 2f7b32a
2f7b32a to 6ac64ab
@bors try @rust-timer queue (I'm curious to see the new self-profile results, and want to make sure that removing the overrides still keeps the gain here -- it might mean more work to eliminate |
Awaiting bors try build completion |
☀️ Try build successful - checks-azure |
Queued 8be3622 with parent 66bf391, future comparison URL. |
Finished benchmarking try commit 8be3622, comparison URL. |
Oh, interesting. |
New perf link (thank you, Mark-Simulacrum!) with self-profile results for both sides: It looks like nearly all of the speedup for And for |
This change looks good to me, but I guess we are waiting for some discussion. I'll try to ask @Geal about nom performance and unrolling. You know how much I would like to say we can just reimplement important stuff, like an unrolling slice iterator, outside libcore, but the libcore version is still tied up with unstable features like |
@bluss no issue for me, nom does not use |
It looks like it's not impossible for rustc to unroll an "Iterator::all" like loop. It just can't do it in the simplest forms that those loops take, for example not in I have some old alternative slice iterator code, and it can be automatically unrolled by the compiler. The code is here (github link to iter.rs) and there are benchmarks that show the unrolling on that specific branch. I haven't managed to reduce the loop that will unroll, though — maybe it's specific to the code in the benchmark? The compiler's unroll disappears if the special case |
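As a rough illustration of what an "`Iterator::all`-like loop" means here: to my understanding, the standard library builds short-circuiting adapters such as `all` and `find` on top of `try_fold`, so the slice iterator's `try_fold` is the loop the optimizer ultimately sees. The sketch below expresses `all` through the stable `try_fold` API with `Option` as the short-circuit type; the name `all_via_try_fold` is hypothetical and the standard library's internal version differs in detail.

```rust
/// An `all`-style check written in terms of `try_fold`.
/// `Some(())` means "keep going"; `None` stops the fold at the first failure.
pub fn all_via_try_fold<T>(items: &[T], mut pred: impl FnMut(&T) -> bool) -> bool {
    items
        .iter()
        .try_fold((), |(), item| if pred(item) { Some(()) } else { None })
        .is_some()
}
```

Because the predicate can end the loop at any element, an unrolled version of this loop necessarily has one exit per unrolled element.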
@bluss do you still have r+ here, or do I need to find a different reviewer for this? |
@scottmcm I guess I do, but with the nominated tag I thought we were waiting for the libs team |
@bors r+ rollup=never |
📌 Commit 6ac64ab has been approved by |
☀️ Test successful - checks-azure |
Final perf comparison: #64600 (comment)
For context see #64572 (comment)