Make [u8]::cmp implementation branchless #93962
(rust-highfive has picked a reviewer for you, use r? to override)
Also shaves off a few instructions on short arrays. https://godbolt.org/z/hGzb5aKPq
@rustbot label T-libs
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 71e00a43760178698ab4431670de13ea0cfbfcee with merge 582fba747804a832de267609ab4837f67d268f3a...
☀️ Try build successful - checks-actions
Queued 582fba747804a832de267609ab4837f67d268f3a with parent 52dd59e, future comparison URL.
Finished benchmarking commit (582fba747804a832de267609ab4837f67d268f3a): comparison url.
Summary: This benchmark run did not return any relevant results.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.
@bors rollup=never
r=me with commits squashed
Force-pushed: 71e00a4 to 3960ce6
@bors r+ rollup
📌 Commit 3960ce6 has been approved by
…-Simulacrum Make [u8]::cmp implementation branchless The current implementation generates rather ugly assembly code, branching when the common parts are equal. By performing the comparison of the lengths upfront using a subtraction, the assembly gets much prettier: https://godbolt.org/z/4e5fnEKGd. This will probably not impact speed too much, as the expensive part is in most cases the `memcmp`, but it sure looks better (I'm porting a sorting algorithm currently, and that branch just bothered me).
…askrgr Rollup of 10 pull requests
Successful merges:
- rust-lang#92366 (Resolve concern of `derive_default_enum`)
- rust-lang#93382 (Add a bit more padding in search box)
- rust-lang#93962 (Make [u8]::cmp implementation branchless)
- rust-lang#94015 (rustdoc --check option documentation)
- rust-lang#94017 (Clarify confusing UB statement in MIR)
- rust-lang#94020 (Support pretty printing of invalid constants)
- rust-lang#94027 (Update browser UI test version)
- rust-lang#94037 (Fix inconsistent symbol mangling with -Zverbose)
- rust-lang#94045 (Update books)
- rust-lang#94054 (:arrow_up: rust-analyzer)
Failed merges:
r? `@ghost` `@rustbot` modify labels: rollup
The current implementation generates rather ugly assembly code, branching when the common parts are equal. By performing the comparison of the lengths upfront using a subtraction, the assembly gets much prettier: https://godbolt.org/z/4e5fnEKGd.
This will probably not impact speed too much, as the expensive part is in most cases the `memcmp`, but it sure looks better (I'm porting a sorting algorithm currently, and that branch just bothered me).
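For readers who want to see the shape of the idea, here is a minimal sketch of "compute the length difference upfront, then fall back to it when the common prefix is equal". It is not the code from this PR: `cmp_bytes_sketch` is a hypothetical name, and a plain safe loop stands in for the `memcmp` call, so the assembly it produces will not match the Godbolt links above.

```rust
use std::cmp::Ordering;

/// Hypothetical illustration only; not the standard library implementation.
pub fn cmp_bytes_sketch(left: &[u8], right: &[u8]) -> Ordering {
    // A slice never holds more than isize::MAX bytes, so both casts are
    // lossless and the subtraction cannot overflow.
    let diff = left.len() as isize - right.len() as isize;

    // Length of the common prefix that both slices cover.
    let len = left.len().min(right.len());

    // Stand-in for memcmp (an assumption of this sketch): find the first
    // differing byte in the common prefix and record the signed difference.
    let mut order: isize = 0;
    for (a, b) in left[..len].iter().zip(&right[..len]) {
        if a != b {
            order = *a as isize - *b as isize;
            break;
        }
    }

    // If the common prefix is equal, the precomputed length difference
    // decides the result. Written this way, the compiler is free to emit a
    // conditional move here rather than a jump.
    if order == 0 {
        order = diff;
    }

    // Only the sign matters: negative is Less, zero is Equal, positive is Greater.
    order.cmp(&0)
}
```

The result should agree with the built-in ordering for any pair of byte slices, for example `cmp_bytes_sketch(b"ab", b"abc") == b"ab"[..].cmp(&b"abc"[..])`.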