checked_ilog: improve performance #115913
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @joshtriplett (or someone else) soon. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
Force-pushed from b1a392a to 00035b5
Force-pushed from bc64e97 to 3de51c9
r? libs
The implementation looks correct, and we do have exhaustive tests for u16, so that part should be fine. Multiplication should also be quite obviously better than division. But since you added benchmarks anyway, can you add the before/after benchmark results to the PR comment for future reference?
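To illustrate the reviewer's point about multiplication versus division, here is a minimal sketch of the two loop shapes for `u16`. The names `ilog_div` and `ilog_mul` are hypothetical and not taken from the PR; the actual library code is written for all integer types and may differ in detail.

```rust
/// Hypothetical division-based sketch: one division per loop iteration.
fn ilog_div(x: u16, base: u16) -> Option<u32> {
    if x == 0 || base <= 1 {
        return None;
    }
    let mut n = 0;
    let mut r = x;
    while r >= base {
        r /= base;
        n += 1;
    }
    Some(n)
}

/// Hypothetical multiplication-based sketch: one division up front,
/// then one multiplication per loop iteration.
fn ilog_mul(x: u16, base: u16) -> Option<u32> {
    if x == 0 || base <= 1 {
        return None;
    }
    // `r <= x / base` guarantees `r * base <= x`, so the product cannot overflow.
    let boundary = x / base;
    let mut n = 0;
    let mut r: u16 = 1; // invariant: r == base^n and r <= x
    while r <= boundary {
        r *= base;
        n += 1;
    }
    Some(n)
}
```

Both sketches can be compared against the standard library's `u16::checked_ilog` over every `x` for a handful of bases, in the spirit of the exhaustive u16 tests mentioned above.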
I ran the benchmarks included in the fourth commit. For "before", I cherry-picked just the fourth commit onto the master branch; for "after", I rebased the whole PR onto master. Results below:
Looks great. Thanks.
@bors r+ rollup
checked_ilog: improve performance Addresses rust-lang#115874. (This PR replicates the original rust-lang#115875, which I accidentally closed by deleting my forked repository...)
💔 Test failed - checks-actions
Looks like a flaky test in miri; other PRs have run into that too.
@bors retry
…llaumeGomez
Rollup of 7 pull requests

Successful merges:
- rust-lang#115913 (checked_ilog: improve performance)
- rust-lang#124178 ([cleanup] [llvm backend] Prevent creating the same `Instance::mono` multiple times)
- rust-lang#124183 (Stop taking `ParamTy`/`ParamConst`/`EarlyParamRegion`/`AliasTy` by ref)
- rust-lang#124217 (coverage: Prepare for improved branch coverage)
- rust-lang#124230 (Stabilize generic `NonZero`.)
- rust-lang#124252 (Improve ICE message for forbidden dep-graph reads.)
- rust-lang#124268 (Update books)

r? `@ghost`
`@rustbot` modify labels: rollup
Rollup merge of rust-lang#115913 - FedericoStra:checked_ilog, r=the8472
checked_ilog: improve performance
Addresses rust-lang#115874.
(This PR replicates the original rust-lang#115875, which I accidentally closed by deleting my forked repository...)
Unroll first iteration of checked_ilog loop
This follows the optimization of rust-lang#115913. As shown in rust-lang#115913 (comment), performance improved in all important cases, but regressions were introduced for the benchmarks `u32_log_random_small`, `u8_log_random` and `u8_log_random_small`. rust-lang#115913 changed the implementation from one division per iteration to one multiplication per iteration plus a single division. When zero iterations are needed, that is a regression from zero divisions to one. This PR avoids the regression by returning `Some(0)` early when no iterations are needed, which skips the division. In all other cases it also saves one multiplication.
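The unrolling described above can be sketched roughly as follows for `u32`. This is only an illustration under stated assumptions: the name `ilog_unrolled` is hypothetical, and the actual library implementation may differ in detail.

```rust
/// Hypothetical sketch of a `checked_ilog`-style function with the first
/// loop iteration unrolled. Not the actual library code.
fn ilog_unrolled(x: u32, base: u32) -> Option<u32> {
    if x == 0 || base <= 1 {
        // The logarithm is undefined for these inputs.
        None
    } else if x < base {
        // Zero iterations needed: return early without performing a division.
        Some(0)
    } else {
        // Here x >= base, so the result is at least 1. Starting from
        // r = base^1 saves one multiplication compared with starting from 1.
        let mut n = 1;
        let mut r = base;
        // `r <= x / base` guarantees `r * base <= x`, so no overflow occurs.
        let boundary = x / base;
        while r <= boundary {
            r *= base;
            n += 1;
        }
        Some(n)
    }
}
```

For example, `ilog_unrolled(5, 10)` takes the early-return branch and performs no division, while `ilog_unrolled(1000, 10)` performs one division and two multiplications.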
Addresses #115874.
(This PR replicates the original #115875, which I accidentally closed by deleting my forked repository...)