-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize try_eval_bits to avoid layout queries #64673
Conversation
@bors try @rust-timer queue |
Awaiting bors try build completion |
Optimize match checking to avoid layout queries In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. We fallback to the (slower) previous code if that fails, so this is not a behavior change. r? @Centril who I believe knows this code enough, but if not feel free to re-assign
☀️ Try build successful - checks-azure |
Queued 9b2e58c with parent ed8b708, future comparison URL. |
Finished benchmarking try commit 9b2e58c, comparison URL. |
unicode_normalization is ~30% faster! 3 other benchea got ~2% slower. The rest is stable. |
511352c
to
9c62117
Compare
Okay, moved into the try_eval_bits function -- locally that shows that this unicode_normalization is a bit slower (~0.05s) after doing so, but that might be noise (although I get pretty consistent results). Let's get some official results though @bors try @rust-timer queue |
Awaiting bors try build completion |
Optimize match checking to avoid layout queries In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. We fallback to the (slower) previous code if that fails, so this is not a behavior change. r? @Centril who I believe knows this code enough, but if not feel free to re-assign
☀️ Try build successful - checks-azure |
Queued 64687bb with parent 4ff32c0, future comparison URL. |
Compilation of four crates failed during benchmarking:
|
Finished benchmarking try commit 64687bb, comparison URL. |
Failures are unrelated to this PR; they're due to newly introduced self-profile functionality for rustc. |
Looks like ~no difference which is good -- more general code is probably better. It's probably more likely this would've had a difference with const generics being used in the set of crate benchmarks. @oli-obk I believe this should be ready to merge. |
@bors r+ |
📌 Commit 9c62117 has been approved by |
@bors rollup=never |
@bors p=3 |
⌛ Testing commit 9c62117 with merge eea282a8233db92be72b4de43c575023a00f07f8... |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
💔 Test failed - checks-azure |
This specifically targets match checking, but is possibly more widely useful as well. In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. It may be worth considering adding a `size_of` query in the future which might be far faster, especially if specialized for "const" cases -- match arms being the most obvious example. It's possibly such a function would benefit from *not* being a query as well, since it's trivially evaluatable from the sty for many cases whereas a query needs to hash the input and such.
9c62117
to
06c6e75
Compare
@bors r=oli-obk |
📌 Commit 06c6e75 has been approved by |
Optimize try_eval_bits to avoid layout queries This specifically targets match checking, but is possibly more widely useful as well. In code with large, single-value match statements, we were previously spending a lot of time running layout_of for the primitive types (integers, chars) -- which is essentially useless. This optimizes the code to avoid those query calls by directly obtaining the size for these types, when possible. It may be worth considering adding a `size_of` query in the future which might be far faster, especially if specialized for "const" cases -- match arms being the most obvious example. It's possibly such a function would benefit from *not* being a query as well, since it's trivially evaluatable from the sty for many cases whereas a query needs to hash the input and such.
☀️ Test successful - checks-azure |
The `if let Some(val) = value.try_eval_bits(...)` branch in `from_const()` is very hot for the `unicode_normalization` benchmark. This commit introduces a special-case alternative for scalars that avoids `try_eval_bits()` and all the functions it calls (`Const::eval()`, `ConstValue::try_to_bits()`, `ConstValue::try_to_scalar()`, and `Scalar::to_bits()`), instead extracting the result immediately. The type and value checking done by `Scalar::to_bits()` is replicated by moving it into a new function `Scalar::check_raw()` and using that new function in the special case. PR rust-lang#64673 introduced some special-case handling of scalar types in `Const::try_eval_bits()`. This handling is now moved out of that function into the new `IntRange::integral_size_and_signed_bias` function. This commit reduces the instruction count for `unicode_normalization-check-clean` by about 10%.
This specifically targets match checking, but is possibly more widely
useful as well. In code with large, single-value match statements, we
were previously spending a lot of time running layout_of for the
primitive types (integers, chars) -- which is essentially useless. This
optimizes the code to avoid those query calls by directly obtaining the
size for these types, when possible.
It may be worth considering adding a
size_of
query in the future whichmight be far faster, especially if specialized for "const" cases --
match arms being the most obvious example. It's possibly such a function
would benefit from not being a query as well, since it's trivially
evaluatable from the sty for many cases whereas a query needs to hash
the input and such.