Suboptimal code generated for alignment checks and similar via number.trailing_zeros() >= bit_count
#107554
Comments
https://rust-lang.github.io/rust-clippy/master/#verbose_bit_mask Clippy does at least list it as a possible downside of the lint.
This was fixed by https://reviews.llvm.org/D143368.
Thanks. Any estimate of when it will land in the Rust compiler (TBH I have no idea how, or how often, LLVM is pulled)? I just tested with nightly and the generated code is still suboptimal.
Sorry, I also don't know when it will land on the Rust side. (Hopefully the LLVM release will be in a few weeks, as you may know: https://github.com/llvm/llvm-project/milestone/20.)
This will land in Rust nightly around August with the LLVM 17 upgrade. Assigning to myself to check back when it's time.
Godbolt: https://godbolt.org/z/KP7s3vjTn Fixed by #114048, needs codegen test.
… r=nikic add codegen test for `trailing_zeros` comparison This PR add codegen test for rust-lang#107554 (comment) Fixes rust-lang#107554.
add codegen test for `trailing_zeros` comparison This PR add codegen test for rust-lang/rust#107554 (comment) Fixes #107554.
Typically, one would write `val & 7 == 0` to check whether `val` is aligned to 8B. However, Clippy complains and says it would be nicer to write it as `val.trailing_zeros() >= 3`. Although it is disputable whether this is really more readable, the problem is that the code generated is significantly worse.
For example, let's take this code:
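The snippet itself is missing from this copy of the report. As a minimal sketch of the comparison being described, assuming a `u64` input and the hypothetical function names `is_aligned_mask` and `is_aligned_tz` (neither is from the original), the two forms might look like this:

```rust
/// Alignment check written with an explicit bit mask.
/// Hypothetical name, used here only for illustration.
pub fn is_aligned_mask(val: u64) -> bool {
    // In Rust, `&` binds tighter than `==`, so this is `(val & 7) == 0`.
    val & 7 == 0
}

/// The same check in the form suggested by Clippy's `verbose_bit_mask` lint.
/// Hypothetical name, used here only for illustration.
pub fn is_aligned_tz(val: u64) -> bool {
    // At least three trailing zero bits means the value is a multiple of 8.
    // For `val == 0`, `trailing_zeros()` returns 64, so the check still holds.
    val.trailing_zeros() >= 3
}
```

Both functions return the same result for every input, so an optimizer can in principle lower them to the same instructions.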
I expected to see the same optimal code generated. However, the compiler generates a separate instruction for `trailing_zeros()` and an additional compare, instead of a single instruction.
Code generated on x64:
Code generated on ARM:
This happens with the newest Rust 1.67 as well as with older versions and in nightly.
Checking of `trailing_zeros`/`trailing_ones` and `leading_zeros`/`leading_ones` with `>`/`>=` operators against `n` can be mapped to checking via a mask of `n+1`/`n` ones at the tail (for `trailing_*`) or head (for `leading_*`) of the mask word and comparing against 0 for `*_zeros` (which is implicitly done and set as ZERO/EQ flag in CPU flags after the TEST operation, i.e., it boils down to a single instruction) or the mask word for `*_ones` (which boils down to two instructions).
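The mapping above can be sketched in Rust for `u32`. This is an illustration only, not what LLVM emits: the helper names (`low_mask`, `high_mask`, `tz_at_least`, and so on) are hypothetical, `n` is assumed to be at most 32, and only the `>=` forms are shown. For `n` below the bit width, `> n` is the same as `>= n + 1`, which is where the `n+1` ones come from.

```rust
/// Mask with the lowest `n` bits set (assumes `n <= 32`).
pub fn low_mask(n: u32) -> u32 {
    // `checked_shl` returns `None` for `n >= 32`; in that case every bit is set.
    1u32.checked_shl(n).map_or(u32::MAX, |bit| bit - 1)
}

/// Mask with the highest `n` bits set (assumes `n <= 32`).
pub fn high_mask(n: u32) -> u32 {
    !low_mask(u32::BITS - n)
}

/// Equivalent to `x.trailing_zeros() >= n`: the low `n` bits are all zero.
pub fn tz_at_least(x: u32, n: u32) -> bool {
    (x & low_mask(n)) == 0
}

/// Equivalent to `x.trailing_ones() >= n`: the low `n` bits are all one.
pub fn to_at_least(x: u32, n: u32) -> bool {
    (x & low_mask(n)) == low_mask(n)
}

/// Equivalent to `x.leading_zeros() >= n`: the high `n` bits are all zero.
pub fn lz_at_least(x: u32, n: u32) -> bool {
    (x & high_mask(n)) == 0
}

/// Equivalent to `x.leading_ones() >= n`: the high `n` bits are all one.
pub fn lo_at_least(x: u32, n: u32) -> bool {
    (x & high_mask(n)) == high_mask(n)
}
```

The `*_zeros` checks compare the masked value against 0, while the `*_ones` checks compare it against the mask itself, which matches the one-instruction versus two-instruction distinction described above.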