Missed optimization with autovectorized saturating truncation (clamp) #104875
Comments
Hi, I would like to take this.
@braw-lee Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting a patch. Good luck!
The proof at https://alive2.llvm.org/ce/z/pRRVhU is specific to the case above, i.e. a range between 0 and 255 and a conversion from i32 to i8. I don't think we should be performing instruction combining for hard-coded values.
We already have this optimization for constant values, but the thing to notice here is we are using so do we need to fix LLVM or Rust codegen to get this optimization for constant values? @dtcxzyw But we do need to add this optimization where the clamp ranges are variables and (min < max).
The umin is produced by LLVM, not Rust, because both arguments are known non-negative. I believe the pattern you want to handle is https://alive2.llvm.org/ce/z/qc9GCG (with constant min/max). And then as a next step the variation where there is an additional
I don't get how both args are known non-negative here; what if the
@braw-lee The non-negative inputs here are 0 (in the icmp and select) and 255, not %input.
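One way to read the point about non-negative bounds, as a rough sketch: once the `x < 0` case has been routed to 0, the remaining values are non-negative, so a signed min against 255 and an unsigned min against 255 give the same result. The function names below are hypothetical and only illustrate the arithmetic; they are not taken from LLVM or from the issue's IR.

```rust
/// Clamp to 0..=255 using a signed comparison and a signed min.
pub fn clamp_signed_min(x: i32) -> i32 {
    if x < 0 { 0 } else { x.min(255) }
}

/// Same clamp, but the min is taken on the unsigned reinterpretation of `x`.
/// This is only reached when `x >= 0`, where `x as u32` keeps the same value.
pub fn clamp_unsigned_min(x: i32) -> i32 {
    if x < 0 { 0 } else { (x as u32).min(255) as i32 }
}

/// Returns true if both forms agree on a handful of boundary inputs.
pub fn forms_agree_on_samples() -> bool {
    [i32::MIN, -256, -1, 0, 1, 254, 255, 256, i32::MAX]
        .iter()
        .all(|&x| clamp_signed_min(x) == clamp_unsigned_min(x))
}
```

The negative inputs never reach the min at all, which is why the signedness of the min does not matter here.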
@nikic So currently, what we do for a pattern like this
is take the smax of LHS and RHS, so I tried to use this code to handle the case where we have a signed comparison and an unsigned min/max
but this is incorrect according to alive-tv; the correct transformations would be these
but it's difficult to generate the correct transformations using the current code (which creates a min/max intrinsic around LHS and RHS).
In Rust, trying to clamp and truncate from a slice of `i32` to a slice of `u8` using the standard library's clamp function produces more instructions than manually clamping with `max` and `min`.
https://rust.godbolt.org/z/zf73jsqjq
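For reference, the two variants being compared look roughly like the sketch below. The function names and signatures are assumptions for illustration; the exact source is in the godbolt link above.

```rust
/// Clamp each `i32` to 0..=255 with the standard library's `clamp`,
/// then truncate to `u8`.
pub fn clamp_std(input: &[i32], output: &mut [u8]) {
    for (out, &value) in output.iter_mut().zip(input) {
        *out = value.clamp(0, 255) as u8;
    }
}

/// Same result, written as an explicit `max` followed by `min`.
pub fn clamp_manual(input: &[i32], output: &mut [u8]) {
    for (out, &value) in output.iter_mut().zip(input) {
        *out = value.max(0).min(255) as u8;
    }
}
```

Both functions write the same bytes for every input; the difference reported here is only in the instructions the autovectorizer emits.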
Assembly instructions
Clamp
Manual clamp
Emitted IR - https://alive2.llvm.org/ce/z/hbU88w
Clamp
Manual clamp
The standard library clamp is implemented as follows.
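As a rough sketch of that shape (an assertion on the bounds followed by two comparisons), written here as a hypothetical free function over `i32` rather than copied from the library source:

```rust
/// Branch-based clamp in the style of `Ord::clamp`.
/// `clamp_branches` is an illustrative name, not a standard library item.
pub fn clamp_branches(value: i32, min: i32, max: i32) -> i32 {
    assert!(min <= max);
    if value < min {
        min
    } else if value > max {
        max
    } else {
        value
    }
}

/// The manual form from this issue: `max` with the lower bound,
/// then `min` with the upper bound.
pub fn clamp_max_min(value: i32, min: i32, max: i32) -> i32 {
    value.max(min).min(max)
}
```

When `min <= max` holds, the two forms return the same value for every input, which is what makes rewriting one into the other plausible.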
It seems OK to transform the standard library clamp to the manual clamp, minimized IR.
alive2 proof - https://alive2.llvm.org/ce/z/pRRVhU
Rust source - https://rust.godbolt.org/z/3PxW1xWqo
Real world examples from functions in the `image-webp` crate
https://rust.godbolt.org/z/veGzv1dPx - source
https://rust.godbolt.org/z/sf4v6ceGM - source
Originally reported in rust-lang/rust#125738