Improve cg_llvm codegen for simd_select_bitmask
#147119
Open
+42
−10
While testing out an intrinsic-based implementation in stdarch, I noticed that LLVM can't optimize it well if we use simple `bitcast`s and `trunc`s to get the actual `i1xN`s from the bitmasks. LLVM seems to prefer `shufflevector`s here, and for autoupgraded intrinsics that used integer returns but were migrated to use `i1xN` returns, LLVM itself uses this exact form (https://godbolt.org/z/YT99Phzj5).

This doesn't affect codegen for vectors whose length is a multiple of 8. It is mostly targeted at 2- and 4-width vectors.
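For reviewers less familiar with the two lowerings, here is a rough scalar model in plain Rust of what each one computes for narrow vectors. It is only a sketch: the function names are hypothetical and are not part of cg_llvm or stdarch, the mask is assumed to be a `u8`, and lane `i` is assumed to map to bit `i` (little-endian lane order).

```rust
/// Models the `trunc` + `bitcast` form: drop the unused high bits of the
/// mask first, then read one bit per lane. (Hypothetical helper.)
fn lanes_via_trunc<const N: usize>(mask: u8) -> [bool; N] {
    assert!(N <= 8);
    // Keep only the low N bits, as a truncation to an N-bit integer would.
    let truncated = (mask as u16 & ((1u16 << N) - 1)) as u8;
    std::array::from_fn(|i| (truncated >> i) & 1 == 1)
}

/// Models the `bitcast` + `shufflevector` form: expand the whole mask into
/// eight boolean lanes, then pick out the first N. (Hypothetical helper.)
fn lanes_via_shuffle<const N: usize>(mask: u8) -> [bool; N] {
    assert!(N <= 8);
    let all: [bool; 8] = std::array::from_fn(|i| (mask >> i) & 1 == 1);
    std::array::from_fn(|i| all[i])
}

/// Scalar model of a bitmask select: lane `i` takes `if_true[i]` when its
/// mask bit is set, otherwise `if_false[i]`. (Hypothetical helper.)
fn select_bitmask_model<const N: usize>(
    mask: u8,
    if_true: [i32; N],
    if_false: [i32; N],
) -> [i32; N] {
    let lanes = lanes_via_shuffle::<N>(mask);
    std::array::from_fn(|i| if lanes[i] { if_true[i] } else { if_false[i] })
}
```

Both helpers yield the same lanes for any mask, so the change should only affect how well LLVM optimizes the result, not what it computes. For `N == 8` the shuffle step is the identity, which matches the note above that multiple-of-8 lengths are unaffected.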
@rustbot A-LLVM A-codegen T-compiler
r? codegen