-
Notifications
You must be signed in to change notification settings - Fork 14k
Add alignment parameter to simd_masked_{load,store}
#147355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The Miri subtree was changed cc @rust-lang/miri Portable SIMD is developed in its own repository. If possible, consider making this change to rust-lang/portable-simd instead. cc @calebzulawski, @programmerjake Some changes occurred to the platform-builtins intrinsics. Make sure the cc @antoyo, @GuillaumeGomez, @bjorn3, @calebzulawski, @programmerjake Some changes occurred in compiler/rustc_codegen_ssa Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter |
This comment has been minimized.
This comment has been minimized.
tests/codegen-llvm/simd-intrinsic/simd-intrinsic-generic-masked-store.rs
Outdated
Show resolved
Hide resolved
|
If we only need normally-aligned and unaligned loads, IMO it'd be better to just have a const generic boolean indicating which of them we want for any particular operation. That avoids ad-hoc hacks such as const parameters in intrinsics. |
for portable-simd I think we should default to element-level-alignment since I expect that to be more efficient than unaligned ops on some targets (GPUs? maybe RISC-V V?) |
Yes, so...? IIUC we either want element-level alignment or no alignment, so we can just have a |
I thought you meant full-simd-type alignment or unaligned, since that's what x86 uses for simd instructions it calls aligned. |
|
This is about |
|
As a summary, we need 3 types of alignments
So a bool flag won't cut it, at best we can use a const generic parameter, with 0 meaning element size aligned (because that is the most used, and can't be specified using const generics (requires gce)) |
Now you are expanding the scope of the PR. So far the motivation has been, we'd like an unaligned version of the existing intrinsics. If you also want SIMD type aligned variants, the PR description needs to be expanded to argue for this. IIRC, last time this was looked into, the SIMD type alignment option wasn't necessary -- LLVM was more than able to use surrounding info on reference types to deduce the right alignment for the desired codegen. So please show some concrete undesirable codegen if you want to motivate a form of this intrinsic that requires SIMD type alignment. |
|
I apologise if I was unclear, but the motivation was always adding these 3 types of loads. LLVM will always generate an unaligned (byte-aligned) load/store if we pass any alignment less that the vector type size (because it is guaranteed to be safe). But for |
I don't think the "always" here is correct. If we are loading from an However, I guess stdarch uses raw pointers in its API. So yeah this definitely needs to be explained properly in the PR description, currently it is at best confusing. |
|
If we need 3 different alignment modes (Cc @Amanieu for the stdarch part here), that can still be done using const generics with a new 3-variant enum (similar to the enum we have for atomic memory access orderings). |
|
Yes, std::arch tries to reflect the type signatures used by the C vendor functions... it's not exactly just "bindgen for vendor functions", but it kinda is bindgen for vendor functions. |
|
I'm happy with the current API that takes a constant (either as an argument or a const generic). An enum doesn't really provide much of an advantage when the desired alignment can just be explicitly provided. |
|
If everyone agrees, I can substitute the const argument for a const-generic |
The enum provides the big advantage that we don't need more ad-hock "constant argument" hacks. What I was hoping to get from you is confirmation on which forms of the intrinsic are needed for stdarch. |
I would prefer that over the "constant argument" hack. Not sure if it's better than a 3-value enum but 🤷 . |
|
r? @RalfJung though feel free to reassigned |
This comment has been minimized.
This comment has been minimized.
|
☔ The latest upstream changes (presumably #148446) made this pull request unmergeable. Please resolve the merge conflicts. |
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@rustbot ready |
Rollup of 6 pull requests Successful merges: - #147355 (Add alignment parameter to `simd_masked_{load,store}`) - #147925 (Fix tests for big-endian) - #148341 (compiler: Fix a couple issues around cargo feature unification) - #148371 (Dogfood `trim_{suffix|prefix}` in compiler) - #148495 (Implement Path::is_empty) - #148502 (rustc-dev-guide subtree update) r? `@ghost` `@rustbot` modify labels: rollup
Rollup of 6 pull requests Successful merges: - #147355 (Add alignment parameter to `simd_masked_{load,store}`) - #147925 (Fix tests for big-endian) - #148341 (compiler: Fix a couple issues around cargo feature unification) - #148371 (Dogfood `trim_{suffix|prefix}` in compiler) - #148495 (Implement Path::is_empty) - #148502 (rustc-dev-guide subtree update) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #147355 - sayantn:masked-loads, r=RalfJung,bjorn3 Add alignment parameter to `simd_masked_{load,store}` This PR adds an alignment parameter in `simd_masked_load` and `simd_masked_store`, in the form of a const-generic enum `core::intrinsics::simd::SimdAlign`. This represents the alignment of the `ptr` argument in these intrinsics as follows - `SimdAlign::Unaligned` - `ptr` is unaligned/1-byte aligned - `SimdAlign::Element` - `ptr` is aligned to the element type of the SIMD vector (default behavior in the old signature) - `SimdAlign::Vector` - `ptr` is aligned to the SIMD vector type The main motive for this is stdarch - most vector loads are either fully aligned (to the vector size) or unaligned (byte-aligned), so the previous signature doesn't cut it. Now, stdarch will mostly use `SimdAlign::Unaligned` and `SimdAlign::Vector`, whereas portable-simd will use `SimdAlign::Element`. - [x] `cg_llvm` - [x] `cg_clif` - [x] `miri`/`const_eval` ## Alternatives Using a const-generic/"const" `u32` parameter as alignment (and we error during codegen if this argument is not a power of two). This, although more flexible than this, has a few drawbacks - If we use an const-generic argument, then portable-simd somehow needs to pass `align_of::<T>()` as the alignment, which isn't possible without GCE - "const" function parameters are just an ugly hack, and a pain to deal with in non-LLVM backends We can remedy the problem with the const-generic `u32` parameter by adding a special rule for the element alignment case (e.g. `0` can mean "use the alignment of the element type), but I feel like this is not as expressive as the enum approach, although I am open to suggestions cc `@workingjubilee` `@RalfJung` `@BoxyUwU`
Rollup of 6 pull requests Successful merges: - rust-lang/rust#147355 (Add alignment parameter to `simd_masked_{load,store}`) - rust-lang/rust#147925 (Fix tests for big-endian) - rust-lang/rust#148341 (compiler: Fix a couple issues around cargo feature unification) - rust-lang/rust#148371 (Dogfood `trim_{suffix|prefix}` in compiler) - rust-lang/rust#148495 (Implement Path::is_empty) - rust-lang/rust#148502 (rustc-dev-guide subtree update) r? `@ghost` `@rustbot` modify labels: rollup
| let memflags = match alignment { | ||
| SimdAlign::Unaligned => MemFlags::new().with_notrap(), | ||
| _ => MemFlags::trusted(), | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you accidentally added this to simd_gather rather than simd_masked_load.
Edit: Fixed in rust-lang/rustc_codegen_cranelift@a0b865d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh shoot, sorry for the inconvenience.
Rollup of 6 pull requests Successful merges: - rust-lang/rust#147355 (Add alignment parameter to `simd_masked_{load,store}`) - rust-lang/rust#147925 (Fix tests for big-endian) - rust-lang/rust#148341 (compiler: Fix a couple issues around cargo feature unification) - rust-lang/rust#148371 (Dogfood `trim_{suffix|prefix}` in compiler) - rust-lang/rust#148495 (Implement Path::is_empty) - rust-lang/rust#148502 (rustc-dev-guide subtree update) r? `@ghost` `@rustbot` modify labels: rollup
This PR adds an alignment parameter in
simd_masked_loadandsimd_masked_store, in the form of a const-generic enumcore::intrinsics::simd::SimdAlign. This represents the alignment of theptrargument in these intrinsics as followsSimdAlign::Unaligned-ptris unaligned/1-byte alignedSimdAlign::Element-ptris aligned to the element type of the SIMD vector (default behavior in the old signature)SimdAlign::Vector-ptris aligned to the SIMD vector typeThe main motive for this is stdarch - most vector loads are either fully aligned (to the vector size) or unaligned (byte-aligned), so the previous signature doesn't cut it.
Now, stdarch will mostly use
SimdAlign::UnalignedandSimdAlign::Vector, whereas portable-simd will useSimdAlign::Element.cg_llvmcg_clifmiri/const_evalAlternatives
Using a const-generic/"const"
u32parameter as alignment (and we error during codegen if this argument is not a power of two). This, although more flexible than this, has a few drawbacksalign_of::<T>()as the alignment, which isn't possible without GCEWe can remedy the problem with the const-generic
u32parameter by adding a special rule for the element alignment case (e.g.0can mean "use the alignment of the element type), but I feel like this is not as expressive as the enum approach, although I am open to suggestionscc @workingjubilee @RalfJung @BoxyUwU