Tracking issue for WebAssembly SIMD support #74372
Comments
This commit updates the src/stdarch submodule primarily to include rust-lang/stdarch#874 which updated and revamped WebAssembly SIMD intrinsics and renamed WebAssembly atomics intrinsics. This is all unstable surface area of the standard library so the changes should be ok here. The SIMD updates also enable SIMD intrinsics to be used by any program at any time, yay! cc rust-lang#74372, a tracking issue I've opened for the stabilization of SIMD intrinsics
Is there a way to enable wasm SIMD globally so that it can be adopted by the autovectorizer? Emscripten seems to have such a feature: https://emscripten.org/docs/porting/simd.html
|
Enabling |
They really should, it doesn't seem like the compiler can reason about it... though maybe that is because the compiler doesn't even inline any of the intrinsics at all, which seems like a decently big problem: |
A lot of the Examples:
I would keep |
I agree For I've been adding most of the intrinsics so far but unfortunately I haven't had the opportunity to write a compiled-to-wasm thing which actually uses the instructions. In that sense I'm mostly shooting in the dark as to what the APIs should be. So far I've tried to maximize availability (ensuring there's a function-per-instruction, regardless of how silly it is to have) and stick close to the most standardized piece, the spec names/type signatures. I did this to stay in the spirit of the x86 intrinsics which are copied verbatim from the spec and we provide virtually no abstractions to make them nice to use (even in some cases where it would be easy to do so). Overall, though, I'm not sure if this is the best tradeoff for wasm. WebAssembly isn't the same as x86 in this regard, so we may get more bang for our buck by putting more thought and effort into what a usable API would be (at the cost of time for stabilization, of course). |
tl;dr; I'd like to ask if a @rust-lang/libs team member would be willing to Ok I think enough things have landed now that I'd like to propose that this To recap, this tracking proposal is for The The design principles for the SIMD intrinsics in the
It's worth noting that Clang has a header file for WebAssembly
The intention behind these design decisions is that all the functionality of the
Implementation Status
It's worth touching on the implementation status of this proposal currently as It's also worth mentioning that this is all a relatively new feature in LLVM. |
I'm still somewhat skeptical of those being the names for the |
There was a small bikeshed here about those names where I don't think, though, that this prevents us from having dedicated types for each lane width because functions go into the value namespace and structs go into the type namespace. |
Seems like it prevents the type from being a tuple struct though: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40b1e2abd2ade54dd598db826ed594bf But as long as using a non-tuple struct for those types doesn't cause any problems such as repr(simd) or so not working anymore, then I guess it's fine. |
I think it's appropriate to start checking for consensus here, and we can use concerns to track blockers. @rfcbot merge stdarch needs an update to a version matching the proposal: I'd love to see |
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
@alexcrichton Amazing work! I think this all pretty much LGTM but I do have a few questions: Firstly, it looks like the target feature name is just Secondly, with respect to safe target feature, I see this in the docs today:
If it does get dynamic detection, does that change the discussion around a safe target feature? Thirdly and lastly, have we considered using a naming scheme that matches the WASM spec as closely as possible? e.g., We invent names where necessary, but otherwise do a mechanical translation of things (e.g., |
Good questions and thanks for reading over this! The name Currently WebAssembly doesn't have any form of dynamic detection for supported features, and even on the horizon I'm not sure there's really anything viable for adding this in a form that looks like x86. The closest equivalent for WebAssembly is conditional sections but afaik that proposal is sort of dead in the water right now and doesn't have a future. There are possible alternatives of a roughly similar shape, but it's all about selection at compile time instead of runtime. Basically I think it's tough to answer precisely what would happen if wasm gets dynamic detection because it's unclear to me what it means to get dynamic detection. My guess, though, is that nothing will ever be For naming, that is indeed an alternative! I actually originally implemented that. I ended up coming around to Rust-specific names for a few reasons though:
I originally thought 1:1 with instructions would be the way to go. After using it a bit, though, and getting other feedback, I've personally come to the conclusion that the best option is to design the names specifically for Rust. Given the size of the instruction set I think this is feasible (unlike x86) and given the downsides of matching wasm exactly I think we have a lot to gain in terms of readability of code itself. What I don't necessarily have a great gauge on is how folks will typically write SIMD-accelerated wasm code. If they start from desired instructions and work backwards I think renaming things could impede this. If they start from the That's at least my thinking on the names so far. I don't really mind implementing either way, and I don't personally lean super strongly either way. I'm just currently leaning more towards custom names from Rust. (it's worth pointing out the custom Rust names more closely match what C is doing currently in clang as well, which is to diverge from the spec and use nicer prefixes and have conveniences) |
RE RE dynamic detection: I think if you're certain that the intrinsics won't become unsafe to call some day, then I think that assuages the root of my concern there. RE naming:
I think I'm with you there, and personally, absent other data, I'm inclined to defer to your experience actually using the routines. I think we're in agreement about the trade offs here, and in particular, that having precisely matching names is a nice property. But perhaps the only reason I appreciate it so much for x86 is indeed because of its size, which doesn't apply here. And if the C folks are devising their own names too, then I guess we might as well too. One nice thing to do (not a stabilization blocker of course) would be to add the WASM spec name instruction to the docs of each intrinsic and add a doc alias for them. I think that might actually get us the best of both worlds. Then folks could type in the WASM spec name and get back the Rust equivalent. |
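The doc-alias idea above could look roughly like the sketch below. `i32x4_add_model` is a hypothetical scalar stand-in written for illustration, not the real intrinsic; only the `#[doc(alias = "...")]` attribute is the point here, carrying the spec name `i32x4.add` so that a rustdoc search for the spec name finds the Rust item.

```rust
/// Lane-wise wrapping addition of two 4-lane `i32` vectors.
///
/// Hypothetical scalar model for illustration only. The alias below makes
/// a rustdoc search for the WebAssembly spec name `i32x4.add` land here.
#[doc(alias = "i32x4.add")]
pub fn i32x4_add_model(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [
        a[0].wrapping_add(b[0]),
        a[1].wrapping_add(b[1]),
        a[2].wrapping_add(b[2]),
        a[3].wrapping_add(b[3]),
    ]
}
```

The attribute changes nothing about how the function is called; it only affects documentation search.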
re: re: syntax, name-matching, etc: re: "Is this ready?" |
Should the bitmask instructions possibly return u8 / u16 / u32 based on the lane count? At the moment they all return i32 which is of course the type on "machine level", but since the intrinsics all take appropriate Rust integer types, it seems like the bitmask instructions don't properly reflect that. I noticed this because actually casting to u16 on a Oh, also I've started implementing a bunch of WASM SIMD for various crates |
Returning a smaller type makes sense to me, I can work on implementing that. You're thinking of For the naming this was indeed one aspect I was worried about. I was worried that there was no Also thanks for implementing these optimizations! Do you have some PRs to show as examples of ergonomics and such? |
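To make the return-type question concrete, here is a scalar sketch of what a 16-lane bitmask computes. `i8x16_bitmask_model` is a hypothetical name and not the standard library function; it only shows that 16 lanes produce exactly 16 meaningful bits, which is why `u16` describes the result more precisely than `i32`.

```rust
/// Hypothetical scalar model of a 16-lane bitmask: bit `i` of the result is
/// the sign bit of lane `i`, so the whole result fits in a `u16`.
pub fn i8x16_bitmask_model(lanes: [i8; 16]) -> u16 {
    let mut mask = 0u16;
    for (i, lane) in lanes.iter().enumerate() {
        if *lane < 0 {
            // A negative lane has its top bit set.
            mask |= 1 << i;
        }
    }
    mask
}
```

By the same reasoning, vectors with 8, 4, or 2 lanes need at most 8 bits.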
This commit updates the compiler's handling of the `#[target_feature]` attribute when applied to functions on WebAssembly-based targets. The compiler in general requires that any functions with `#[target_feature]` are marked as `unsafe` as well, but this commit relaxes the restriction for WebAssembly targets where the attribute can be applied to safe functions as well. The reason this is done is that the motivation for this feature of the compiler is not applicable for WebAssembly targets. In general the `#[target_feature]` attribute is used to enhance target CPU features enabled beyond the basic level for the rest of the compilation. If done improperly this means that your program could execute an instruction that the CPU you happen to be running on does not understand. This is considered undefined behavior where it is unknown what will happen (e.g. it's not a deterministic `SIGILL`). For WebAssembly, however, the target is different. It is not possible for a running WebAssembly program to execute an instruction that the engine does not understand. If this were the case then the program would not have validated in the first place and would not run at all. Even if this were allowed in some hypothetical future where engines have some form of runtime feature detection (which they do not right now) any implementation of such a feature would generate a trap if a module attempts to execute an instruction the module does not understand. This deterministic trap behavior would still not fall into the category of undefined behavior because the trap is deterministic. For these reasons the `#[target_feature]` attribute is now allowed on safe functions, but only for WebAssembly targets. This notably enables the wasm-SIMD intrinsics proposed for stabilization in rust-lang#74372 to be marked as safe generally instead of today where they're all `unsafe` due to the historical implementation of `#[target_feature]` in the compiler.
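A minimal sketch of what this relaxation permits, assuming the lane constructors and arithmetic intrinsics in `core::arch::wasm32` as they were later stabilized. `add4` is a hypothetical helper; the scalar version exists only so the snippet also builds on non-wasm hosts.

```rust
/// On wasm32 this is a *safe* function carrying `#[target_feature]`,
/// which is what the commit above allows.
#[cfg(target_arch = "wasm32")]
#[target_feature(enable = "simd128")]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use core::arch::wasm32::*;
    let sum = i32x4_add(
        i32x4(a[0], a[1], a[2], a[3]),
        i32x4(b[0], b[1], b[2], b[3]),
    );
    [
        i32x4_extract_lane::<0>(sum),
        i32x4_extract_lane::<1>(sum),
        i32x4_extract_lane::<2>(sum),
        i32x4_extract_lane::<3>(sum),
    ]
}

/// Scalar fallback for every other target, with the same wrapping semantics.
#[cfg(not(target_arch = "wasm32"))]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [
        a[0].wrapping_add(b[0]),
        a[1].wrapping_add(b[1]),
        a[2].wrapping_add(b[2]),
        a[3].wrapping_add(b[3]),
    ]
}
```

A caller on wasm32 can invoke `add4` without an `unsafe` block. If the engine lacks SIMD support, the module fails validation before any code runs.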
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon. |
This is a follow-up from rust-lang/rust#74372 which has finished FCP for the stabilization of wasm intrinsics. This marks them all stable, as-is and additionally marks the functions which create integer vectors as `const`-stable as well. The only remaining unstable bits are that `f32x4` and `f64x2` are `const`-unstable. Mostly just because I couldn't figure out how to make them `const`-stable.
I've posted the final PR for stabilization to #86204 |
std: Stabilize wasm simd intrinsics This commit performs two changes to stabilize Rust support for WebAssembly simd intrinsics: * The stdarch submodule is updated to pull in rust-lang/stdarch#1179. * The `wasm_target_feature` feature gate requirement for the `simd128` feature has been removed, stabilizing the name `simd128`. This should conclude the FCP started on rust-lang#74372 and... Closes rust-lang#74372
What will happen if code that depends on SIMD intrinsics is compiled without the necessary target feature being enabled? Will the compiler generate the respective SIMD instructions no questions asked (well, it will not inline them, but that's not important for the question)? Shouldn't it generate a compilation error? Otherwise the generated WASM may silently become unusable in runtimes without SIMD support (imagine an incorrect change somewhere deep in a project's dependency tree). |
As the documentation indicates:
where the "both options" are using The compiler will always generate simd instructions if you call intrinsics in |
Yes, but on other platforms those intrinsics are Also |
Isn't this a violation of RFC 2045? It seems to me that RFC 2396's requirement to make calling functions with target features on functions that don't have target features unsafe has not been implemented?
@newpavlov the docs you linked mention runtime detection proposals. |
Yes, I know and this is why I asked this question on reddit. But it looks like @alexcrichton has acted under the assumption that dynamic detection will not be added to WASM:
BTW I want to discuss this point from the same comment a bit:
In my understanding, unsafety of |
There's some more discussion on #84988 for reference, but my thoughts on this topic are:
Basically there are gotchas and subtleties about generating modules in WebAssembly that do use simd, don't use simd, or try to dynamically detect simd (assuming some sort of future proposal), but these are all build time concerns. Whatever happens the final module will have some defined semantics based on the instructions it's using (which are presumably all stable in the upstream wasm spec with clearly-defined semantics). This means that you'll either run the module on an engine understanding all instructions, in which case everything will behave exactly as expected, or you won't, in which case nothing will execute at all. Overall there is no situation where an unknown wasm instruction is executed. There are situations where your build isn't what you expect, but that's not UB, that's a configuration issue. |
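One way to read "these are all build time concerns" is the sketch below: the choice between a SIMD and a scalar body is made by `cfg` when the module is compiled, for example when building with `RUSTFLAGS="-C target-feature=+simd128"`. The `sum` helper is hypothetical, and the wasm path assumes the `core::arch::wasm32` functions as stabilized.

```rust
/// Compiled only when the whole build enables simd128 on wasm32.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
pub fn sum(xs: &[f32]) -> f32 {
    use core::arch::wasm32::*;
    let mut acc = f32x4_splat(0.0);
    let mut chunks = xs.chunks_exact(4);
    for c in &mut chunks {
        // Accumulate four lanes at a time.
        acc = f32x4_add(acc, f32x4(c[0], c[1], c[2], c[3]));
    }
    let mut total = f32x4_extract_lane::<0>(acc)
        + f32x4_extract_lane::<1>(acc)
        + f32x4_extract_lane::<2>(acc)
        + f32x4_extract_lane::<3>(acc);
    // Handle the elements that did not fill a whole vector.
    for &x in chunks.remainder() {
        total += x;
    }
    total
}

/// Compiled for every other configuration; no SIMD instruction is emitted.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
pub fn sum(xs: &[f32]) -> f32 {
    xs.iter().sum()
}
```

Exactly one of the two bodies ends up in the final module, so which engines can load it is decided by the build configuration, not at runtime. Note that the two bodies add in a different order, so results may differ in the last bits for inputs that are not exactly representable.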
IIUC with the Also I wonder how in the presence of runtime feature detection compiler will handle functions which use SIMD intrinsics, but do not provide a non-SIMD fallback. Will it simply wrap every intrinsic with a
Yes, it's not UB which would cause memory corruption, but it's still a real pitfall. Is it possible to make use of |
On Thu, Jul 29, 2021 at 03:50:32PM -0700, Artyom Pavlov wrote:
>Wasm engines, forever and all of time, will reject modules they do not understand.
With the `features.supported` proposal engines will accept feature blocks which they do not understand, but instead of parsing them, they will replace the whole block with the `unreachable` instruction. It's effectively an analogue of the `ud2` instruction from x86. The question is now: is it allowed for safe Rust code to trigger "unreachable" instructions? AFAIK in the case of x86 the answer is no. Do we allow it for WASM?
As far as I know, WebAssembly's "unreachable" is more like
[`unreachable!`](https://doc.rust-lang.org/std/macro.unreachable.html)
(which is entirely safe), not like
[`unreachable_unchecked`](https://doc.rust-lang.org/std/hint/fn.unreachable_unchecked.html)
(which is unsafe and can cause UB).
|
Yeah one of the reasons to make target_feature unsafe was because some instruction sets might override the meaning of instructions or do other UB when they encounter instructions they don't support. On wasm, there is a safe guarantee that it'll trap so it's well defined. |
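As a loose analogy on the Rust side (a sketch, not a statement about any wasm engine): `unreachable!` is reachable from safe code and fails deterministically, which is the property being compared to a trap. `unreachable_is_caught` is a hypothetical helper, and catching the panic assumes a target that unwinds on panic.

```rust
use std::panic;

/// Runs a closure that hits `unreachable!` and reports whether the failure
/// surfaced as an ordinary, catchable panic. No `unsafe` is involved.
/// Assumes the target unwinds on panic (not `panic = "abort"`).
pub fn unreachable_is_caught() -> bool {
    let result = panic::catch_unwind(|| -> () {
        unreachable!("deterministic failure, comparable to a trap");
    });
    result.is_err()
}
```

`std::hint::unreachable_unchecked`, by contrast, is `unsafe` precisely because reaching it is undefined behavior.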
Recommended in rust-lang#74372
I'm opening this as a tracking issue for the SIMD intrinsics in the `{std,core}::arch::wasm32` module. Eventually we're going to want to stabilize these intrinsics for the WebAssembly target, so I think it's good to have a canonical place to talk about them! I'm also going to update the `#![unstable]` annotations to point to this issue to direct users here if they want to use these intrinsics.
The WebAssembly simd proposal is currently in "phase 3". I would say that we probably don't want to consider stabilizing these intrinsics until the proposal has at least reached "phase 4" where it's being standardized, because there are still changes to the proposal happening over time (small ones at this point, though). As a brief overview, the WebAssembly simd proposal adds a new type, `v128`, and a suite of instructions to perform data processing with this type. The intention is that this is readily portable to a lot of architectures so usage of SIMD can be fast in lots of places.
For rust stabilization purposes the code for all these intrinsics lives in the rust-lang/stdarch git repository. All code lives in `crates/core_arch/src/wasm32/simd128.rs`. I've got a large refactoring and sync queued up for that module, so I'm going to be writing this issue with the assumption that it will land mostly as designed there.
Currently the design principles for the SIMD intrinsics are:
* `memory_size`, `memory_grow` and `unreachable` intrinsics, most intrinsics are named after the instruction that it represents. There is generally a 1:1 mapping with new instructions added to WebAssembly and intrinsics in the module.
* `#[target_feature(enable = "simd128")]` which forces them all to be `unsafe`
* `v128.const` is exposed through a suite of `const` functions, one for each vector type (but not unsigned, just signed integers). Additionally the arguments are not actually required to be constant, so it's expected that the compiler will make the best choice about how to generate a runtime vector.
* `v8x16_shuffle` and `*_{extract,replace}_lane` use const generics to represent constant arguments. This is different from x86_64 which uses the older `#[rustc_args_required_const]` attribute.
* `v16x8`, `v32x4`, and `v64x2` as conveniences instead of only providing `v8x16_shuffle`. All of them are implemented in terms of the `v8x16.shuffle` instruction, however.
* `v128` type, not a type for each size of vector that intrinsics operate with
* `extract_lane` intrinsics return the value type associated with the intrinsic name, they do not all return `i32` unlike the actual WebAssembly instruction. This means that we do not have `extract_lane_s` and `extract_lane_u` intrinsics because the compiler will select the appropriate one depending on the context.
It's important to note that clang has an implementation of these intrinsics in the `wasm_simd128.h` header. The current design of the Rust `wasm32` module is different in that:
* `wasm_*` isn't used.
* `v128`, is exposed instead of types for each size/kind of vector
* `wasm_i16x8_load_8x8` and `wasm_u16x8_load_8x8` while Rust has `i16x8_load8x8_s` and `i16x8_load8x8_u`.
Most of these differences are largely stylistic, but there are some that are conveniences (like other forms of shuffles) which might be nice to expose in Rust as well. All the conveniences still compile down to one instruction, it's just different how users specify in code how the instruction is generated. I believe it should be possible for conveniences to live outside the standard library as well, however.
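The point that the wider shuffles are "implemented in terms of" the byte shuffle can be sketched with a scalar model. Both helpers below are hypothetical names for illustration, not standard library functions: a shuffle over 16-bit lanes is a byte shuffle whose indices have been expanded, with each lane index `i` becoming the byte pair `2*i, 2*i + 1`.

```rust
/// Expands eight 16-bit lane indices (0..16, spanning two input vectors)
/// into the sixteen byte indices a byte shuffle would take.
pub fn expand_i16x8_indices(lanes: [u8; 8]) -> [u8; 16] {
    let mut bytes = [0u8; 16];
    for (k, &lane) in lanes.iter().enumerate() {
        bytes[2 * k] = 2 * lane; // low byte of the selected lane
        bytes[2 * k + 1] = 2 * lane + 1; // high byte of the selected lane
    }
    bytes
}

/// Scalar model of a byte shuffle over two 16-byte vectors: indices 0..16
/// select from `a`, 16..32 select from `b`. Panics on indices >= 32.
pub fn shuffle_bytes(a: [u8; 16], b: [u8; 16], idx: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (k, &i) in idx.iter().enumerate() {
        out[k] = if i < 16 { a[i as usize] } else { b[(i - 16) as usize] };
    }
    out
}
```

Interleaving the low 16-bit lanes of two vectors, for instance, uses lane indices `[0, 8, 1, 9, 2, 10, 3, 11]`, which expand to a single byte-shuffle index list.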
How SIMD will be used
If the SIMD proposal were to move to stage 4 today I think we're in a really good spot for stabilization. #74320 is a pretty serious bug we will want to fix before full stabilization but I don't believe the fix will be hard to land in LLVM (I've already talked with some folks on that side).
Other than that SIMD-in-wasm is different from other platforms where a binary with SIMD will refuse to run on engines that do not have SIMD support. In that sense there is no runtime feature detection available to SIMD consumers. (at least not natively)
After rust-lang/stdarch#874 lands programs will simply use `#[target_feature(enable = "...")]` or `RUSTFLAGS` and everything should work. The SIMD intrinsics will always be exposed from the standard library (but the standard library itself will not use them) and available to users. If programs don't use the intrinsics then SIMD won't get emitted, otherwise when used the binary will use `v128`.
Open Questions
A set of things we'll need to settle on before stabilizing (and this will likely expand over time) is:
* `*_load_*` and `*_store_*` instructions. Primarily the instructions that load 64 bits (8x8, 16x4, ...) I'm unsure of on the types of their pointer arguments.
* `v8x16_shuffle` and lane management instructions.
* `i8x16_extract_lane_s` is ok (e.g. having `i8x16_extract_lane` returning `i8` is all we need), same for `i16x8`.
* `#[target_feature]` "requires unsafe" rules for these WebAssembly intrinsics. Intrinsics like `f32x4_splat` have no fundamental reason they need to be `unsafe`. The only reason they're unsafe is because `#[target_feature]` is used on them to ensure that SIMD instructions are generated in LLVM.
* `*_{any,all}_true` to returning a `bool`
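On the `extract_lane` question, a scalar sketch may help show why a typed return value can stand in for the `_s`/`_u` pair. All three functions are hypothetical models, not the real intrinsics: the two instruction variants differ only in how an 8-bit lane is widened to `i32`, and a Rust return type of `i8` already fixes that choice.

```rust
/// Model of the sign-extending variant: the lane is widened as a signed byte.
pub fn extract_lane_s_model(v: [u8; 16], lane: usize) -> i32 {
    v[lane] as i8 as i32
}

/// Model of the zero-extending variant: the lane is widened as an unsigned byte.
pub fn extract_lane_u_model(v: [u8; 16], lane: usize) -> i32 {
    v[lane] as i32
}

/// Model of a typed extract returning `i8`: callers widen explicitly, so no
/// separate `_s`/`_u` entry points are required.
pub fn extract_i8_model(v: [u8; 16], lane: usize) -> i8 {
    v[lane] as i8
}
```

With a lane holding `0xFF`, the signed model yields `-1` and the unsigned model yields `255`; the typed model yields `-1i8`, and casting that to `u8` recovers `255`.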