Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] Complete interp PackedSimd support and expand jiterp support #87903

Merged
merged 1 commit into from
Jul 1, 2023

Conversation

kg
Copy link
Member

@kg kg commented Jun 22, 2023

This PR completes interpreter support for WASM PackedSimd and adds jiterpreter support for most of the intrinsics. I had to make significant changes to how the intrinsics are defined and overhaul transform-simd's PackedSimd code to make it all work, so there's a lot of stuff in here that probably is wrong or needs to be adjusted.

Key points:

  • INTERP_WASM_SIMD_INTRINSIC_V_xxx defines a WASM-only interp simd intrinsic that is auto-generated to wrap a given compiler intrinsic. It also specifies the binary encoding of the wasm operation, which jiterp will use during code generation. The vast majority of the PackedSimd intrinsics are defined this way.
  • INTERP_WASM_SIMD_INTRINSIC_V_Cx defines a WASM-only interp simd intrinsic that is hand-written in C like the regular SIMD intrinsics. Most of these have special implementations in jiterp, and they tend to have weird calling conventions (like accepting constant parameter values or pointers).
  • A few of the intrinsics (Store and its siblings) have a void return type; I handle them as if they have a return type and I just never write to the dreg. This is probably incorrect, but it wasn't clear to me how it should actually work so I left it as-is.
  • If an intrinsic is overloaded we select the correct opcode/compiler intrinsic by looking at the type of its first parameter and matching it against all the table entries with a given name. For example, public static unsafe Vector128<float> LoadScalarVector128(float* address) will match the entry INTERP_WASM_SIMD_INTRINSIC_V_C1 (LoadScalarVector128, X4, interp_packedsimd_load32_zero, 0x5c) because the type criteria X4 means 'any 4-byte type'; static uint ExtractLane(Vector128<byte> value, [ConstantExpected(Max = (byte)(15))] byte index) will match INTERP_WASM_SIMD_INTRINSIC_V_C2 (ExtractLane, U1, interp_packedsimd_extractlane_u1, 0x16) due to the U1 type criteria.
  • If an intrinsic accepts a constant as a parameter, the interpreter does not enforce this (it's nontrivial to do so and provides no advantages), and we use a hand-written C implementation instead of a compiler intrinsic. However, the jiterpreter's implementation of these operations does full validation of the constants, and if it sees a non-constant argument value or an out-of-range value, it will log an error to the console and fall back to the interpreter's C implementation. This is more permissive than AOT but should provide good feedback for bad code along with near-optimal performance for correct code.
  • We don't have automated tests for PackedSimd, so I haven't been able to verify all of the intrinsics. I wrote some local tests for key ones and used those to verify that the interp version of the relevant intrinsics worked before adding jiterp support and verifying that.
  • In transform-simd we look up an intrinsic in the table by first binary searching for any entry with a matching name, then scanning forward and backward simultaneously to look for an entry that also has a matching type criteria. This algorithm probably still has a bugs in it somewhere. The alternative (doing it with a bunch of hand-written switch cases like for the other APIs) was not maintainable :(
  • Similarly, the table of PackedSimd intrinsics is sorted at runtime on first use (with atomics for thread safety) because maintaining a proper ASCII sort of the table entries by hand is error-prone.

Expands on #87719 (and currently contains its changes since it hasn't passed review; maybe this should just replace it.)

@kg kg added arch-wasm WebAssembly architecture area-Codegen-Interpreter-mono labels Jun 22, 2023
@ghost
Copy link

ghost commented Jun 22, 2023

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

This draft PR will add support for the missing PackedSimd intrinsics that require constants. Right now it only implements ExtractLane and ReplaceLane, and the jiterpreter doesn't handle the new opcodes.

Depends on #87719.

Author: kg
Assignees: -
Labels:

arch-wasm, area-Codegen-Interpreter-mono

Milestone: -

@ghost ghost assigned kg Jun 22, 2023
@kg kg force-pushed the wasm-packedsimd-3 branch from ba75803 to a3efabf Compare June 22, 2023 22:29
@kg
Copy link
Member Author

kg commented Jun 23, 2023

Revised based on feedback from Vlad. Much simpler now. Also added jiterp support.

@kg kg changed the title [wasm] Add interp support for SIMD intrinsics that accept constants [wasm] Complete interp PackedSimd support and expand jiterp support Jun 23, 2023
@kg kg marked this pull request as ready for review June 24, 2023 00:09
Custom PackedSimd.Shuffle in jiterp
Simplify definition of intrinsics with custom C implementations
Clean up dummy interp dregs
Many other commits squashed
@kg kg force-pushed the wasm-packedsimd-3 branch from da3f742 to 1b237a2 Compare July 1, 2023 08:10
@kg
Copy link
Member Author

kg commented Jul 1, 2023

Since the cla check hung and there is no option to re-run it I rebased the PR onto main to hopefully trigger the checks again.

@kg kg merged commit 0185099 into dotnet:main Jul 1, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants