This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Allow out-of-range lane indices in swizzle and shuffle instructions #11

Closed
stoklund opened this issue Apr 25, 2017 · 1 comment


@stoklund
Contributor

As part of #8 I had the opportunity to research the availability and performance of general-purpose shuffle instructions like pshufb. These instructions are widely available and behave like a v8x16.swizzle where the lane indices are provided as an i8x16 vector register instead of as immediate operands. Lanes with an out-of-range selector become 0 in the output vector.

The WebAssembly shuffle and swizzle instructions proposed in #1 can be extended to allow for immediate lane indices that are too large. The corresponding lanes in the output vector would be 0.

Having the possibility of zeroed lanes in the output makes it simpler to combine shuffle results with other vectors using v128.or.

We shouldn't add this feature without examples of code where it is useful.
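As one possible illustration of the zeroed-lane idea above, here is a scalar sketch in Python (not WebAssembly, and not taken from the proposal). The helper names `swizzle` and `v128_or` are hypothetical models of a byte swizzle with out-of-range indices producing 0 and of `v128.or`; vectors are modeled as lists of 16 byte values.

```python
def swizzle(src, indices):
    """Hypothetical scalar model: out[i] = src[indices[i]], or 0 if the index is out of range."""
    assert len(src) == 16 and len(indices) == 16
    return [src[i] if 0 <= i < 16 else 0 for i in indices]


def v128_or(a, b):
    """Lane-wise bitwise OR of two 16-byte vectors."""
    return [x | y for x, y in zip(a, b)]


a = list(range(0x10, 0x20))
b = list(range(0xA0, 0xB0))

# Even lanes come from a, odd lanes from b. The index 255 is out of range,
# so the corresponding lane is zeroed and does not disturb the OR.
idx_a = [i if i % 2 == 0 else 255 for i in range(16)]
idx_b = [i if i % 2 == 1 else 255 for i in range(16)]

merged = v128_or(swizzle(a, idx_a), swizzle(b, idx_b))
```

Because each output lane is nonzero in at most one of the two swizzle results, the OR acts as a lane-wise select without a separate mask.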

@gnzlbg
Contributor

gnzlbg commented Aug 8, 2018

PR #30 adds shuffleVar/permuteVar, which appear to be equivalent to swizzle (but it also implements them for all other vector lane combinations). Why was swizzle removed?

@dtig closed this as completed in #71 on Mar 27, 2019
dtig pushed a commit that referenced this issue Mar 27, 2019
This change adds a variable shuffle instruction to the SIMD proposal.

When indices are out of range, the result is specified as 0 for each
lane. This matches hardware behavior on ARM and RISC-V architectures.

On x86_64 and MIPS, the hardware provides instructions that can select 0
when the high bit is set to 1 (x86_64) or either of the two high bits is
set to 1 (MIPS). On these architectures, the backend is expected to emit
a pair of instructions, saturating add (saturate(x + (128 - 16)) for
x86_64) and permute, to emulate the proposed behavior.

To distinguish variable shuffles from immediate shuffles, the existing
v8x16.shuffle instruction is renamed to v8x16.shuffle2_imm to be
explicit about the fact that it shuffles two vectors with an immediate
argument.

This naming scheme allows for adding variants like v8x16.shuffle2 and
v8x16.shuffle1_imm in the future.

Fixes #68.
Contributes to #24.
Fixes #11.
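The x86_64 emulation described in the commit message (saturating add of 128 - 16, then permute) can be checked with a small scalar model. This is a sketch under stated assumptions, not the backend's actual code: `pshufb`, `add_sat_u8`, `swizzle_ref`, and `swizzle_x86` are hypothetical Python helpers, `pshufb` models only the rule that a set high bit selects 0 and otherwise the low four bits index the source, and the saturating add is assumed to be unsigned.

```python
def pshufb(src, sel):
    """Scalar model of an x86 byte shuffle: high bit set gives 0, else index by the low 4 bits."""
    return [0 if s & 0x80 else src[s & 0x0F] for s in sel]


def add_sat_u8(x, k):
    """Unsigned saturating add of constant k to each byte (assumed to be the saturating add meant)."""
    return [min(v + k, 255) for v in x]


def swizzle_ref(src, idx):
    """Proposed behavior: indices 0..15 select a lane, anything larger yields 0."""
    return [src[i] if i < 16 else 0 for i in idx]


def swizzle_x86(src, idx):
    """Emulation: bias indices by 128 - 16 so every index >= 16 ends up with its high bit set."""
    return pshufb(src, add_sat_u8(idx, 128 - 16))
```

In this model, indices 0..15 map to 112..127, which keeps the high bit clear and the low four bits unchanged. Every index from 16 upward lands at 128 or above (saturating at 255), so the permute zeroes that lane. A signed saturating add would not work here, since byte values of 128 and above would be treated as negative.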