This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Shuffle with immediate indices specification #30

Closed
wants to merge 11 commits into from
5 changes: 4 additions & 1 deletion proposals/simd/BinarySIMD.md
@@ -167,4 +167,7 @@ The `v8x16.shuffle2_imm` instruction has 16 bytes after `simdop`.
| `f64x2.convert_s/i64x2` | `0xb1`| - |
| `f64x2.convert_u/i64x2` | `0xb2`| - |
| `v8x16.shuffle1` | `0xc0`| - |
| `v8x16.shuffle2_imm` | `0xc1`| s:LaneIdx32[16] |
| `v8x16.shuffle2_imm` | `0xcc`| s:LaneIdx32[16] |
| `v16x8.shuffle2_imm` | `0xcd`| s:LaneIdx16[8] |
| `v32x4.shuffle2_imm` | `0xce`| s:LaneIdx8[4] |
| `v64x2.shuffle2_imm` | `0xcf`| s:LaneIdx4[2] |
52 changes: 52 additions & 0 deletions proposals/simd/SIMD.md
@@ -293,6 +293,30 @@ instruction is encoded with 16 bytes providing the indices of the elements to
return. The indices `i` in range `[0, 15]` select the `i`-th element of `a`. The
indices in range `[16, 31]` select the `i - 16`-th element of `b`.

* `v16x8.shuffle(a: v128, b: v128, imm: ImmLaneIdx16[8]) -> v128`

Returns a new vector with lanes selected from the lanes of the two input vectors
`a` and `b` specified in the 8 byte wide immediate mode operand `imm`. This
instruction is encoded with 8 bytes providing the indices of the elements to
return. The indices `i` in range `[0, 7]` select the `i`-th element of `a`. The
indices in range `[8, 15]` select the `i - 8`-th element of `b`.

* `v32x4.shuffle(a: v128, b: v128, imm: ImmLaneIdx8[4]) -> v128`

Returns a new vector with lanes selected from the lanes of the two input vectors
`a` and `b` specified in the 4 byte wide immediate mode operand `imm`. This
instruction is encoded with 4 bytes providing the indices of the elements to
return. The indices `i` in range `[0, 3]` select the `i`-th element of `a`. The
indices in range `[4, 7]` select the `i - 4`-th element of `b`.

* `v64x2.shuffle(a: v128, b: v128, imm: ImmLaneIdx4[2]) -> v128`

Returns a new vector with lanes selected from the lanes of the two input vectors
`a` and `b` specified in the 2 byte wide immediate mode operand `imm`. This
instruction is encoded with 2 bytes providing the indices of the elements to
return. The indices `i` in range `[0, 1]` select the `i`-th element of `a`. The
indices in range `[2, 3]` select the `i - 2`-th element of `b`.
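
The index convention shared by the variants above can be sketched in plain Python. This is only an illustrative model, not the specification's definition: vectors are represented as Python lists of lanes, and the function name `shuffle2_imm_model` is an assumption made for this example.

```python
def shuffle2_imm_model(a, b, imm):
    """Illustrative model of a two-input shuffle with immediate lane indices.

    a, b: lists with n lanes each (n is 16, 8, 4 or 2 for the shapes above).
    imm:  list of n indices in [0, 2n). An index i < n selects a[i];
          an index i >= n selects b[i - n].
    """
    n = len(a)
    if len(b) != n or len(imm) != n:
        raise ValueError("a, b and imm must have the same number of lanes")
    result = []
    for i in imm:
        if not 0 <= i < 2 * n:
            raise ValueError(f"lane index {i} is outside [0, {2 * n})")
        result.append(a[i] if i < n else b[i - n])
    return result


# Interleave the low halves of two 4-lane vectors (v32x4 shape).
interleaved = shuffle2_imm_model([10, 11, 12, 13], [20, 21, 22, 23], [0, 4, 1, 5])
```

With these inputs, `interleaved` is `[10, 20, 11, 21]`: indices 0 and 1 read from `a`, while 4 and 5 read lanes 0 and 1 of `b`.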

```python
def S.shuffle2_imm(a, b, s):
result = S.New()
@@ -775,3 +799,31 @@ Lane-wise saturating conversion from floating point to integer using the IEEE
resulting lane is 0. If the rounded integer value of a lane is outside the
range of the destination type, the result is saturated to the nearest
representable integer value.


## Reductions

There is no dedicated instruction for reductions.
Instead, a reduction can be built by combining shuffles with a lane-wise operation such as `add`, `min`, `max`, `and`, or `or`.

Here is an example that reduces `add` over an `f32x4` held in local 0:
```
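get_local 0
get_local 0
get_local 0
v64x2.shuffle 1 0 ;; swap the lower half of the vector with the upper half
f32x4.add ;; add the original vector to its swapped copy
tee_local 0 ;; keep the partial sums (this overwrites local 0)
get_local 0
get_local 0
v32x4.shuffle 1 0 3 2 ;; swap the first 2 elements together, and the last 2 elements together
f32x4.add
f32x4.extract_lane 0 ;; extract the first element, which now holds the total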
```

Here is an example that reduces `add` over an `f64x2` held in local 0:
```
get_local 0
get_local 0
get_local 0
v64x2.shuffle 1 0 ;; swap the lower half of the vector with the upper half
f64x2.add ;; add the original vector to its swapped copy
f64x2.extract_lane 0 ;; extract the first element, which now holds the total
```
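
The shuffle-and-combine pattern used for these reductions can be checked with a small Python model. This is a sketch under stated assumptions: vectors are plain lists, and `shuffle`, `lanewise` and `reduce_4_lanes` are hypothetical helper names rather than anything defined by the proposal. The 64-bit swap `v64x2.shuffle 1 0` is written as the equivalent 32-bit lane permutation `2 3 0 1`.

```python
import operator


def shuffle(a, b, imm):
    """Select lanes from a (index < n) or from b (index >= n)."""
    n = len(a)
    return [a[i] if i < n else b[i - n] for i in imm]


def lanewise(op, x, y):
    """Apply a binary operation lane by lane."""
    return [op(p, q) for p, q in zip(x, y)]


def reduce_4_lanes(op, v):
    """Reduce a 4-lane vector with op using two shuffle steps."""
    # Step 1: combine each lane with the lane two positions away
    # (the half swap, v64x2.shuffle 1 0 in the example above).
    s = lanewise(op, v, shuffle(v, v, [2, 3, 0, 1]))
    # Step 2: combine each lane with its neighbour
    # (v32x4.shuffle 1 0 3 2 in the example above).
    s = lanewise(op, s, shuffle(s, s, [1, 0, 3, 2]))
    # Every lane now holds the full reduction; return lane 0.
    return s[0]


total = reduce_4_lanes(operator.add, [1.0, 2.0, 3.0, 4.0])
smallest = reduce_4_lanes(min, [3.0, 1.0, 4.0, 2.0])
```

Note that the second step operates on the result of the first step, not on the original vector. For floating-point `add`, this pairwise order, `(v0 + v2) + (v1 + v3)`, may round differently from a left-to-right sum.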
5 changes: 4 additions & 1 deletion proposals/simd/TextSIMD.md
@@ -20,8 +20,11 @@ The canonical text format used for printing `v128.const` instructions is
v128.const i32x4 0xNNNNNNNN 0xNNNNNNNN 0xNNNNNNNN 0xNNNNNNNN
```

### v8x16.shuffle2_imm
### Shuffling using immediate indices

```
v8x16.shuffle2_imm i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5
v16x8.shuffle2_imm i4 i4 i4 i4 i4 i4 i4 i4
v32x4.shuffle2_imm i3 i3 i3 i3
v64x2.shuffle2_imm i2 i2
```
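
As a rough illustration of these text forms, the Python sketch below checks a list of lane indices and prints the instruction. It assumes that `i5`, `i4`, `i3` and `i2` denote unsigned indices of that many bits, which matches the `[0, 2n)` index ranges described for each shape. The helper name `format_shuffle2_imm` and the `LANES` table are hypothetical.

```python
# Number of lanes per shape; valid indices span both inputs, i.e. [0, 2 * lanes).
LANES = {"v8x16": 16, "v16x8": 8, "v32x4": 4, "v64x2": 2}


def format_shuffle2_imm(shape, indices):
    """Return the text form of a shuffle, or raise ValueError on bad indices."""
    lanes = LANES[shape]
    if len(indices) != lanes:
        raise ValueError(f"{shape} expects {lanes} indices, got {len(indices)}")
    limit = 2 * lanes  # assumption: i5 -> 32, i4 -> 16, i3 -> 8, i2 -> 4
    for i in indices:
        if not 0 <= i < limit:
            raise ValueError(f"index {i} is outside [0, {limit}) for {shape}")
    return f"{shape}.shuffle2_imm " + " ".join(str(i) for i in indices)


text = format_shuffle2_imm("v32x4", [1, 0, 3, 2])
```

Here `text` is `v32x4.shuffle2_imm 1 0 3 2`, while an index such as 8 is rejected for `v32x4` because it does not fit the assumed 3-bit range.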