This repository was archived by the owner on Dec 22, 2021. It is now read-only.
Introduction
This is a proposal to add a 64-bit variant of the existing ne instruction. It is motivated by the proposal to add a 64-bit variant of the eq instruction in #381 and the decision on #351 to keep the ne instructions. The only instruction set to natively support this instruction is AMD XOP, but on ARM64 and x86 (since SSE4.1) the lowering is no worse than for other ne forms.
Mapping to Common Instruction Sets
This section illustrates how the new WebAssembly instruction can be lowered on common instruction sets. These patterns are provided only for convenience; compliant WebAssembly implementations do not have to follow the same code generation patterns.
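As a reference point for the patterns that follow, the lane-wise semantics being lowered can be modeled in a few lines of Python. This is a minimal sketch, not text from the specification: the helper names i64x2_ne, i64x2_eq and v128_not are hypothetical, and a vector is modeled as a list of two integers, each treated as a 64-bit lane. It relies on the SIMD convention that a comparison produces an all-ones lane for true and an all-zeros lane for false.

```python
MASK64 = (1 << 64) - 1  # an all-ones 64-bit lane


def i64x2_eq(a, b):
    """Hypothetical model: lane-wise equality, all-ones where lanes match."""
    return [MASK64 if (x & MASK64) == (y & MASK64) else 0 for x, y in zip(a, b)]


def i64x2_ne(a, b):
    """Hypothetical model: lane-wise inequality, all-ones where lanes differ."""
    return [MASK64 if (x & MASK64) != (y & MASK64) else 0 for x, y in zip(a, b)]


def v128_not(v):
    """Bitwise NOT of every lane, truncated to 64 bits."""
    return [~x & MASK64 for x in v]
```

Under this model, i64x2_ne(a, b) equals v128_not(i64x2_eq(a, b)), which is why most of the lowerings are an equality comparison followed by a bitwise inversion.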
x86/x86-64 processors with AVX512F and AVX512VL instruction sets:
y = i64x2.ne(a, b) is lowered to VPCMPEQQ xmm_y, xmm_a, xmm_b + VPTERNLOGQ xmm_y, xmm_y, xmm_y, 0x55
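The immediate 0x55 acts as a bitwise NOT here, which may not be obvious at a glance. The sketch below models a single 64-bit lane of VPTERNLOGQ in Python, following the documented rule that each result bit is looked up in the 8-bit immediate using the three input bits (destination, first source, second source) as the index. The function name ternlog64 is hypothetical and exists only for illustration.

```python
MASK64 = (1 << 64) - 1


def ternlog64(dst, src1, src2, imm8):
    """Model one 64-bit lane of VPTERNLOGQ: imm8 is an 8-entry truth table."""
    result = 0
    for bit in range(64):
        # The three input bits form a 3-bit index into the truth table.
        index = (((dst >> bit) & 1) << 2) \
            | (((src1 >> bit) & 1) << 1) \
            | ((src2 >> bit) & 1)
        result |= ((imm8 >> index) & 1) << bit
    return result
```

With 0x55 (binary 01010101) the table entry is 1 exactly when the last input bit is 0, so the result is the inversion of the second source. Passing the same register for all three operands therefore inverts the equality mask in place.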
x86/x86-64 processors with XOP instruction set
y = i64x2.ne(a, b) is lowered to VPCOMNEQQ xmm_y, xmm_a, xmm_b
x86/x86-64 processors with AVX instruction set
y = i64x2.ne(a, b) is lowered to VPCMPEQQ xmm_y, xmm_a, xmm_b + VPXOR xmm_y, xmm_y, [wasm_i64x2_splat(-1)]
x86/x86-64 processors with SSE4.1 instruction set
y = i64x2.ne(a, b) is lowered to:
MOVDQA xmm_y, xmm_a
PCMPEQQ xmm_y, xmm_b
PXOR xmm_y, [wasm_i64x2_splat(-1)]
x86/x86-64 processors with SSE2 instruction set
y = i64x2.ne(a, b) is lowered to:
MOVDQA xmm_y, xmm_a
PCMPEQD xmm_y, xmm_b
PSHUFD xmm_tmp, xmm_y, 0xB1
PAND xmm_y, xmm_tmp
PXOR xmm_y, [wasm_i64x2_splat(-1)]
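SSE2 has no 64-bit equality comparison, so the sequence above builds one from 32-bit compares: a 64-bit lane is equal only if both of its 32-bit halves are equal. The Python sketch below walks through the same steps on a software model. All helper names (split, join, pcmpeqd, pshufd, i64x2_ne_sse2) are hypothetical, and a register is modeled as four 32-bit lanes in little-endian order, so lanes 0 and 1 form the first 64-bit lane.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def split(v64):
    """View two 64-bit lanes as four 32-bit lanes (low half first)."""
    out = []
    for x in v64:
        out += [x & MASK32, (x >> 32) & MASK32]
    return out


def join(v32):
    """View four 32-bit lanes as two 64-bit lanes."""
    return [v32[0] | (v32[1] << 32), v32[2] | (v32[3] << 32)]


def pcmpeqd(a, b):
    """32-bit lane-wise equality: all-ones where lanes match."""
    return [MASK32 if x == y else 0 for x, y in zip(a, b)]


def pshufd(v, imm8):
    """Each 2-bit field of imm8 selects the source lane for one result lane."""
    return [v[(imm8 >> (2 * i)) & 3] for i in range(4)]


def i64x2_ne_sse2(a, b):
    y = pcmpeqd(split(a), split(b))      # PCMPEQD: compare 32-bit halves
    tmp = pshufd(y, 0xB1)                # PSHUFD 0xB1: swap halves in each pair
    y = [p & q for p, q in zip(y, tmp)]  # PAND: equal only if both halves equal
    y = [p ^ MASK32 for p in y]          # PXOR with all-ones: invert to get ne
    return join(y)
```

For example, i64x2_ne_sse2([1 << 32, 5], [2 << 32, 5]) differs only in the upper half of the first lane, and the model reports the whole first lane as not equal while the second lane stays zero.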
ARM64 processors
y = i64x2.ne(a, b) is lowered to CMEQ Vy.2D, Va.2D, Vb.2D + MVN Vy.16B, Vy.16B
ARMv7 processors with NEON instruction set
y = i64x2.ne(a, b) is lowered to:
VCEQ.I32 Qy, Qa, Qb
VREV64.32 Qtmp, Qy
VAND Qy, Qtmp
VMVN Qy, Qy
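The ARMv7 sequence follows the same idea as the SSE2 one: VREV64.32 plays the role of PSHUFD with 0xB1, exchanging the two 32-bit halves of each 64-bit lane so that the following AND combines both half-comparisons. A small Python sketch of the two permutations, with hypothetical helper names and a register modeled as a list of four 32-bit lanes, makes the correspondence concrete.

```python
def vrev64_32(v):
    """Reverse the 32-bit elements inside each 64-bit doubleword."""
    return [v[1], v[0], v[3], v[2]]


def pshufd(v, imm8):
    """Each 2-bit field of imm8 selects the source lane for one result lane."""
    return [v[(imm8 >> (2 * i)) & 3] for i in range(4)]
```

Under this model, vrev64_32(v) and pshufd(v, 0xB1) produce the same lane order for any four-lane input.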