HW intrinsics API declaration is incorrect for Sse41.Insert() that operates on vector of 32-bit floats #10383
Labels
area-CodeGen-coreclr
CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
bug
The [V]INSERTPS operation differs from the similarly named operations that map to [V]PINSRW (SSE2+) and [V]PINSRB/D/Q (SSE4.1+).
Here's how it is declared in API:
In fact, the operation either loads a value from [m32] and merges it into the source XMM register at the specified position, or merges a selected 32-bit element of an XMM register (2nd operand) into the source XMM register (1st operand).
Additionally, it can zero some or all elements of the result.
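The behavior described above can be sketched as a plain scalar model in C. This is only an illustration, not the JIT's or the CPU's implementation: the `vec4f` type and the `insertps_reg` / `insertps_mem` helpers are hypothetical names introduced here. The sketch assumes the imm8 layout from Intel's instruction reference: bits 7:6 select the source element (register form only), bits 5:4 select the destination position, and bits 3:0 are the zero mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4 x float vector standing in for an XMM register. */
typedef struct { float e[4]; } vec4f;

/* Shared tail: place `value` at the position in imm8[5:4],
 * then zero every element whose bit is set in imm8[3:0]. */
static vec4f insertps_finish(vec4f dst, float value, uint8_t imm8)
{
    unsigned count_d = (imm8 >> 4) & 0x3u; /* destination position */
    unsigned zmask   = imm8 & 0xFu;        /* per-element zero mask */
    vec4f r = dst;

    r.e[count_d] = value;
    for (unsigned i = 0; i < 4; i++) {
        if (zmask & (1u << i)) {
            r.e[i] = 0.0f; /* zeroing is applied after the insert */
        }
    }
    return r;
}

/* Register form: the inserted value is element imm8[7:6] of `src`. */
vec4f insertps_reg(vec4f dst, vec4f src, uint8_t imm8)
{
    unsigned count_s = (imm8 >> 6) & 0x3u; /* source element select */
    return insertps_finish(dst, src.e[count_s], imm8);
}

/* Memory form: the inserted value is the single float at `m32`;
 * imm8[7:6] is ignored in this form. */
vec4f insertps_mem(vec4f dst, const float *m32, uint8_t imm8)
{
    return insertps_finish(dst, *m32, imm8);
}

/* Small self-check: copy element 2 of b into position 1 of a. */
int insertps_selfcheck(void)
{
    vec4f a = {{1.0f, 2.0f, 3.0f, 4.0f}};
    vec4f b = {{10.0f, 20.0f, 30.0f, 40.0f}};
    vec4f r = insertps_reg(a, b, 0x90); /* count_s = 2, count_d = 1 */
    assert(r.e[1] == 30.0f);
    return r.e[1] == 30.0f;
}
```

In this model a single scalar `float` argument can only express the memory form. The register form needs a second vector operand plus the full 8-bit immediate, which is the mismatch this issue is about.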
Here's how it is implemented in CPU:
INSERTPS (128-bit Legacy SSE version)
VINSERTPS (VEX.128 and EVEX encoded version)