Fix asm modifiers in add_dpbusd_epi32x2 implementations
The accumulator should be an earlyclobber because it is written before
all input operands are read. Otherwise, the asm code computes a wrong
result if the accumulator shares a register with one of the other input
operands (which happens if we pass in the same expression for the
accumulator and the operand).

Closes #4339

No functional change
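
The hazard described above can be sketched with ordinary integer registers. This is an illustration only, not code from the commit: `add_two` is a hypothetical helper, and the inline-asm path assumes GCC or Clang on x86. The first `addl` writes `acc` before the second `addl` reads `b`, so `acc` must not share a register with `b`. The `&` in `"+&r"` tells the compiler exactly that.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Stockfish): returns acc + a + b.
// The first instruction writes acc before b has been read, so acc is
// marked earlyclobber ("+&r"). With plain "+r" the compiler would be
// free to put acc and b in the same register when both hold the same
// value, and the second add would then read an already modified b.
inline std::int32_t add_two(std::int32_t acc, std::int32_t a, std::int32_t b) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    asm("addl %[a], %[acc]\n\t"   // acc += a  (acc written here)
        "addl %[b], %[acc]"       // acc += b  (b read only now)
        : [acc] "+&r"(acc)
        : [a] "r"(a), [b] "r"(b)
        : "cc");
#else
    // Portable fallback so the sketch compiles on other targets.
    acc += a;
    acc += b;
#endif
    return acc;
}
```

Calling `add_two(x, x, x)` passes the same expression for the accumulator and both operands, which is the situation the commit message describes. With the earlyclobber modifier the result is `3 * x` regardless of how registers are allocated.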
UniQP authored and vondele committed Jan 22, 2023
1 parent 3d2381d commit da5bcec
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions src/nnue/layers/simd.h
@@ -153,7 +153,7 @@ namespace Stockfish::Simd {
asm(
"vpdpbusd %[b0], %[a0], %[acc]\n\t"
"vpdpbusd %[b1], %[a1], %[acc]\n\t"
-: [acc]"+v"(acc)
+: [acc]"+&v"(acc)
: [a0]"v"(a0), [b0]"vm"(b0), [a1]"v"(a1), [b1]"vm"(b1)
);
# else
@@ -249,7 +249,7 @@ namespace Stockfish::Simd {
asm(
VNNI_PREFIX "vpdpbusd %[b0], %[a0], %[acc]\n\t"
VNNI_PREFIX "vpdpbusd %[b1], %[a1], %[acc]\n\t"
-: [acc]"+v"(acc)
+: [acc]"+&v"(acc)
: [a0]"v"(a0), [b0]"vm"(b0), [a1]"v"(a1), [b1]"vm"(b1)
);
# else
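
For readers unfamiliar with `vpdpbusd`, the following scalar model sketches what one 32-bit lane of the two-instruction sequence computes. It is a reference illustration, not code from the repository. The names `dpbusd_lane` and `dpbusd_lane_x2` are hypothetical, and the mapping of `a` to the unsigned bytes and `b` to the signed bytes follows the instruction's operand order as used in the asm above.

```cpp
#include <array>
#include <cstdint>

// Scalar model of one 32-bit lane of vpdpbusd (hypothetical name):
// multiply four unsigned bytes of a by the matching signed bytes of b,
// sum the four products, and add the sum to acc without saturation.
inline std::int32_t dpbusd_lane(std::int32_t acc,
                                const std::array<std::uint8_t, 4>& a,
                                const std::array<std::int8_t, 4>& b) {
    // Accumulate in unsigned arithmetic so overflow wraps like the hardware.
    std::uint32_t sum = static_cast<std::uint32_t>(acc);
    for (int i = 0; i < 4; ++i) {
        const std::int32_t product =
            static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
        sum += static_cast<std::uint32_t>(product);
    }
    return static_cast<std::int32_t>(sum);
}

// Model of the two back-to-back instructions: the second one consumes the
// accumulator produced by the first, which is why acc is written early.
inline std::int32_t dpbusd_lane_x2(std::int32_t acc,
                                   const std::array<std::uint8_t, 4>& a0,
                                   const std::array<std::int8_t, 4>& b0,
                                   const std::array<std::uint8_t, 4>& a1,
                                   const std::array<std::int8_t, 4>& b1) {
    return dpbusd_lane(dpbusd_lane(acc, a0, b0), a1, b1);
}
```

The model makes the data dependency visible: `acc` is updated after the first product sum, while `a1` and `b1` are still waiting to be read.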