-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Implement Sse2 memory fence instructions #16262
Conversation
src/jit/hwintrinsiccodegenxarch.cpp
Outdated
op2Reg = op2->gtRegNum; | ||
ival = Compiler::ivalOfHWIntrinsic(intrinsicID); | ||
emit->emitIns_SIMD_R_R_R_I(ins, emitTypeSize(TYP_SIMD16), targetReg, op1Reg, op2Reg, ival); | ||
|
||
break; | ||
} | ||
|
||
case NI_SSE2_LoadFence: | ||
case NI_SSE2_MemoryFence: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It seems better to just duplicate the asserts here and have the cases handled separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, it will be at least couple of cycles less
test Windows_NT x64 Checked jitincompletehwintrinsic test Windows_NT x86 Checked jitincompletehwintrinsic test Ubuntu x64 Checked jitincompletehwintrinsic test OSX10.12 x64 Checked jitincompletehwintrinsic |
test Windows_NT x86 Checked jitx86hwintrinsicnosimd |
@tannergooding The above build timed out. In the other build there are some unrelated errors. Could you have a look and verify what is going on. I see quite a bit of random test failures in several runs in #16237 and in this PR. |
|
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2 |
Failures here are due to |
@tannergooding @fiigii @CarolEidt PTAL The only failure is:
after which there other failures with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; I would like to see some minor changes, but would not hold up the merge for them.
@tannergooding could you review as well?
@@ -744,7 +744,7 @@ void CodeGen::genSSE2Intrinsic(GenTreeHWIntrinsic* node) | |||
regNumber targetReg = node->gtRegNum; | |||
var_types targetType = node->TypeGet(); | |||
var_types baseType = node->gtSIMDBaseType; | |||
instruction ins = Compiler::insOfHWIntrinsic(intrinsicID, baseType); | |||
instruction ins = INS_invalid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but from a stylistic perspective, since this is only used inside a subset of the switch cases, one might move the declaration down into those scopes.
@@ -736,6 +736,8 @@ GenTree* Compiler::impSSE2Intrinsic(NamedIntrinsic intrinsic, | |||
var_types baseType = TYP_UNKNOWN; | |||
var_types retType = TYP_UNKNOWN; | |||
|
|||
assert((simdSize == 16) || (simdSize == 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could use a comment; I think that whenever we do the "simdSize == 0" check it deserves a comment, e.g. "A simdSize of zero means that it is not a SIMD type" because it could be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SSE code added:
// The Prefetch and StoreFence intrinsics don't take any SIMD operands
// and have a simdSize of 0
// special case sfence and the prefetch instructions as they never take a VEX prefix | ||
if ((ins == INS_vzeroupper) || (ins == INS_sfence) || (ins == INS_prefetcht0) || (ins == INS_prefetcht1) || | ||
(ins == INS_prefetcht2) || (ins == INS_prefetchnta)) | ||
// special case (l|m|s)fence and the prefetch instructions as they never take a VEX prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We could probably just say special case the fencing and prefetch instructions...
@4creators, let us know if you are going to fix up in this PR, or in a follow up PR, so we know when to merge. |
@tannergooding I would prefer to fix this in next PR due to rerunning tests burden - I am working on all remaining Sse2 in #15777 right now and would prefer to fix it there. |
Thanks for letting us know. Merged. |
No description provided.