-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Updating the emitter to more generally handle 4-Byte SSE4 instructions. #16249
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM except that I can't figure out why you're not emitting an extra byte in emitOutputRR()
(probably something I've missed).
src/jit/emitxarch.cpp
Outdated
|
||
if ((id->idInsFmt() != IF_RWR_RRD_ARD) && (id->idInsFmt() != IF_RWR_RRD_ARD_CNS)) | ||
// encode source operand reg in 'vvvv' bits in 1's compliement form |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're here you could fix the typo: "compliement" should be "complement"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it here, and the 12 other places it occurred in this document.
src/jit/emitxarch.cpp
Outdated
@@ -10323,7 +10389,7 @@ BYTE* emitter::emitOutputRR(BYTE* dst, instrDesc* id) | |||
assert(IsAVXInstruction(ins) || IsSSE4Instruction(ins)); | |||
if ((code & 0xFF00) == 0xC000) | |||
{ | |||
dst += emitOutputByte(dst, (0xC0 | regCode)); | |||
dst += emitOutputWord(dst, code | (regCode << 8)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's probably something I'm missing here, but how did this go from emitting a byte to emitting a word?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can tell based on the comment above, this was trying to support the smaller encoding (which isn't supported anywhere else in the emitter).
I thought I had a comment indicating I was still looking at this in particular, but can't seem to find it anymore (maybe I just forget to submit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CarolEidt, I'm finishing validating locally, but it looks like this is a dead code path and is not currently hit (for emitOutputRR
). I can only speculate (based on the comment above this if block) that it was meant to support the smaller 2-byte VEX prefix scenario, which isn't actually working.
However, for emitOutputInstr
it special cases IF_RRW_RRW_CNS
, that does hit the equivalent code path, and that requires it to emitOutputWord
.
I believe the correct fix is to refactor all three cases where this particular code pattern is (emitOutputRR
, emitOutputRRR
, and IF_RRW_RRW_CNS
in emitOutputInstr
) to do the following:
// TODO-XArch-CQ: Right now support 4-byte opcode instructions only
if ((code & 0xFF00) == 0xC000)
{
dst += emitOutputWord(dst, code | (regCode << 8));
}
else if ((code & 0xFF) == 0x00)
{
// This case happens for SSE4/AVX instructions only
assert(IsAVXInstruction(ins) || IsSSE4Instruction(ins));
dst += emitOutputByte(dst, (code >> 8) & 0xFF);
dst += emitOutputByte(dst, (0xC0 | regCode));
}
else
{
dst += emitOutputWord(dst, code);
dst += emitOutputByte(dst, (0xC0 | regcode));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to the above. I validated that the original code path was never hit for emitOutputRR
and emitOutputRRR
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks for checking this out and cleaning it up.
test Windows_NT x64 Checked jitincompletehwintrinsic test Windows_NT x86 Checked jitincompletehwintrinsic test Ubuntu x64 Checked jitincompletehwintrinsic test OSX10.12 x64 Checked jitincompletehwintrinsic |
The following are a separate issue, tracked by https://github.com/dotnet/coreclr/issues/16236:
The following timed out and have been reset:
The following is due to an existing issue: #16249 (comment)
|
x64_checked_windows_nt_jitstressregs4 is not related. The I've logged a bug (https://github.com/dotnet/coreclr/issues/16286) and should have a fix up tonight. |
seems to be related to |
FYI. @CarolEidt, @fiigii, @eerhardt
This should mostly resolve https://github.com/dotnet/coreclr/issues/15908 and https://github.com/dotnet/coreclr/issues/16216, at least for the code paths currently being executed.