-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 'w' and 's' bit to xarch instruction flags. #61198
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsThis PR addresses #35305. ChangesChange encodes 'w' and 's' in the insFlags struct and INS_FLAG entry for xarch instruction table. In addition, DiscussionPlease see #35305 for an extended discussion on adding the 'w' and 's' bit as INS_FLAGS entries. There is one point that I need some expert knowledge on: whether or not we need
|
33a9ddb
to
cdd0673
Compare
@tannergooding PTAL the community PR. |
I believe the Linux x64 test Interop/PInvoke/Generics/GenericsTest/GenericsTest.sh currently failing on main (#61300) |
The changes generally LGTM. I still need to finish going through the instruction tables to validate the |
src/coreclr/jit/emitxarch.cpp
Outdated
// | ||
// Return Value: | ||
// true if instruction has the 's' bit, false otherwise | ||
bool emitter::HasSBit(instruction ins) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For s
in particular, I wonder if there needs to be a specialization of it...
We'll see that all variants have the w
bit; but only 2 variants has the s
bit (and have the same pattern). There is an alternative/shorter encoding that does immediate to AL, AX, or EAX
which doesn't use the s
bit.
Likewise if we look at the 64-bit mode variants:
There are similar cases for the w/s-bit and so "just the flag" doesn't look sufficient to know "does the bit exist and is it valid to set"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The annotations otherwise all look correct for where some encoding for an instruction uses the s
and or w
bits, respectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So do you think that we need a combination of looking at what form the instruction is in and whether it has the w/s bit to properly implement HasSBit
and HasWBit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's somewhat the concern, yes.
Basically, simply annotating Has_Sbit
and Has_Wbit
isn't enough and might cause later downstream bugs. It really is a few factors that determine if the S/W bit are present and if it means anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I understand.
I have to go back over the tables to understand how many cases there are that might actually arise, or if there is a general enough pattern that most of the logic can be coded in using the Has_Sbit
and Has_Wbit
+ checks in HasSbit
and HasWbit
can be enough, or whether we to specialize the flag encoding further?
Do you have any thoughts on the path forward? I notice that a lot of the cases that do not have s
and w
bit in 64-bit mode are when Rex.W
is set (which I originally thought indicates that 'w' bit is not needed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that "w" in the legend for "The value of bit W. in REX has no effect" is actually referring to this, and not the 'w' bit that we have been discussing...
because to my understanding, in 64-bit mode, the operands are always 64-bit for the call instruction and the REX.W can be 0 or 1, but it won't change the operand size for this particular instruction.
The "S" in the legend for "If the value of REX.W. is 1, it overrides the presence of 66H" is used like so
In other words, those 'S' and 'w' legends are not referring to the 's' and 'w' bits. So your initial understanding of "there is a 32-bit instruction form (taking potential operand override prefixes) by setting the lowest bit" is correct.
Shall I update the comments to make this more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that explains it.
It would be great if you could expand the comments on HasWBit
and HasSBit
to include an explanation and perhaps point to the table in the Intel manual. We could also consider something more descriptive such as HasRegularWideForm
/HasRegularWideImmediateForm
with a pointer to this being the w/s bits in the encodings from that section in the Intel manual. But I will leave this up to you -- I don't have a strong opinion as long as it is explained how to interpret the return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some more comments and renamed to HasRegularWideForm. Happy to adjust. Do the comments make it clear what is happening here with respect to the instruction needing the 'w' bit set IF it is in a form where the 'w' bit is present?
I think this is where further refactoring will help per.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the comments explain much. I would add something like:
// Many x86/x64 instructions follow a regular encoding scheme where the
// byte-sized version of an instruction has the lowest bit of the opcode cleared
// while the 32-bit version of the instruction (taking potential prefixes to
// override operand size) has the lowest bit set. This function returns true if
// the instruction follows this format.
// Note that this bit is called `w` in the encoding table in Section B.2 of
// Volume 2 of the Intel Architecture Software Developer Manual.
and
// As above, many instructions taking immediates have a regular form used to
// encode whether the instruction takes a sign-extended 1-byte immediate or a
// (in 64-bit sign-extended) 4-byte immediate, by respectively setting and
// clearing the second lowest bit. This bit is called the s bit in table B.2.
for the other one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've gone ahead and updated the comments. Mostly copied what you pasted into the format in the file. Does it look good?
Change encodes 'w' and 's' in the insFlags struct and INS_FLAG entry for xarch instruction table. In addition, `HasWBit` and `HasSBit` check if this flag is set for an instruction, which allows to start simplifying some of the various ad-hoc checks for these bits that were previously done per-instruction throughout emitxarch.cpp.
Co-authored-by: Tanner Gooding <tagoo@outlook.com>
bb4e629
to
c30663b
Compare
The CI asmdiffs leg shows no diffs: https://dev.azure.com/dnceng/public/_build/results?buildId=1499327&view=ms.vss-build-web.run-extensions-tab /azp run Antigen, Fuzzlyn |
/azp run Antigen, Fuzzlyn |
Azure Pipelines successfully started running 2 pipeline(s). |
The Fuzzlyn failures are known. @kunalspathak, can you look if the Antigen failures are known? |
Yes, all failures are known failures. |
src/coreclr/jit/emitxarch.cpp
Outdated
if ((size != EA_1BYTE) && (ins != INS_imul) && (ins != INS_bsf) && (ins != INS_bsr) && (!insIsCMOV(ins)) && | ||
!IsSSEInstruction(ins) && !IsAVXInstruction(ins)) | ||
// Anthony: I believe we may remove !insIsCMOV check, but leaving for comparison purposes with another | ||
// similar check below |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits: Seems like the TODO above is outdated with this change, remove it?
Also, I would prefer to not leave author names in comments and to write them in third person or plural first person instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have removed the todo and my comment.
src/coreclr/jit/emitxarch.cpp
Outdated
insIsCMOV(ins)) && | ||
size != EA_1BYTE) | ||
// Anthony: Per L10986, why is this insIsCMOV (which does not have a w bit) | ||
// and above is !insIsCMOV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say exactly why these checks are there, but since we currently do not generate cmov I would be fine with removing the checks and leave it up to future work to make sure the emitter supports them if we decide to start generating it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have gone ahead and removed the comments and the redundant cmov checks.
Thanks! |
This PR addresses #35305.
Changes
Change encodes 'w' and 's' in the insFlags struct and INS_FLAG entry for xarch instruction table. In addition,
HasWBit
andHasSBit
check if this flag is set for an instruction, which allows to start simplifying some of the various ad-hoc checks for these bits that were previously done per-instruction throughout emitxarch.cpp.Discussion
Please see #35305 for an extended discussion on adding the 'w' and 's' bit as INS_FLAGS entries.
There is one point that I need some expert knowledge on: whether or not we need
isInsCMOV
and!isInsCMOV
at two different points. I left comments in the code to draw attention to it: 33a9ddb#diff-6b4e0f32449f2f144e05699f59f74415a564693637f643084a896dfbd081830dR10984 and 33a9ddb#diff-6b4e0f32449f2f144e05699f59f74415a564693637f643084a896dfbd081830dR11449