-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#2440: Macros for creating conditional instructions #4500
Changes from 4 commits
14bfb4b
08ce2c4
3c13a13
16f30ba
4d8f379
8a9a574
feecbf1
e4cf5af
8195d88
f985439
c84871d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1777,6 +1777,34 @@ opnd_t | |
opnd_create_base_disp_aarch64(reg_id_t base_reg, reg_id_t index_reg, | ||
dr_extend_type_t extend_type, bool scaled, int disp, | ||
dr_opnd_flags_t flags, opnd_size_t size); | ||
DR_API | ||
/** | ||
* Creates an unsigned immediate integer operand for condition defined by enum | ||
* constant \p cond of type #dr_pred_type_t. | ||
* \note AArch64-only. | ||
*/ | ||
opnd_t | ||
opnd_create_cond(int cond); | ||
|
||
DR_API | ||
/** | ||
* Assumes the operand is an unsigned immediate 4-bit integer and returns a | ||
* #dr_pred_type_t constant corresponding to the value of the operand. | ||
* \note AArch64-only. | ||
*/ | ||
int | ||
opnd_get_cond(opnd_t opnd); | ||
|
||
DR_API | ||
/** | ||
* Assumes the operand is an unsigned immediate 4-bit integer and returns an | ||
* operand which contains inverted condition (see #dr_pred_type_t). | ||
* \note Does not support AL and NV conditions. | ||
* \note AArch64-only. | ||
*/ | ||
opnd_t | ||
opnd_invert_cond(opnd_t opnd); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this needs some thought as to how to integrate with existing interfaces and code, especially with ARM code. The API already has The API also already has predicates passed as dr_pred_type_t to things like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would be a very big change to move all the condition-related interfaces that currently exist on the level of instructions to the level of operands. I don't think I know enough about other platforms to come up with a uniform solution, and I don't know if such change actually makes sense for x86 (after all Anyway, I'd like to do it in steps rather than in a single big pull request which is very difficult to merge. At least some guidance, suggestions or ideas would be helpful. I would start with #4502, but again as a separate patch. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right now I could remove There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See https://dynamorio.org/dynamorio_docs/API_BT.html#sec_predication which lists one key current use of predicates/conditions:
Another current use, similar to the macros added here:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Why only these instructions? All conditional instructions in AArch64 (all mentioned in this PR) have Also, in my opinion, inventing non-documented opcodes may create extra confusion. If some notion is not common to the rest of the architectures it's not the reason to disallow low-level access as documented. I think in DynamoRIO, when we work at the level of instructions and operands, we don't really have the benefit of "write once run everywhere". In my understanding, the point of the tool is to allow working on the lowest possible level. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
How about compare and swap instructions (e.g. CAS)? The behaviour depends on the value of the register being equal to the value loaded from memory. It is similar to the conditional instructions, although in the latter case the comparison has happened before. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might be useful to think of the pseudocode that specifies instruction behaviors. CCMP is specified by one pseudocode function, and the operands are inputs to that function. That single behavior function acts uniformly on all values of the operands. So it makes sense to treat it as one instruction, rather than single out one operand and split it into one instruction for each value of that operand. The situation with SYS instructions is very different (similar to CDP instructions in 32-bit Arm) - they really are separate functions, selected by a function code, and it does make much more sense to split them out, at least when their specific functions are being differently handled by DynamoRIO. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If I understand correctly we have 3 choices: 1 - Create a new operand for 1 requires a non-trivial amount of work for a type which has essentially the same intent as 2 is not consistent with AArch64's existing codec. We may run into problems as I did in #4386 3 would be my choice. It makes use of an existing type with the same intent which has already been implemented and does not involve a lot of codec reworking.
Which users can call with:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @AssadHashmi highlights what is perhaps the main issue here: using a simple and consistent set of abstractions and types in the IR. If we take each encoded behavior immediate integer that AArch64 presents as an operand (but I would call a sub-opcode and another ISA might well make part of the opcode, such as x86 conditional moves) and place it as a separate operand in DR's IR, we then need to make a new operand type with a new set of rules and enums and handling code and introduce a new concept to tool writers for what could logically fit into existing concepts. We could end up with a dozen new operand types without much benefit but incurring non-trivial costs. Using an existing IR concept keeps the IR simpler and cleaner and more consistent.
We can make it as easy as possible, though. It does make a difference: each IR choice that lets a tool or analysis library be more platform-agnostic makes tool writing easier.
None of any of the discussion here has any bearing on whether a tool writer can generate any precise machine code desired: we're only talking about how to represent ISA concepts in the DR IR. None of the choices affect whether the ISA concept can be represented or whether the tool writer can generate it. And while the IR should be more abstracted and cross-platform, we can and do have layers that are closer to the assembly conventions of that ISA: the INSTR_CREATE_ macros try to look more like assembly, disassembly has both an abstracted DR format and an ISA format, and we'd love to have an even closer layer where assembler strings are converted into IR (can't seem to find it in the tracker...#4510) when writing code for a specific machine. But code analyzing and manipulating at the IR level should have as few types and special operand concepts as possible. |
||
|
||
#endif | ||
|
||
DR_API | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -743,7 +743,85 @@ opnd_create_base_disp_aarch64(reg_id_t base_reg, reg_id_t index_reg, | |
CLIENT_ASSERT(false, "opnd_create_base_disp_aarch64: invalid extend type"); | ||
return opnd; | ||
} | ||
#endif | ||
|
||
opnd_t | ||
opnd_create_cond(int cond) | ||
{ | ||
/* FIXME i#1569: Move definition of dr_pred_type_t to opnd.h for AArch64 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please create an issue separate from the AArch64 master issue #1569 to track this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Created: #4502. |
||
* to avoid using int instead of dr_pred_type_t */ | ||
switch (cond) { | ||
case DR_PRED_EQ: return opnd_create_immed_uint(0b0000, OPSZ_4b); | ||
case DR_PRED_NE: return opnd_create_immed_uint(0b0001, OPSZ_4b); | ||
case DR_PRED_CS: return opnd_create_immed_uint(0b0010, OPSZ_4b); | ||
case DR_PRED_CC: return opnd_create_immed_uint(0b0011, OPSZ_4b); | ||
case DR_PRED_MI: return opnd_create_immed_uint(0b0100, OPSZ_4b); | ||
case DR_PRED_PL: return opnd_create_immed_uint(0b0101, OPSZ_4b); | ||
case DR_PRED_VS: return opnd_create_immed_uint(0b0110, OPSZ_4b); | ||
case DR_PRED_VC: return opnd_create_immed_uint(0b0111, OPSZ_4b); | ||
case DR_PRED_HI: return opnd_create_immed_uint(0b1000, OPSZ_4b); | ||
case DR_PRED_LS: return opnd_create_immed_uint(0b1001, OPSZ_4b); | ||
case DR_PRED_GE: return opnd_create_immed_uint(0b1010, OPSZ_4b); | ||
case DR_PRED_LT: return opnd_create_immed_uint(0b1011, OPSZ_4b); | ||
case DR_PRED_GT: return opnd_create_immed_uint(0b1100, OPSZ_4b); | ||
case DR_PRED_LE: return opnd_create_immed_uint(0b1101, OPSZ_4b); | ||
case DR_PRED_AL: return opnd_create_immed_uint(0b1110, OPSZ_4b); | ||
case DR_PRED_NV: return opnd_create_immed_uint(0b1111, OPSZ_4b); | ||
default: | ||
CLIENT_ASSERT(false, "invalid condition constant"); | ||
return opnd_create_null(); | ||
} | ||
} | ||
|
||
int | ||
opnd_get_cond(opnd_t opnd) | ||
{ | ||
/* FIXME i#1569: Move definition of dr_pred_type_t to opnd.h for AArch64 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As above. |
||
* to avoid using int instead of dr_pred_type_t */ | ||
switch (((uint64)opnd_get_immed_int(opnd) & 0xfu)) { | ||
case 0b0000: return DR_PRED_EQ; | ||
case 0b0001: return DR_PRED_NE; | ||
case 0b0010: return DR_PRED_CS; | ||
case 0b0011: return DR_PRED_CC; | ||
case 0b0100: return DR_PRED_MI; | ||
case 0b0101: return DR_PRED_PL; | ||
case 0b0110: return DR_PRED_VS; | ||
case 0b0111: return DR_PRED_VC; | ||
case 0b1000: return DR_PRED_HI; | ||
case 0b1001: return DR_PRED_LS; | ||
case 0b1010: return DR_PRED_GE; | ||
case 0b1011: return DR_PRED_LT; | ||
case 0b1100: return DR_PRED_GT; | ||
case 0b1101: return DR_PRED_LE; | ||
case 0b1110: return DR_PRED_AL; | ||
case 0b1111: return DR_PRED_NV; | ||
} | ||
return DR_PRED_NONE; | ||
} | ||
|
||
opnd_t | ||
opnd_invert_cond(opnd_t opnd) | ||
{ | ||
switch (opnd_get_cond(opnd)) { | ||
case DR_PRED_EQ: return opnd_create_cond(DR_PRED_NE); | ||
case DR_PRED_NE: return opnd_create_cond(DR_PRED_EQ); | ||
case DR_PRED_CS: return opnd_create_cond(DR_PRED_CC); | ||
case DR_PRED_CC: return opnd_create_cond(DR_PRED_CS); | ||
case DR_PRED_MI: return opnd_create_cond(DR_PRED_PL); | ||
case DR_PRED_PL: return opnd_create_cond(DR_PRED_MI); | ||
case DR_PRED_VS: return opnd_create_cond(DR_PRED_VC); | ||
case DR_PRED_VC: return opnd_create_cond(DR_PRED_VS); | ||
case DR_PRED_HI: return opnd_create_cond(DR_PRED_LS); | ||
case DR_PRED_LS: return opnd_create_cond(DR_PRED_HI); | ||
case DR_PRED_GE: return opnd_create_cond(DR_PRED_LT); | ||
case DR_PRED_LT: return opnd_create_cond(DR_PRED_GE); | ||
case DR_PRED_GT: return opnd_create_cond(DR_PRED_LE); | ||
case DR_PRED_LE: return opnd_create_cond(DR_PRED_GT); | ||
default: | ||
CLIENT_ASSERT(false, "invalid condition constant"); | ||
return opnd_create_null(); | ||
} | ||
} | ||
#endif /* AARCH64 */ | ||
|
||
#undef opnd_get_base | ||
#undef opnd_get_disp | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really need to be created with the
OPSZ_5b
size? Could we change that and have the encoder handle any declared size? Unless there are multiple encoding variants that differ in immediate sizes, the encoder shouldn't care so long as the value fits in 5 bits? It would be best for usability to not require anyone to specify all these precise bit counts. If a bit count does have to specified, there should be a named constant likeOPSZ_ccmn
or sthg like x86 has forOPSZ_xlat
,OPSZ_lgdt
, etc., and in addition maybe this macro should take just the integer and supply the size or set the size of the opnd before passing it on or sthg.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these instructions we need an unsigned integer immediate operand. The existing API for creating such operands (
opnd_create_immed_uint
) requires to specify anopnd_size_t
value. How do you suggest to create an operand without usingopnd_create_immed_uint
? Or do you suggest to changeopnd_create_immed_uint
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allow any size. Or additionally maybe add
opnd_create_immed_uint_unsized
and internally it sets OPSZ_4 or sthg and the a64 encoder ignores the size since it has no encoding decisions based on size.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it is helpful to give an example of how to create a correct operand. It would be better if this macro took just an integer value, but most of
INSTR_CREATE_
macros accept operands. I can removeOPSZ_5b
from the doc string.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM. The other suggestions (like an _unsized version) are separate/larger scope from this PR. I'm pretty sure the a64 encoder today ignores the size completely, so the user could pass OPSZ_NA or sthg.