Set and fix opcode enums on AArch64 v8.0 for all future releases #5144

Closed
AssadHashmi opened this issue Oct 5, 2021 · 3 comments

Comments

@AssadHashmi
Contributor

The current codec generator for AArch64 does not fix the enum values of OP_s, see discussion in #5115.
This issue will track work to set and fix opcode enum values, initially for v8.0.

@AssadHashmi
Contributor Author

Oops! The two queries I posted on #5115 should have been posted here. See:
#5115 (comment)
#5115 (comment)

@derekbruening
Contributor

> @derekbruening @abhinav92003 Is there any reason why we can't use OP_UNDECODED rather than OP_xx on AArch64?
> I don't know the history of OP_xx and if we can we should replace it.

OP_UNDECODED is used for "level 0" or "level 1" decodings on x86 where we have done a fast incomplete decoding pass and just haven't spent enough time to deduce the actual opcodes (or for level 0 it's a bundle of multiple instructions).

OP_xx is used for fully decoded encodings that we are unable to get more information about: more decoding effort will not yield further information. They may well be valid but are not yet known to the unfinished decoder implementation.

There are functions that assume if they see OP_UNDECODED they should try to decode further. So it is probably best to keep them separate.
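
The distinction can be sketched roughly as below. This is only an illustration: the `SKETCH_` enum, its values, and `sketch_needs_further_decode` are hypothetical stand-ins, not DynamoRIO's actual definitions or API.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; these are not DynamoRIO's
 * real opcode definitions or values. */
typedef enum {
    SKETCH_OP_INVALID = 0,
    SKETCH_OP_UNDECODED = 1, /* fast, incomplete pass: opcode not yet deduced */
    SKETCH_OP_XX = 2,        /* fully decoded, but unknown to the decoder */
    SKETCH_OP_ADD = 3
} sketch_opcode_t;

/* Returns true only when another decoding pass could yield more information.
 * SKETCH_OP_XX deliberately returns false: more effort will not help. */
static bool
sketch_needs_further_decode(sketch_opcode_t op)
{
    return op == SKETCH_OP_UNDECODED;
}
```

If the two values were merged, a check like this could no longer tell "not decoded yet" apart from "decoded as far as possible", which is the reason for keeping them separate.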

@derekbruening
Contributor

derekbruening commented Oct 6, 2021

> @derekbruening @abhinav92003 What are the (example) use-cases which require that opcodes' enums are the same across DynamoRIO releases?

Binary compatibility: I have a client library built against 9.0 and I run it with 9.1. It will break in weird ways if the integers passed as enums now mean something completely different. We would prefer to maintain as much binary compatibility as possible; IMHO it is painful to have library interfaces change all the time. Imagine if libc changed every minor release: an update to your system upgrades libc and suddenly all your own binaries crash in strange ways.
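
A minimal sketch of that failure mode, assuming a made-up pair of opcode numberings (the `OLD_`/`NEW_` enums, the inserted `addg` entry, and `new_runtime_opcode_name` are all hypothetical and do not reflect the real 9.0 or 9.1 enums):

```c
#include <stddef.h>

/* Numbering that a hypothetical client library was compiled against. */
enum { OLD_OP_ADD = 0, OLD_OP_SUB = 1, OLD_OP_COUNT };

/* Numbering in a hypothetical newer runtime where a new opcode was inserted
 * in the middle, silently shifting the value of SUB from 1 to 2. */
enum { NEW_OP_ADD = 0, NEW_OP_ADDG = 1, NEW_OP_SUB = 2, NEW_OP_COUNT };

static const char *const new_runtime_names[] = { "add", "addg", "sub" };

/* How the newer runtime would interpret an integer opcode it is handed. */
static const char *
new_runtime_opcode_name(int op)
{
    if (op < 0 || op >= NEW_OP_COUNT)
        return "<invalid>";
    return new_runtime_names[op];
}
```

Under these assumptions, a client that passes its compiled-in `OLD_OP_SUB` (1) to the newer runtime is understood to mean `addg`, with no error at build or load time.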

> On AArch64 we have the same opcode name for integer, fixed-width vector and scalable vector versions, e.g. ADD:
>
> v8.0 ADD <Xd|SP>, <Xn|SP>, <R><m>{, <extend> {#<amount>}}
> v8.0,v8.2 ADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>
> SVE ADD <Zdn>.<T>, <Pg>/M, <Zdn>.<T>, <Zm>.<T>
>
> Do we want the enum to be the same all the way through?

If it makes things easier, splitting the opcodes with OP_add_<foo> suffixes or something similar is a step we have taken in the past. Our x86 opcodes do not strictly match the ISA: we split MOV into several different pieces (e.g., immediates vs loads vs stores vs debug regs) to improve encoding performance.
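
For illustration, a suffix split with pinned values might look like the sketch below. The `SPLIT_` names, the numeric values, and the helper function are assumptions made up for this example; they are not the enum DynamoRIO ended up with.

```c
/* Hypothetical names and values for illustration; not DynamoRIO's real enum.
 * Each value is written out explicitly so that adding opcodes later cannot
 * renumber the existing ones. */
typedef enum {
    SPLIT_OP_add = 10,        /* integer form */
    SPLIT_OP_add_vector = 11, /* fixed-width vector form */
    SPLIT_OP_add_sve = 12     /* scalable vector form */
} split_opcode_t;

typedef enum {
    SPLIT_FORM_INTEGER,
    SPLIT_FORM_VECTOR,
    SPLIT_FORM_SVE
} split_form_t;

/* Maps the operand form of an ADD to its split opcode. */
static split_opcode_t
split_add_opcode_for(split_form_t form)
{
    switch (form) {
    case SPLIT_FORM_VECTOR: return SPLIT_OP_add_vector;
    case SPLIT_FORM_SVE: return SPLIT_OP_add_sve;
    case SPLIT_FORM_INTEGER:
    default: return SPLIT_OP_add;
    }
}
```

Whether to split at all is the open question in the quoted comment; the sketch only shows that a split and fixed numbering are compatible with each other.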
