-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#2626: AArch64 v8.0 decode: Reorder instrs in codec.txt #5115
Conversation
This patch changes the instrs in codec.txt to be alphabetically ordered rather than grouped by semi-random categories. It also introduces codecsort.py which can manage and enforce this ordering. Issue: #2626 Change-Id: I2d8769602d837d2e02acad820bf78e1b83d10622
Change-Id: If9a2684b2c16ea510a008b6ace8d72c28a75759d
I'm curious why we need to keep it ordered. I rather liked the previous structure where we grouped the opcodes like "Advanced SIMD", "Floating-point" etc. I was searching for advanced SIMD opcodes and found that those convenient comments were no longer present. |
The existing groupings were not consistent and had become confused. The alphanumeric ordering is to make it easier to look for and add specific instructions during high rates of What would be the preferred layout? Possible options are:
How coarse grained do we want the groupings? The coarsest is INT, FP and MEM, or something like the above with higher resolution. |
Does changing the ordering of codec.txt change the exported interface enum of the opcodes? If so, that's a compatibility break, and we would want to be careful about that. A compatibility message needs to be added to the changelog in release.dox vs the 8.0 release 4/21/20, and between releases it's still best to avoid such changes (adds pain to user experience if every cronbuild changes the enum order). |
No, the exported enum of opcodes has always been and will remain numbering in alphabetical order of opcode name, independent of ordering in |
So every opcode addition to codec.txt changes the interface: this may cause problems. Presumably binary compatibility has already been broken with the 8.0 release, and if we put out another release, it will immediately break on the next change. I think this alphabetizing may need to be reconsidered especially long-term. |
How did x86 handle a fixed opcode<->enum mapping during development? |
IIRC the initial set of pre-SSE insructions was in place before any public interface. Every set of additional opcodes added has appended to the end to avoid breaking compatibility: mostly as sets grouped by ISA feature (SSE2, AVX, etc.) but even some from prior sets that were accidentally missed were appended such as at https://github.com/DynamoRIO/dynamorio/blob/master/core/ir/x86/decode_table.c#L1091 |
Given a number of recent users hitting problems that might have been avoided with a more recent build, and with the last official release 8.0 from a full 18 months ago (8.0 was Apr 21, 2020), we're thinking we should put out a new official release. IMHO it would be best to not change the opcode ordering across that release -- so the proposal is to change the codec to freeze the current ordering and append new opcodes after the freeze. Does that sound reasonable, and is that something that could be done in the next week or two and then we could put out a new release once that's in place? |
Do you mean we should not make any changes, freezing the current |
I think we're ok breaking compatibility with 8.0 given the other changes we also have there. The proposal is to freeze the OP_ enum ordering at the point of the new release forever and only append to it afterward. A complication might be the opcode splitting in #4388, #4386 (comment), #4393. But if those are implemented after the new release I think we would just live with any newly split opcodes being appended. |
Understood. We're thinking of adding an index number to encoding definitions in |
@derekbruening @abhinav92003 Is there any reason why we can't use |
@derekbruening @abhinav92003 What are the (example) use-cases which require that opcodes' enums are the same across DynamoRIO releases? I'm thinking of opcodes which are augmented as new versions of the ISA are released, e.g. for MMX, SSE and AVX on Intel x86, and for v8.0, v8.2 and SVE on AArch64. On AArch64 we have the same opcode name for integer, fixed-width vector and scalable vector versions, e.g.
Do we want the enum to be the same all the way through? |
I should have posted these questions on the relevant issue #5144 |
This patch changes the instrs in codec.txt to be alphabetically
ordered rather than grouped by semi-random categories. It
also introduces codecsort.py which can manage and enforce this
ordering.
Issue: #2626