i#2626: AArch64 v8.0 decode: Reorder instrs in codec.txt #5115

joshua-warburton · 2021-09-21T15:03:07Z

This patch changes the instrs in codec.txt to be alphabetically
ordered rather than grouped by semi-random categories. It
also introduces codecsort.py which can manage and enforce this
ordering.

Issue: #2626

Change-Id: I2d8769602d837d2e02acad820bf78e1b83d10622

Change-Id: If9a2684b2c16ea510a008b6ace8d72c28a75759d
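The description above says codecsort.py can "manage and enforce" alphabetical ordering. The script itself is not shown in this thread, so the following is only a minimal sketch of what such a check might look like. It assumes a hypothetical line layout in which the opcode name is the second whitespace-separated field and `#` starts a comment; the bit patterns in `sample` are made up, and the function names are illustrative rather than taken from the real script.

```python
def opcode_of(line):
    """Return the opcode name from a definition line, or None for blanks/comments.

    Assumed (hypothetical) layout: "<bit-pattern> <opcode> <operands...>".
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    return fields[1] if len(fields) > 1 else None


def sort_definitions(lines):
    """Return only the definition lines, sorted by opcode name.

    sorted() is stable, so several encodings of one opcode keep their
    relative order.
    """
    defs = [line for line in lines if opcode_of(line) is not None]
    return sorted(defs, key=opcode_of)


def is_sorted(lines):
    """True if the definition lines already appear in opcode-name order."""
    names = [opcode_of(line) for line in lines if opcode_of(line) is not None]
    return names == sorted(names)


# Made-up patterns, for illustration only.
sample = [
    "# illustrative definitions, not real encodings",
    "x0011010000xxxxx000000xxxxxxxxxx adc  wx0 : wx5 wx16",
    "x1111010000xxxxx000000xxxxxxxxxx sbcs wx0 : wx5 wx16",
    "x0001011000xxxxxxxxxxxxxxxxxxxxx add  wx0 : wx5 wx16",
]

if __name__ == "__main__":
    print("already sorted:", is_sorted(sample))
    for line in sort_definitions(sample):
        print(line)
```

A check like `is_sorted` could run in a presubmit step to enforce the ordering, while `sort_definitions` would be the "manage" half that rewrites the file.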

abhinav92003 · 2021-09-24T12:25:39Z

I'm curious why we need to keep it ordered. I rather liked the previous structure where we grouped the opcodes like "Advanced SIMD", "Floating-point" etc. I was searching for advanced SIMD opcodes and found that those convenient comments were no longer present.

AssadHashmi · 2021-09-27T13:23:02Z

The existing groupings were not consistent and had become confused. The alphanumeric ordering is to make it easier to look for and add specific instructions during high rates of codec.txt churn, rather than browse groups of similar instructions in a stable codec.txt. This file is going to get bigger as we complete v8.0 support, and for the purposes of quicker completion we decided to mirror the ordering in the reference manual. Once v8.0 is complete we can re-organise a stable codec.txt along different, more browsable criteria.

What would be the preferred layout? Possible options are:

1. Leave it in alphanumeric order mirroring the reference manual.
2. Leave it in alphanumeric order and have group comment tags at the end of each definition, e.g.

# COND
# INT GPR
# INT MEM GPR
# FP MEM SCALAR
# FP MEM VECTOR
. . .

3. Rearrange by the above groupings, each group having its own alphanumeric ordering.

How coarse grained do we want the groupings? The coarsest is INT, FP and MEM, or something like the above with higher resolution.

derekbruening · 2021-09-27T14:46:49Z

Does changing the ordering of codec.txt change the exported interface enum of the opcodes? If so, that's a compatibility break, and we would want to be careful about that. A compatibility message needs to be added to the changelog in release.dox vs the 8.0 release 4/21/20, and between releases it's still best to avoid such changes (adds pain to user experience if every cronbuild changes the enum order).

AssadHashmi · 2021-09-27T16:27:49Z

> Does changing the ordering of codec.txt change the exported interface enum of the opcodes?

No, the exported enum of opcodes has always been, and will remain, numbered in alphabetical order of opcode name, independent of the ordering in codec.txt.

derekbruening · 2021-09-27T16:32:07Z

> No, the exported enum of opcodes has always been, and will remain, numbered in alphabetical order of opcode name, independent of the ordering in codec.txt.

So every opcode addition to codec.txt changes the interface: this may cause problems. Presumably binary compatibility has already been broken with the 8.0 release, and if we put out another release, it will immediately break on the next change. I think this alphabetizing may need to be reconsidered especially long-term.
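The compatibility concern can be illustrated with a small sketch. It is not DynamoRIO's generator: the opcode names, the 1-based numbering, and both helper functions are assumptions chosen for illustration. It contrasts numbering by alphabetical position with an append-only scheme in which existing values are never reassigned.

```python
def alphabetical_enum(names):
    """Number opcodes by their position in sorted order (1-based, an assumption)."""
    return {name: i for i, name in enumerate(sorted(names), start=1)}


def append_only_enum(frozen, new_names):
    """Keep every existing value and give new opcodes the next free numbers."""
    mapping = dict(frozen)
    next_value = max(mapping.values(), default=0) + 1
    for name in new_names:
        if name not in mapping:
            mapping[name] = next_value
            next_value += 1
    return mapping


# Hypothetical opcode set before and after adding "and".
before = alphabetical_enum(["adc", "add", "sub"])
after = alphabetical_enum(["adc", "add", "and", "sub"])

# Opcodes whose numeric value changed just because another one was added.
shifted = [name for name in before if before[name] != after[name]]

# The same addition under an append-only policy leaves old values intact.
appended = append_only_enum(before, ["and"])

if __name__ == "__main__":
    print("shifted by alphabetical renumbering:", shifted)
    print("append-only mapping:", appended)
```

Under alphabetical numbering, any client compiled against the old values would see `sub` change meaning after the addition; under the append-only scheme only the new opcode gets a new number.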

AssadHashmi · 2021-09-27T16:38:12Z

How did x86 handle a fixed opcode<->enum mapping during development?
Presumably, the mapping was changing until the full decoder/encoder had been implemented?

derekbruening · 2021-09-27T16:54:28Z

> How did x86 handle a fixed opcode<->enum mapping during development?
> Presumably, the mapping was changing until the full decoder/encoder had been implemented?

IIRC the initial set of pre-SSE instructions was in place before any public interface. Every set of additional opcodes added has appended to the end to avoid breaking compatibility: mostly as sets grouped by ISA feature (SSE2, AVX, etc.) but even some from prior sets that were accidentally missed were appended such as at https://github.com/DynamoRIO/dynamorio/blob/master/core/ir/x86/decode_table.c#L1091

derekbruening · 2021-09-30T15:44:29Z

Given a number of recent users hitting problems that might have been avoided with a more recent build, and with the last official release 8.0 from a full 18 months ago (8.0 was Apr 21, 2020), we're thinking we should put out a new official release. IMHO it would be best to not change the opcode ordering across that release -- so the proposal is to change the codec to freeze the current ordering and append new opcodes after the freeze. Does that sound reasonable, and is that something that could be done in the next week or two and then we could put out a new release once that's in place?

AssadHashmi · 2021-09-30T16:56:27Z

> Given a number of recent users hitting problems that might have been avoided with a more recent build, and with the last official release 8.0 from a full 18 months ago (8.0 was Apr 21, 2020), we're thinking we should put out a new official release. IMHO it would be best to not change the opcode ordering across that release -- so the proposal is to change the codec to freeze the current ordering and append new opcodes after the freeze. Does that sound reasonable, and is that something that could be done in the next week or two and then we could put out a new release once that's in place?

Do you mean we should not make any changes, freezing the current codec.txt until the new release is done?
Or revert this change to the previous ordering and freeze until the new release is done?

derekbruening · 2021-09-30T18:37:41Z

> > Given a number of recent users hitting problems that might have been avoided with a more recent build, and with the last official release 8.0 from a full 18 months ago (8.0 was Apr 21, 2020), we're thinking we should put out a new official release. IMHO it would be best to not change the opcode ordering across that release -- so the proposal is to change the codec to freeze the current ordering and append new opcodes after the freeze. Does that sound reasonable, and is that something that could be done in the next week or two and then we could put out a new release once that's in place?
>
> Do you mean we should not make any changes, freezing the current codec.txt until the new release is done? Or revert this change to the previous ordering and freeze until the new release is done?

I think we're ok breaking compatibility with 8.0 given the other changes we also have there.

The proposal is to freeze the OP_ enum ordering at the point of the new release forever and only append to it afterward.
This could be done by adding all the known opcodes in the Armv8.N target you're focusing on right now but w/o full decoding (even leaving as OP_xx; simply getting them into the OP_ array); or by somehow marking in codec.txt which ones are in this release and ordering them first or something.

A complication might be the opcode splitting in #4388, #4386 (comment), #4393. But if those are implemented after the new release I think we would just live with any newly split opcodes being appended.

AssadHashmi · 2021-10-01T12:08:31Z

> The proposal is to freeze the OP_ enum ordering at the point of the new release forever and only append to it afterward. This could be done by adding all the known opcodes in the Armv8.N target you're focusing on right now but w/o full decoding (even leaving as OP_xx; simply getting them into the OP_ array); or by somehow marking in codec.txt which ones are in this release and ordering them first or something.

Understood. We're thinking of adding an index number to encoding definitions in codec.txt which is the same as the OP_ enum and will not change once set. We may define all v8.0 encodings, in order to fix the enum set, even those not yet implemented.
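One way the index-number idea could be checked mechanically is sketched below. The thread does not specify the eventual codec.txt syntax, so the "index opcode" layout, the function names, and the sample opcodes are all hypothetical. The sketch parses explicit indices, rejects reuse or conflicts, and reports any released opcode whose index later changed or disappeared.

```python
def parse_indexed(lines):
    """Map opcode name -> fixed index.

    Assumed (hypothetical) layout: "<index> <opcode> ...", with '#' comments.
    An opcode may appear on several lines (several encodings) as long as
    it always carries the same index.
    """
    by_name = {}
    by_index = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        index_text, name = stripped.split()[:2]
        index = int(index_text)
        if by_name.get(name, index) != index:
            raise ValueError(f"{name} has conflicting indices")
        if by_index.get(index, name) != name:
            raise ValueError(f"index {index} used by {by_index[index]} and {name}")
        by_name[name] = index
        by_index[index] = name
    return by_name


def changed_since_release(released, current):
    """Return released opcode names whose index changed or which were removed."""
    return sorted(name for name, index in released.items()
                  if current.get(name) != index)


# Hypothetical snapshots: the file stays alphabetical, the indices do not move.
released = parse_indexed(["1 adc", "2 add", "3 sub"])
current = parse_indexed(["1 adc", "2 add", "4 and", "3 sub"])

if __name__ == "__main__":
    print("broken indices:", changed_since_release(released, current))
```

With explicit indices the file can remain alphabetically sorted for editing while the exported values stay fixed, since the value no longer depends on line position.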

AssadHashmi · 2021-10-06T09:55:35Z

@derekbruening @abhinav92003 Is there any reason why we can't use OP_UNDECODED rather than OP_xx on AArch64?
I don't know the history of OP_xx and if we can we should replace it.

AssadHashmi · 2021-10-06T10:55:47Z

@derekbruening @abhinav92003 What are the (example) use-cases which require that opcodes' enums are the same across DynamoRIO releases?

I'm thinking of opcodes which are augmented as new versions of the ISA are released, e.g. for MMX, SSE and AVX on Intel x86, and for v8.0, v8.2 and SVE on AArch64.

On AArch64 we have the same opcode name for integer, fixed-width vector and scalable vector versions, e.g. ADD :

v8.0      ADD <Xd|SP>, <Xn|SP>, <R><m>{, <extend> {#<amount>}}
v8.0,v8.2 ADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>
SVE       ADD <Zdn>.<T>, <Pg>/M, <Zdn>.<T>, <Zm>.<T>

Do we want the enum to be the same all the way through?

AssadHashmi · 2021-10-06T15:06:29Z

I should have posted these questions on the relevant issue, #5144.
Let's continue any further discussion there.

joshua-warburton added 2 commits September 21, 2021 15:33

don't generate trailing whitespace

442bf6a

Change-Id: If9a2684b2c16ea510a008b6ace8d72c28a75759d

AssadHashmi approved these changes Sep 21, 2021


joshua-warburton added 2 commits September 21, 2021 16:46

Merge branch 'master' into i2626-instr-reorder

0ed0cf9

Merge branch 'master' into i2626-instr-reorder

4fcaf13

joshua-warburton merged commit 545f320 into master Sep 23, 2021

joshua-warburton deleted the i2626-instr-reorder branch September 23, 2021 10:03

AssadHashmi mentioned this pull request Oct 5, 2021

Set and fix opcode enums on AArch64 v8.0 for all future releases #5144

Closed
