[AArch64] SME instructions with indexed operands do not have correct disassembly information #2285

FinnWilkinson · 2024-03-07T12:44:08Z

Working on the latest next branch, disassembling a range of SME instructions yields incorrect disassembly information, often with additional memory operands being defined.

It seems to be an issue with the pattern matching of [...]. A similar issue occurred when implementing PR #1907 and was fixed in PR #1925. I'm not sure how different the backend is now after the auto-sync update, but perhaps a similar fix can be implemented from this?

Listed below are some examples:

Issue - Operand 2 p3.s's index is identified as an additional memory operand.

cstool -d aarch64 67447125
 0  67 44 71 25  psel	p7, p1, p3.s[w13, 1]
	ID: 735 (psel)
	op_count: 4
		operands[0].type: REG = p7
		operands[0].access: WRITE
		operands[1].type: REG = p1
		operands[1].access: READ
		operands[2].type: REG = p3
		operands[2].access: READ
			Vector Arrangement Specifier: 0x20
		operands[3].type: MEM
			operands[3].mem.base: REG = w13
			operands[3].mem.disp: 0x1
	Registers read: p1 p3 w13
	Registers modified: p7
	Groups: HasSVE2p1_or_HasSME

Issue - Operand p4 is mistaken for a memory operand and x10 as its register offset. Also, operands[0] should not have an access specifier, and operands[2] should not have a vector index.

cstool -d aarch64 4131a2e0
 0  41 31 a2 e0  st1w	{za0h.s[w13, 1]}, p4, [x10, x2, lsl #2]
	ID: 1047 (st1w)
	op_count: 3
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za0.s
		operands[0].sme.slice_reg: w13
		operands[0].sme.slice_offset: 1
		operands[0].sme.is_vertical: false
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x20
			Vector Index: 0
		operands[1].type: MEM
			operands[1].mem.base: REG = p4
			operands[1].mem.index: REG = x10
		operands[1].access: WRITE
			Shift: type = 1, value = 2
			Vector Index: 0
		operands[2].type: MEM
			operands[2].mem.base: REG = x2
		operands[2].access: WRITE
			Vector Index: 0
	Registers read: p4 x10 x2
	Groups: HasSME

Issue - Same as above.

cstool -d aarch64 c0089fe0
 0  c0 08 9f e0  ld1w	{za0h.s[w12, 0]}, p2/z, [x6]
	ID: 434 (ld1w)
	Is alias: 1354 (ld1w) with ALIAS operand set
	op_count: 2
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za0.s
		operands[0].sme.slice_reg: w12
		operands[0].sme.slice_offset: 0
		operands[0].sme.is_vertical: false
		operands[0].access: READ
			Vector Arrangement Specifier: 0x20
			Vector Index: 0
		operands[1].type: MEM
			operands[1].mem.base: REG = p2
			operands[1].mem.index: REG = x6
		operands[1].access: READ
			Vector Index: 0
	Registers read: p2 x6
	Groups: HasSME

FinnWilkinson · 2024-03-07T16:25:16Z

Looking into this a bit further, it seems to originate from the AArch64GenCSMappingInsnOp.inc file where all SME based instructions with an index have CS_OP_MEM.

In Capstone's LLVM (llvm/utils/TableGen/PrinterCapstone.cpp), the function getCSOperandType contains:

if (TargetName.equals("AArch64") && OperandType != "CS_OP_MEM") {
  // The definitions of AArch64 are so broken, when it comes to memory
  // operands, that we just search for the op name enclosed in [].
  if (Regex("\\[.*\\$" + OpName.str() + ".*]").match(CGI->AsmString))
    return OperandType += " | CS_OP_MEM";
}

This would explain the above issues with LD1W and ST1W, whereby p2 is seen as a memory operand (the open bracket [ is part of za0h's index, and the close bracket ] is part of the actual memory operand).

I think it also explains the issue with PSEL as all predicate registers are recognised correctly but p3's index is seen as a memory operand rather than its index.

Looking at other SME instructions in AArch64GenCSMappingInsnOp.inc, instruction opcode AArch64_ADD_VG2_M2Z2Z_D could potentially present a similar issue to PSEL, where ZA's index is seen as a memory operand rather than an index. Again, this could be explained by the above regex:

{ /* AArch64_ADD_VG2_M2Z2Z_D (1265) - AArch64_INS_ADD - add	$ZAd[$Rv, $imm3, vgx2], $Zn, $Zm */
{
  { CS_OP_REG, CS_AC_WRITE, { CS_DATA_TYPE_Untyped, CS_DATA_TYPE_LAST } }, /* ZAd */
  { CS_OP_REG, CS_AC_READ, { CS_DATA_TYPE_Untyped, CS_DATA_TYPE_LAST } }, /* _ZAd */
  { CS_OP_REG | CS_OP_MEM, CS_AC_INVALID, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* Rv */
  { CS_OP_IMM | CS_OP_MEM, CS_AC_INVALID, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* imm3 */
  { CS_OP_REG, CS_AC_READ, { CS_DATA_TYPE_Untyped, CS_DATA_TYPE_LAST } }, /* Zn */
  { CS_OP_REG, CS_AC_READ, { CS_DATA_TYPE_Untyped, CS_DATA_TYPE_LAST } }, /* Zm */
  { 0 }
}},
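
To make the false positive concrete, here is a minimal sketch that re-creates the heuristic with POSIX regexes (so it assumes a POSIX system with regex.h). The function name heuristic_flags_mem is made up for illustration and is not part of Capstone or LLVM; the ADD asm string is the one from the table entry above.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Re-creates the quoted heuristic: an operand is flagged as memory if
 * "$<op_name>" appears anywhere between some '[' and some later ']'
 * in the asm string. Returns 1 on a match, 0 otherwise.
 * Hypothetical helper, not Capstone or LLVM code. */
int heuristic_flags_mem(const char *asm_string, const char *op_name)
{
	char pattern[128];
	regex_t re;
	int n, matched;

	/* Same pattern as the TableGen snippet: \[.*\$<op_name>.*] */
	n = snprintf(pattern, sizeof(pattern), "\\[.*\\$%s.*]", op_name);
	assert(n > 0 && (size_t)n < sizeof(pattern));

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;
	matched = (regexec(&re, asm_string, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}
```

With the asm string "add\t$ZAd[$Rv, $imm3, vgx2], $Zn, $Zm", both Rv and imm3 match even though they only form the ZA index, which is consistent with the CS_OP_MEM flags in the generated table. Zn does not match because no ] follows it.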

FinnWilkinson · 2024-03-08T10:19:36Z

I'm not 100% sure how to implement a fix for this, but regarding the regex above: all AArch64 memory operands will be preceded by a comma and a space (i.e. , [x6]), so the regex needs to account for this. If [...] is not preceded by a comma and a space, then we know it to be an index.

Additionally, said regex should check that there is no ] between the opening [ and the OpName.str() to bulletproof it against issues seen with PSEL. It may also require another aarch64_op_type and/or member of the union in cs_aarch64_op, as PSEL does not nicely fit the criteria for aarch64_op_sme but still requires a register to have an index with a base register and immediate offset.
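
The two conditions above could be sketched roughly as follows. This is plain string scanning rather than a regex, the name is_mem_operand is invented for illustration, and the asm strings in the usage note are hypothetical stand-ins (the real LLVM AsmStrings may differ).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns 1 if "$<op_name>" occurs inside a bracket group that is
 * introduced by ", [" (comma, space, bracket) and closed by the next ']'.
 * Index brackets such as "$ZAd[$Rv, $imm3]" are not preceded by ", "
 * and are therefore skipped. Hypothetical helper, not Capstone code. */
int is_mem_operand(const char *asm_string, const char *op_name)
{
	size_t name_len = strlen(op_name);
	const char *open = asm_string;

	assert(name_len > 0);
	while ((open = strstr(open, ", [")) != NULL) {
		const char *start = open + 3; /* first char after '[' */
		const char *close = strchr(start, ']');
		if (close == NULL)
			return 0; /* unbalanced brackets: give up */
		for (const char *p = start; p < close; p++) {
			if (*p != '$' || (size_t)(close - p - 1) < name_len)
				continue;
			if (strncmp(p + 1, op_name, name_len) != 0)
				continue;
			/* Reject prefixes, e.g. "$Rn" inside "$Rn2". */
			unsigned char next = (unsigned char)p[1 + name_len];
			if (!isalnum(next) && next != '_')
				return 1;
		}
		open = close; /* continue searching after this group */
	}
	return 0;
}
```

For a hypothetical string like "ld1w\t{$ZAt[$Rv, $imm]}, $Pg/z, [$Rn]", only Rn is reported as a memory operand; Pg and Rv are not. A PSEL-like string "psel\t$Pd, $Pn, $Pm[$Rv, $imm]" contains no ", [" at all, so nothing is flagged.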

As far as I can tell, other AArch64 instructions do not seem to be affected, only SME/SME2.

Rot127 · 2024-03-09T05:57:31Z

Thanks a lot for the detailed issue! Currently I am updating to LLVM 18 and will add your fix after it.

Unfortunately, I couldn't find the time yet to address the other issues in #2196, but I will address them and other bugs after the LLVM 18 update, since it adds new AArch64 instructions anyway.

FinnWilkinson · 2024-03-11T09:24:48Z

Great, thank you for the update!

Rot127 · 2024-05-16T06:58:35Z

@FinnWilkinson I am currently at this. Thanks for spotting the faulty regex. It is fixed now.

Regarding the representation of SME operands:
What do you think about having two types of SME operands: one an indexed matrix (which is the struct currently called aarch64_op_sme) and the other an "indexed" predicate reg op?

Something like:

typedef struct {
  aarch64_sme_op_type type; ///< AArch64_SME_OP_TILE, AArch64_SME_OP_TILE_VEC
  aarch64_reg tile; ///< Matrix tile register
  aarch64_reg slice_reg; ///< slice index reg
  // ...
} aarch64_op_sme_matrix;

typedef struct {
  // Some type field
  aarch64_reg pred_reg;
  aarch64_reg vec_select;
  int32_t index; ///< Potentially also a union of other ops which can be used here.
} aarch64_op_sme_pred;

typedef struct {
  // Some type field
  union {
    aarch64_op_sme_matrix matrix;
    aarch64_op_sme_pred pred;
  };
} aarch64_sme_op;

I am a little hesitant to add more operands. But the SME ops don't really fit in the previous ones. So I'd really appreciate your comment on this, since you seem to work with this AArch64 extension a lot.

As far as I can tell, other AArch64 instructions do not seem to be affected, only SME/SME2.

Yeah, SME/SME2 is really not that well defined in LLVM, unfortunately. So they need all kinds of hacky solutions.

Rot127 · 2024-05-20T10:59:07Z

This is now the resulting output:

cstool -d aarch64 c0089fe04131a2e067447125
 0  c0 08 9f e0  ld1w	{za0h.s[w12, 0]}, p2/z, [x6]
	ID: 474 (ld1w)
	Is alias: 1466 (ld1w) with ALIAS operand set
	op_count: 3
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.mx.tile: za0.s
		operands[0].sme.mx.slice_reg: w12
		operands[0].sme.mx.slice_offset: 0
		operands[0].sme.mx.is_vertical: false
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: SME_PRED
		operands[1].sme.pred.reg: p2
		operands[1].access: READ
		operands[2].type: MEM
			operands[2].mem.base: REG = x6
		operands[2].access: READ
	Registers read: za0.s w12 p2 x6
	Groups: HasSME 

 4  41 31 a2 e0  st1w	{za0h.s[w13, 1]}, p4, [x10, x2, lsl #2]
	ID: 1099 (st1w)
	op_count: 3
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.mx.tile: za0.s
		operands[0].sme.mx.slice_reg: w13
		operands[0].sme.mx.slice_offset: 1
		operands[0].sme.mx.is_vertical: false
		operands[0].access: READ
			Vector Arrangement Specifier: 0x20
		operands[1].type: SME_PRED
		operands[1].sme.pred.reg: p4
		operands[1].access: READ
		operands[2].type: MEM
			operands[2].mem.base: REG = x10
			operands[2].mem.index: REG = x2
		operands[2].access: WRITE
			Shift: type = 1, value = 2
	Registers read: za0.s w13 p4 x10 x2
	Groups: HasSME 

 8  67 44 71 25  psel	p7, p1, p3.s[w13, 1]
	ID: 785 (psel)
	op_count: 3
		operands[0].type: SME_PRED
		operands[0].sme.pred.reg: p7
		operands[0].access: WRITE
		operands[1].type: SME_PRED
		operands[1].sme.pred.reg: p1
		operands[1].access: READ
		operands[2].type: SME_PRED
		operands[2].sme.pred.reg: p3
		operands[2].sme.pred.vec_select: w13
		operands[2].sme.pred.index: 1
		operands[2].access: READ
			Vector Arrangement Specifier: 0x20
	Registers read: p7 p1 p3 w13
	Groups: HasSVE2p1_or_HasSME

FinnWilkinson · 2024-05-21T09:12:51Z

Hi @Rot127, sorry for the delayed reply; I've only just seen the newest comments. This generally all looks good to me, but I have some concerns about predicates in HasSME instructions. I assume only predicates in instructions with the group HasSME are of type sme.pred.reg? Even so, I think it is slightly confusing. For example, in psel there is no reference to an SME operand, yet all predicates are now sme.pred.reg even when the predicate is not indexed. A possible solution could be to add a variable to cs_aarch64_op called index which any operand can have:

typedef struct aarch64_op_index {
  aarch64_reg base;
  int32_t imm_offset;
} aarch64_op_index;

This could also be used by NEON, SVE, and predicate registers and remove the need for vector_index. I'm sure this change is non-trivial though (and adds to current logic which isn't ideal), but could tidy things up a bit?

Additionally, just to ensure functional correctness, what do the following instructions get disassembled as?

fmopa za2.s, p0/m, p2/m, z3.s, z4.s ==== 40208180
smstart ==== 7f4703d5
zero {za0.s, za2.s} ==== 550008c0
- (za0.h is a valid alias for za0.s and za2.s)

Rot127 · 2024-05-21T09:37:27Z

A possible solution could be to add a variable to cs_aarch64_op called index which any operand can have:

The thing is that these indices differ quite a lot: the matrix indices can have immediate ranges, and the terminology also differs between the members of the index structs (slice, vector select, etc.). Would you be ok with:

- Removing the SME operand
- Adding an Index struct as suggested with: index_type (None, VectorPredicate, Matrix or something) + a union with the different indices structs.

I'd like to have the terminology of the struct fields relatively close to the ISA, if possible. Hence not just one index struct, but multiple.

Generally though I would follow your advice. I only interact with AArch64 in the Capstone context. And because you seem to use it heavily in practice, I would follow your design in the end.

I'm sure this change is non-trivial though (and adds to current logic which isn't ideal), but could tidy things up a bit?

As long as we don't change it after the v6 release, it's ok.
v6 will be huge, and although it is annoying to have these API changes here in next, it is important that it is done right before v6. And next is still a development branch in the end.

Additionally, just to ensure functional correctness, what do the following instructions get disassembled as?

Thanks for the test cases! It looks like the alias versions are still missing some operands.
Please let me know any other errors.

If you have more test cases, please let me know. Those operands are pretty complex.

With ALIAS operand set

cstool -d aarch64 402081807f4703d5550008c0
 0  40 20 81 80  fmopa	za0.s, p0/m, p1/m, z2.s, z1.s
	ID: 417 (fmopa)
	op_count: 5
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.mx.tile: za0.s
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: SME_PRED
		operands[1].sme.pred.reg: p0
		operands[1].access: READ
		operands[2].type: SME_PRED
		operands[2].sme.pred.reg: p1
		operands[2].access: READ
		operands[3].type: REG = z2
		operands[3].access: READ
			Vector Arrangement Specifier: 0x20
		operands[4].type: REG = z1
		operands[4].access: READ
			Vector Arrangement Specifier: 0x20
	Write-back: True
	Registers read: za0.s p0 p1 z2 z1
	Groups: HasSME 

 4  7f 47 03 d5  smstart	
	ID: 734 (msr)
	Is alias: 1468 (smstart) with ALIAS operand set
	Update-flags: True
	Registers modified: nzcv
	Groups: privilege 

 8  55 00 08 c0  zero	{za0.h}
	ID: 1384 (zero)
	Is alias: 1470 (zero) with ALIAS operand set
	Groups: HasSME

With REAL operand set

cstool -r -d aarch64 402081807f4703d5550008c0
 0  40 20 81 80  fmopa	za0.s, p0/m, p1/m, z2.s, z1.s
	ID: 417 (fmopa)
	op_count: 5
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.mx.tile: za0.s
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: SME_PRED
		operands[1].sme.pred.reg: p0
		operands[1].access: READ
		operands[2].type: SME_PRED
		operands[2].sme.pred.reg: p1
		operands[2].access: READ
		operands[3].type: REG = z2
		operands[3].access: READ
			Vector Arrangement Specifier: 0x20
		operands[4].type: REG = z1
		operands[4].access: READ
			Vector Arrangement Specifier: 0x20
	Write-back: True
	Registers read: za0.s p0 p1 z2 z1
	Groups: HasSME 

 4  7f 47 03 d5  smstart	
	ID: 734 (msr)
	Is alias: 1468 (smstart) with REAL operand set
	op_count: 2
		operands[0].type: SYS ALIAS:
			operands[0].svcr: BIT = SM & ZA
		operands[1].type: IMM = 0x1
		operands[1].access: READ
	Update-flags: True
	Registers modified: nzcv
	Groups: privilege 

 8  55 00 08 c0  zero	{za0.h}
	ID: 1384 (zero)
	Is alias: 1470 (zero) with REAL operand set
	op_count: 1
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.mx.tile: za6.d
		operands[0].access: READ
			Vector Arrangement Specifier: 0x40
	Registers read: za6.d
	Groups: HasSME

FinnWilkinson · 2024-05-21T09:54:20Z

I think the current implementation of SME operands (non predicates) works well, so I'd be opposed to changing from this. How about having a new aarch64_op_pred type for all predicates (not just HasSME)? This leaves vector and SME indexing as is, and would make indexed predicates a tad easier IMO without too much overhead for normal, non-indexed, predicate registers.

For the test case outputs:

- If smstart ALIAS version printed the same as REAL then that would be helpful
- zero ALIAS is missing the register, and REAL looks to select the incorrect register (this isn't a valid alias as far as I know)

I'll compile some more test cases now and put them here, but I need to formulate the hex first.

Rot127 · 2024-05-21T10:15:43Z

I'll compile some more test cases now and put them here, but I need to formulate the hex first.

This would be great. Thank you!

How about having a new aarch64_op_pred type for all predicates (not just HasSME)?

Ok, sounds good to me. To summarize:

- Add new operand types aarch64_op_pred and aarch64_op_index.

typedef struct aarch64_op_index {
  aarch64_reg base;
  int32_t imm_offset;
} aarch64_op_index;

typedef struct aarch64_op_pred {
  aarch64_reg pred;
  aarch64_op_index index;
} aarch64_op_pred;

- Operands in #2298 ("AArch64 update to LLVM 18") which are sme.pred get converted to the new aarch64_op_pred format.
- Leave sme untouched (meaning: in #2298, change it back to the previous sme operand instead of sme.mx).

FinnWilkinson · 2024-05-21T10:37:04Z

Will aarch64_op_index be used for anything else? If not couldn't we just do

typedef struct aarch64_op_pred {
  aarch64_reg pred;
  aarch64_reg vec_select;
  int32_t imm_offset;
} aarch64_op_pred;

?

And for sme instructions yeah, just removing the .mx part to yield (for example)

                ...
		operands[0].type: SME
		operands[0].sme.type: 2
		operands[0].sme.tile: za0.s
		operands[0].sme.slice_reg: w12
		operands[0].sme.slice_offset: 0
		operands[0].sme.is_vertical: false
                ...

seems good to me.
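
As a rough illustration of how a consumer might use the flat predicate layout, the sketch below distinguishes indexed from plain predicates. All types and names here (demo_reg, demo_op_pred, DEMO_REG_NONE as a "no register" sentinel) are stand-ins invented for this example and are not Capstone's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in register enum for illustration only. DEMO_REG_NONE is an
 * assumed sentinel meaning "no register". */
typedef enum {
	DEMO_REG_NONE = 0,
	DEMO_REG_P1,
	DEMO_REG_P3,
	DEMO_REG_P7,
	DEMO_REG_W13
} demo_reg;

/* Mirrors the flat layout discussed above: predicate register,
 * index base register and immediate offset in one struct. */
typedef struct {
	demo_reg pred;       /* predicate register, e.g. p3 */
	demo_reg vec_select; /* index base register, DEMO_REG_NONE if unused */
	int32_t imm_offset;  /* immediate added to vec_select */
} demo_op_pred;

/* A predicate counts as indexed only if it names an index register. */
bool demo_pred_is_indexed(const demo_op_pred *op)
{
	assert(op != NULL);
	return op->vec_select != DEMO_REG_NONE;
}

/* Fills ops[] the way "psel p7, p1, p3.s[w13, 1]" would be described. */
void demo_fill_psel(demo_op_pred ops[3])
{
	ops[0] = (demo_op_pred){ DEMO_REG_P7, DEMO_REG_NONE, 0 };
	ops[1] = (demo_op_pred){ DEMO_REG_P1, DEMO_REG_NONE, 0 };
	ops[2] = (demo_op_pred){ DEMO_REG_P3, DEMO_REG_W13, 1 };
}
```

Under these assumptions, a non-indexed predicate costs only two unused fields, while p3.s[w13, 1] carries its index without needing a separate memory or SME operand.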

Rot127 · 2024-05-21T13:12:56Z

Will aarch64_op_index be used for anything else? If not couldn't we just do

Ah, well. Yes. Let's just do it as you said.
I wasn't paying attention and thought about the SME matrix as well. But let's just hope ARM will not introduce more extensions with this index pattern.

FinnWilkinson · 2024-05-21T13:18:04Z

But let's just hope ARM will not introduce more extensions with this index pattern.

Yes, let's!

Here are some more complex assembly tests that would be good to validate as working. At the moment this is about as complex as SVE and SME gets (hex may be the wrong way round):

sdot za.s[w11, 2, vgx4], {z0.h-z3.h}, z5.h[2] ==== c155f802
movaz {z4.d-z7.d}, za.d[w8, 5, vgx4] ==== c0060ea4
luti2 {z0.s-z3.s}, zt0, z4[1] ==== c08da080
fmla za.h[w9, 0, vgx4], {z8.h-z11.h}, z0.h[0] ==== c110b100
fmlal za.s[w10, 2:3, vgx4], {z0.h-z3.h}, z11.h[1] ==== c19bd005
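
On the byte-order caveat: the encodings above are written as 32-bit instruction words, whereas cstool -d aarch64 takes bytes in memory order, which is little-endian for A64 instructions. A small helper along these lines (the name insn_word_to_le_hex is made up for this sketch) converts a word into the byte string to pass to cstool:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Writes the 8 hex digits of a 32-bit instruction word in little-endian
 * byte order (lowest byte first) into out, plus a terminating NUL.
 * Hypothetical helper for preparing cstool input. */
void insn_word_to_le_hex(uint32_t word, char out[9])
{
	assert(out != NULL);
	snprintf(out, 9, "%02x%02x%02x%02x",
		 (unsigned)(word & 0xff),
		 (unsigned)((word >> 8) & 0xff),
		 (unsigned)((word >> 16) & 0xff),
		 (unsigned)(word >> 24));
}
```

For the sdot word 0xc155f802 this yields 02f855c1.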

Rot127 · 2024-05-22T13:51:20Z

Change is done. Also fixed that the predicate regs were not added to the list of written registers.

> cstool -d aarch64 c0089fe04131a2e067447125

 0  c0 08 9f e0  ld1w	{za0h.s[w12, 0]}, p2/z, [x6]
	ID: 474 (ld1w)
	Is alias: 1466 (ld1w) with ALIAS operand set
	op_count: 3
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za0.s
		operands[0].sme.slice_reg: w12
		operands[0].sme.slice_offset: 0
		operands[0].sme.is_vertical: false
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: PREDICATE
		operands[1].pred.reg: p2
		operands[1].access: READ
		operands[2].type: MEM
			operands[2].mem.base: REG = x6
		operands[2].access: READ
	Registers read: za0.s w12 p2 x6
	Groups: HasSME 

 4  41 31 a2 e0  st1w	{za0h.s[w13, 1]}, p4, [x10, x2, lsl #2]
	ID: 1099 (st1w)
	op_count: 3
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za0.s
		operands[0].sme.slice_reg: w13
		operands[0].sme.slice_offset: 1
		operands[0].sme.is_vertical: false
		operands[0].access: READ
			Vector Arrangement Specifier: 0x20
		operands[1].type: PREDICATE
		operands[1].pred.reg: p4
		operands[1].access: READ
		operands[2].type: MEM
			operands[2].mem.base: REG = x10
			operands[2].mem.index: REG = x2
		operands[2].access: WRITE
			Shift: type = 1, value = 2
	Registers read: za0.s w13 p4 x10 x2
	Groups: HasSME 

 8  67 44 71 25  psel	p7, p1, p3.s[w13, 1]
	ID: 785 (psel)
	op_count: 3
		operands[0].type: PREDICATE
		operands[0].pred.reg: p7
		operands[0].access: WRITE
		operands[1].type: PREDICATE
		operands[1].pred.reg: p1
		operands[1].access: READ
		operands[2].type: PREDICATE
		operands[2].pred.reg: p3
		operands[2].pred.vec_select: w13
		operands[2].pred.imm_index: 1
		operands[2].access: READ
			Vector Arrangement Specifier: 0x20
	Registers read: p1 p3 w13
	Registers modified: p7
	Groups: HasSVE2p1_or_HasSME

If smstart ALIAS version printed the same as REAL then that would be helpful

Sorry, late night working. Got confused with my own implementation.
This behavior (alias not printing the operands) is on purpose. cstool is supposed to print only the details which
are part of the asm text (this is forced by the design of Capstone).

The real operand set can always be accessed, either by adding the -r flag to cstool or by enabling the
CS_OPT_DETAIL_REAL option.

The reason is: to retrieve both detail sets, an instruction must be disassembled twice.
And this is a choice only the user should make. So we don't double the runtime by default.

zero ALIAS is missing the register, and REAL looks to select the incorrect register (this isn't a valid alias as far as I know)

Will fix this later this week.

The new test cases

Output is below. Note the register lists. I didn't add a new operand type for those.
If you say it would be a huge improvement to have them in a separate struct, we can talk about the implementation.
Otherwise, I'd like to let the user deduce the registers in between the first and last register.
Just to keep it somewhat simple.

Same for the vgx4 indicators. They are not stored in the details, because they can be deduced by checking the list members.
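
For reference, deducing the intermediate registers on the user side could look like the sketch below. It works on plain register numbers (0 for z0 up to 31 for z31) rather than Capstone's register enum, the function name is invented, and it assumes the list is consecutive and may wrap from z31 to z0. A strided list cannot be recovered from its end points alone.

```c
#include <assert.h>

#define Z_REG_COUNT 32 /* z0..z31 */

/* Fills out[] with the register numbers from first to last inclusive,
 * stepping by one and wrapping from z31 to z0. Returns the count.
 * Assumes a consecutive list. Hypothetical helper, not Capstone code. */
int expand_z_reg_list(int first, int last, int out[Z_REG_COUNT])
{
	int n = 0;

	assert(first >= 0 && first < Z_REG_COUNT);
	assert(last >= 0 && last < Z_REG_COUNT);
	for (int r = first;; r = (r + 1) % Z_REG_COUNT) {
		out[n++] = r;
		if (r == last)
			break;
	}
	return n;
}
```

Given the two list members z0 and z3 reported for { z0.h - z3.h }, this returns the four registers z0, z1, z2 and z3.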

 > cstool -d aarch64be c155f802c0060ea4c08da080c110b100c19bd005

 0  c1 55 f8 02  sdot	za.s[w11, 2, vgx4], { z0.h - z3.h }, z5.h[2]
	ID: 922 (sdot)
	op_count: 4
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za
		operands[0].sme.slice_reg: w11
		operands[0].sme.slice_offset: 2
		operands[0].sme.is_vertical: false
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z0
		operands[1].is_list_member: true
		operands[1].access: READ
			Vector Arrangement Specifier: 0x10
		operands[2].type: REG = z3
		operands[2].is_list_member: true
		operands[2].access: READ
			Vector Arrangement Specifier: 0x10
		operands[3].type: REG = z5
		operands[3].access: READ
			Vector Arrangement Specifier: 0x10
			Vector Index: 2
	Write-back: True
	Registers read: za w11 z0 z3 z5
	Groups: HasSME2 

 4  c0 06 0e a4  movaz	{ z4.d - z7.d }, za.d[w8, 5, vgx4]
	ID: 724 (movaz)
	op_count: 3
		operands[0].type: REG = z4
		operands[0].is_list_member: true
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x40
		operands[1].type: REG = z7
		operands[1].is_list_member: true
		operands[1].access: WRITE
			Vector Arrangement Specifier: 0x40
		operands[2].type: SME_MATRIX
		operands[2].sme.type: 2
		operands[2].sme.tile: za
		operands[2].sme.slice_reg: w8
		operands[2].sme.slice_offset: 5
		operands[2].sme.is_vertical: false
		operands[2].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
	Write-back: True
	Registers read: za w8
	Registers modified: z4 z7
	Groups: HasSME2p1 

 8  c0 8d a0 80  luti2	{ z0.s - z3.s }, zt0, z4[1]
	ID: 710 (luti2)
	op_count: 4
		operands[0].type: REG = z0
		operands[0].is_list_member: true
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z3
		operands[1].is_list_member: true
		operands[1].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[2].type: SME_MATRIX
		operands[2].sme.type: 1
		operands[2].sme.tile: zt0
		operands[2].access: READ
		operands[3].type: REG = z4
		operands[3].access: READ
			Vector Index: 1
	Registers read: zt0 z4
	Registers modified: z0 z3
	Groups: HasSME2 

 c  c1 10 b1 00  fmla	za.h[w9, 0, vgx4], { z8.h - z11.h }, z0.h[0]
	ID: 410 (fmla)
	op_count: 4
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za
		operands[0].sme.slice_reg: w9
		operands[0].sme.slice_offset: 0
		operands[0].sme.is_vertical: false
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x10
		operands[1].type: REG = z8
		operands[1].is_list_member: true
		operands[1].access: READ
			Vector Arrangement Specifier: 0x10
		operands[2].type: REG = z11
		operands[2].is_list_member: true
		operands[2].access: READ
			Vector Arrangement Specifier: 0x10
		operands[3].type: REG = z0
		operands[3].access: READ
			Vector Arrangement Specifier: 0x10
			Vector Index: 0
	Write-back: True
	Registers read: za w9 z8 z11 z0
	Groups: HasSME2p1 HasSMEF16F16 

10  c1 9b d0 05  fmlal	za.s[w10, 2:3, vgx4], { z0.h - z3.h }, z11.h[1]
	ID: 409 (fmlal)
	op_count: 4
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za
		operands[0].sme.slice_reg: w10
		operands[0].sme.slice_offset: 2:3
		operands[0].sme.is_vertical: false
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z0
		operands[1].is_list_member: true
		operands[1].access: READ
			Vector Arrangement Specifier: 0x10
		operands[2].type: REG = z3
		operands[2].is_list_member: true
		operands[2].access: READ
			Vector Arrangement Specifier: 0x10
		operands[3].type: REG = z11
		operands[3].access: READ
			Vector Arrangement Specifier: 0x10
			Vector Index: 1
	Write-back: True
	Registers read: za w10 z0 z3 z11
	Groups: HasSME2

FinnWilkinson · 2024-05-22T15:09:16Z

Change is done. Also fixed that the predicate regs were not added to the list of written registers.

I think this looks great now. Thanks!

Sorry, late night working. Got confused with my own implementation.
This behavior (alias not printing the operands) is on purpose. cstool is supposed to print only the details which
are part of the asm text (this is forced by the design of Capstone).

Understood!

Note the register lists. I didn't add a new operand type for those.

Given you can have strided vector register lists as well as consecutive ones (although I'm struggling to find an example currently), I think it would be beneficial to print them individually, both for clarity and for using Capstone as a tool inside other projects. Rather than a new structure though, could we not print them as separate operands?
i.e. luti2 { z0.s - z3.s }, zt0, z4[1] would be:

8  c0 8d a0 80  luti2	{ z0.s - z3.s }, zt0, z4[1]
	ID: 710 (luti2)
	op_count: 6
		operands[0].type: REG = z0
		operands[0].is_list_member: true
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z1
		operands[1].is_list_member: true
		operands[1].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[2].type: REG = z2
		operands[2].is_list_member: true
		operands[2].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[3].type: REG = z3
		operands[3].is_list_member: true
		operands[3].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[4].type: SME_MATRIX
		operands[4].sme.type: 1
		operands[4].sme.tile: zt0
		operands[4].access: READ
		operands[5].type: REG = z4
		operands[5].access: READ
			Vector Index: 1
	Registers read: zt0 z4
	Registers modified: z0 z1 z2 z3
	Groups: HasSME2

The SME2 spec also states that although { z0.s - z3.s } is preferred disassembly, there must also be support for { z0.s, z1.s, z2.s, z3.s }. We could change to this other disassembly format to make list operands clearer, but I prefer the simpler format (i.e. keep it as it is now).

Same for the vgx4 indicators.

I agree with not printing vgx4 etc, as the vg syntax is optional disassembly as per the spec and doesn't really add much to the understanding. Having it printed in the disassembly is enough IMO.

As an aside, should zt0 be classed as a normal register rather than SME_MATRIX? Although only accessible in Streaming SVE mode, it is a separate register from ZA. And from what I can tell from the spec, it is not indexable (yet...). If you think there's a good reason to keep zt0 as is, though, then I'm still ok with how it's currently displayed.

FinnWilkinson · 2024-05-22T15:20:26Z

I've also just noticed that SMSTART currently has Registers modified: nzcv which is false (as far as I can tell from the spec). Only PSTATE is updated. Similar is true for SMSTOP.

https://developer.arm.com/documentation/ddi0602/2024-03/Base-Instructions/SMSTART--Enables-access-to-Streaming-SVE-mode-and-SME-architectural-state--an-alias-of-MSR--immediate--

Rot127 · 2024-05-29T07:44:35Z

The smstart is fixed. Also, here are the zero operands:

Real:

 8  55 00 08 c0  zero	{za0.h}
	ID: 1384 (zero)
	Is alias: 1470 (zero) with REAL operand set
	op_count: 5
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.tile: za0.d
		operands[0].access: READ
			Vector Arrangement Specifier: 0x40
		operands[1].type: SME_MATRIX
		operands[1].sme.type: 1
		operands[1].sme.tile: za2.d
		operands[1].access: READ
			Vector Arrangement Specifier: 0x40
		operands[2].type: SME_MATRIX
		operands[2].sme.type: 1
		operands[2].sme.tile: za4.d
		operands[2].access: READ
			Vector Arrangement Specifier: 0x40
		operands[3].type: SME_MATRIX
		operands[3].sme.type: 1
		operands[3].sme.tile: za6.d
		operands[3].access: READ
			Vector Arrangement Specifier: 0x40
	Registers read: za0.d za2.d za4.d za6.d
	Groups: HasSME

Alias:

 8  55 00 08 c0  zero	{za0.h}
	ID: 1384 (zero)
	Is alias: 1470 (zero) with ALIAS operand set
	op_count: 1
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.tile: za0.h
			Vector Arrangement Specifier: 0x10
	Registers read: za0.h
	Groups: HasSME
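
The relation between the two operand sets can be checked with a small sketch. It assumes that the low byte of the instruction word is the ZERO mask and that bit n of the mask selects the 64-bit tile zaN.d, which agrees with the REAL output above (0x55 gives za0.d, za2.d, za4.d, za6.d, i.e. za0.h). The function name is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Collects the indices n of the 64-bit tiles zaN.d selected by an 8-bit
 * ZERO mask (bit n set means zaN.d is selected, an assumption matching
 * the REAL operand set shown above). Returns the number of tiles. */
int zero_mask_to_d_tiles(uint8_t mask, int tiles[8])
{
	int count = 0;

	assert(tiles != NULL);
	for (int n = 0; n < 8; n++) {
		if (mask & (1u << n))
			tiles[count++] = n;
	}
	return count;
}
```

For the bytes 55 00 08 c0 the word is 0xc0080055, so the mask is 0x55 and the selected tiles are 0, 2, 4 and 6.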

Register lists are now as you suggested.

Here is the output of all instructions:

cstool -d aarch64 "0x40,0x20,0x81,0x80,0x7f,0x47,0x03,0xd5,0x55,0x00,0x08,0xc0,0x02,0xf8,0x55,0xc1,0xa4,0x0e,0x06,0xc0,0x80,0xa0,0x8d,0xc0,0x00,0xb1,0x10,0xc1,0x05,0xd0,0x9b,0xc1"
 0  40 20 81 80  fmopa	za0.s, p0/m, p1/m, z2.s, z1.s
	ID: 417 (fmopa)
	op_count: 5
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.tile: za0.s
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: PREDICATE
		operands[1].pred.reg: p0
		operands[1].access: READ
		operands[2].type: PREDICATE
		operands[2].pred.reg: p1
		operands[2].access: READ
		operands[3].type: REG = z2
		operands[3].access: READ
			Vector Arrangement Specifier: 0x20
		operands[4].type: REG = z1
		operands[4].access: READ
			Vector Arrangement Specifier: 0x20
	Write-back: True
	Registers read: za0.s p0 p1 z2 z1
	Registers modified: za0.s
	Groups: HasSME 

 4  7f 47 03 d5  smstart	
	ID: 734 (msr)
	Is alias: 1468 (smstart) with ALIAS operand set
	Groups: privilege 

 8  55 00 08 c0  zero	{za0.h}
	ID: 1384 (zero)
	Is alias: 1470 (zero) with ALIAS operand set
	op_count: 1
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 1
		operands[0].sme.tile: za0.h
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x10
	Registers modified: za0.h
	Groups: HasSME 

 c  02 f8 55 c1  sdot	za.s[w11, 2, vgx4], { z0.h - z3.h }, z5.h[2]
	ID: 922 (sdot)
	op_count: 6
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za
		operands[0].sme.slice_reg: w11
		operands[0].sme.slice_offset: 2
		operands[0].sme.is_vertical: false
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z0
		operands[1].is_list_member: true
		operands[1].access: READ
			Vector Arrangement Specifier: 0x10
		operands[2].type: REG = z1
		operands[2].is_list_member: true
		operands[2].access: READ
			Vector Arrangement Specifier: 0x10
		operands[3].type: REG = z2
		operands[3].is_list_member: true
		operands[3].access: READ
			Vector Arrangement Specifier: 0x10
		operands[4].type: REG = z3
		operands[4].is_list_member: true
		operands[4].access: READ
			Vector Arrangement Specifier: 0x10
		operands[5].type: REG = z5
		operands[5].access: READ
			Vector Arrangement Specifier: 0x10
			Vector Index: 2
	Write-back: True
	Registers read: za w11 z0 z1 z2 z3 z5
	Registers modified: za
	Groups: HasSME2 

10  a4 0e 06 c0  movaz	{ z4.d - z7.d }, za.d[w8, 5, vgx4]
	ID: 724 (movaz)
	op_count: 5
		operands[0].type: REG = z4
		operands[0].is_list_member: true
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x40
		operands[1].type: REG = z5
		operands[1].is_list_member: true
		operands[1].access: WRITE
			Vector Arrangement Specifier: 0x40
		operands[2].type: REG = z6
		operands[2].is_list_member: true
		operands[2].access: WRITE
			Vector Arrangement Specifier: 0x40
		operands[3].type: REG = z7
		operands[3].is_list_member: true
		operands[3].access: WRITE
			Vector Arrangement Specifier: 0x40
		operands[4].type: SME_MATRIX
		operands[4].sme.type: 2
		operands[4].sme.tile: za
		operands[4].sme.slice_reg: w8
		operands[4].sme.slice_offset: 5
		operands[4].sme.is_vertical: false
		operands[4].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
	Write-back: True
	Registers read: za w8
	Registers modified: z4 z5 z6 z7 za
	Groups: HasSME2p1 

14  80 a0 8d c0  luti2	{ z0.s - z3.s }, zt0, z4[1]
	ID: 710 (luti2)
	op_count: 6
		operands[0].type: REG = z0
		operands[0].is_list_member: true
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z1
		operands[1].is_list_member: true
		operands[1].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[2].type: REG = z2
		operands[2].is_list_member: true
		operands[2].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[3].type: REG = z3
		operands[3].is_list_member: true
		operands[3].access: WRITE
			Vector Arrangement Specifier: 0x20
		operands[4].type: REG = zt0
		operands[4].access: READ
		operands[5].type: REG = z4
		operands[5].access: READ
			Vector Index: 1
	Registers read: zt0 z4
	Registers modified: z0 z1 z2 z3
	Groups: HasSME2 

18  00 b1 10 c1  fmla	za.h[w9, 0, vgx4], { z8.h - z11.h }, z0.h[0]
	ID: 410 (fmla)
	op_count: 6
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za
		operands[0].sme.slice_reg: w9
		operands[0].sme.slice_offset: 0
		operands[0].sme.is_vertical: false
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x10
		operands[1].type: REG = z8
		operands[1].is_list_member: true
		operands[1].access: READ
			Vector Arrangement Specifier: 0x10
		operands[2].type: REG = z9
		operands[2].is_list_member: true
		operands[2].access: READ
			Vector Arrangement Specifier: 0x10
		operands[3].type: REG = z10
		operands[3].is_list_member: true
		operands[3].access: READ
			Vector Arrangement Specifier: 0x10
		operands[4].type: REG = z11
		operands[4].is_list_member: true
		operands[4].access: READ
			Vector Arrangement Specifier: 0x10
		operands[5].type: REG = z0
		operands[5].access: READ
			Vector Arrangement Specifier: 0x10
			Vector Index: 0
	Write-back: True
	Registers read: za w9 z8 z9 z10 z11 z0
	Registers modified: za
	Groups: HasSME2p1 HasSMEF16F16 

1c  05 d0 9b c1  fmlal	za.s[w10, 2:3, vgx4], { z0.h - z3.h }, z11.h[1]
	ID: 409 (fmlal)
	op_count: 6
		operands[0].type: SME_MATRIX
		operands[0].sme.type: 2
		operands[0].sme.tile: za
		operands[0].sme.slice_reg: w10
		operands[0].sme.slice_offset: 2:3
		operands[0].sme.is_vertical: false
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x20
		operands[1].type: REG = z0
		operands[1].is_list_member: true
		operands[1].access: READ
			Vector Arrangement Specifier: 0x10
		operands[2].type: REG = z1
		operands[2].is_list_member: true
		operands[2].access: READ
			Vector Arrangement Specifier: 0x10
		operands[3].type: REG = z2
		operands[3].is_list_member: true
		operands[3].access: READ
			Vector Arrangement Specifier: 0x10
		operands[4].type: REG = z3
		operands[4].is_list_member: true
		operands[4].access: READ
			Vector Arrangement Specifier: 0x10
		operands[5].type: REG = z11
		operands[5].access: READ
			Vector Arrangement Specifier: 0x10
			Vector Index: 1
	Write-back: True
	Registers read: za w10 z0 z1 z2 z3 z11
	Registers modified: za
	Groups: HasSME2

I'll add all the instructions from here as tests now and start fuzzing afterwards. If you still find something, please let me know.

Thanks for helping out btw. It is difficult to get everything right in detail if you have to handle 3+ architectures with extensions. It's easy to miss things if one doesn't work with them all the time.
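
As a rough illustration of how dumps like the ones above could be turned into quick sanity checks, here is a minimal Python sketch. It is not how Capstone's own test suite works. The helper names `parse_operands` and `check_dump` are hypothetical, and the script only inspects the text printed by `cstool -d`, assuming the `op_count:` and `operands[N].type:` line format shown in this thread. It flags two of the symptoms reported here: an `op_count` that disagrees with the listed operands, and a MEM operand on an instruction that should not have one.

```python
import re

# Matches "operands[3].type: REG = z7" or "operands[3].type: MEM".
# Lines such as "operands[0].sme.type: 2" do not match, because ".type"
# must follow the closing bracket directly.
OPERAND_TYPE = re.compile(r"operands\[(\d+)\]\.type:\s*(\w+)(?:\s*=\s*(\S+))?")
OP_COUNT = re.compile(r"op_count:\s*(\d+)")


def parse_operands(dump):
    """Return (declared_op_count, [(index, type, value_or_None), ...])."""
    declared = None
    operands = []
    for line in dump.splitlines():
        match = OP_COUNT.search(line)
        if match:
            declared = int(match.group(1))
            continue
        match = OPERAND_TYPE.search(line)
        if match:
            operands.append((int(match.group(1)), match.group(2), match.group(3)))
    return declared, operands


def check_dump(dump, allow_mem=False):
    """Collect human-readable problems found in one instruction dump."""
    declared, operands = parse_operands(dump)
    problems = []
    if declared != len(operands):
        problems.append(
            f"op_count is {declared} but {len(operands)} operands are listed"
        )
    if not allow_mem:
        for index, kind, _ in operands:
            if kind == "MEM":
                problems.append(f"operands[{index}] is an unexpected MEM operand")
    return problems


# Abbreviated copy of the fixed luti2 dump above (only the lines the parser reads).
LUTI2_DUMP = """
    op_count: 6
        operands[0].type: REG = z0
        operands[1].type: REG = z1
        operands[2].type: REG = z2
        operands[3].type: REG = z3
        operands[4].type: REG = zt0
        operands[5].type: REG = z4
"""

# Abbreviated copy of the originally reported (buggy) psel dump.
PSEL_DUMP = """
    op_count: 4
        operands[0].type: REG = p7
        operands[1].type: REG = p1
        operands[2].type: REG = p3
        operands[3].type: MEM
            operands[3].mem.base: REG = w13
"""

if __name__ == "__main__":
    for name, dump in (("luti2", LUTI2_DUMP), ("psel", PSEL_DUMP)):
        print(name, check_dump(dump))
```

With these two inputs the fixed `luti2` dump should produce no findings, while the old `psel` dump should be flagged for its extra MEM operand. Instructions that legitimately access memory (such as `st1w`) would need `allow_mem=True`.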

FinnWilkinson · 2024-05-29T10:39:13Z

All the above look great, thanks very much for all your hard work on this! I'm looking forward to it being merged into next.

If I spot anything further I'll be sure to let you know / open new issues.

* Run clang-format
* Remove arm.h header from AArch64 files
* Update all AArch64 module files to LLVM-18.
* Add check if the differs save file is up-to-date with the current files.
* Add new generator for MC test translation.
* Fix warnings
* Update generated AsmWriter files
* Remove unused variable
* Change MCPhysReg type to int16_t as LLVM 18 dictates. With LLVM 18 the MCPhysReg value's type is changed to int16_t. If we update modules to LLVM 18, they will generate compiler warnings that uint16_t* should not be cast to int16_t*. This makes changing all the tables to int16_t necessary, because the alternative is to duplicate all MCPhysReg related code, which is even worse.
* Assign enum values to raw_struct member
* Add printAdrAdrpLabel def
* Add header to regression test files.
* Write files to build dir and ignore more parsing errors.
* Fix parsing of MC test files.
* Reset parser after every block
* Add write and patch header step.
* Add and update MC tests for AArch64
* Fix clang-tidy warnings
* Don't warn about padding issues. They break automatically initialized structs we can not change easily.
* Fix: Incorrect access of LLVM instruction descriptions.
* Initialize DecoderComplete flag
* Add more mapping and flag details
* Add function to get MCInstDesc from table
* Fix incorrect memory operand access types.
* Fix test where memory was not written, but only read.
* Attempt to fix Windows build
* Fix 2268. The enum values were different and hence led to different decoding.
* Refactor SME operands.
  - Splits SME operands in Matrix and Predicate operands.
  - Fixes general problems of incorrect detections with the vector select/index operands of predicate registers.
  - Simplifies code.
* Fix up typo in WRITE
* Print actual path to struct fields
* Add Registers of SME operands to the reg-read list
* Add tests for SME operands.
* Use Capstone reg enum for comparison
* Fix tests: 'Vector arra...' to 'operands[x].vas'
* Add the developer fuzz option.
* Fix Python bindings for SME operands
* Fix variable shadowing.
* Fix clang-tidy warnings
* Add missing break.
* Fix varg usage
* Brackets for case
* Handle AArch64_OP_GROUP_AdrAdrpLabel
* Fix endian issue with fuzzing start bytes
* Move previous sme.pred to its own operand type.
* Fix calculation for imm ranges
* Print list member flag
* Fix up operand strings for cstest
* Do only a shallow clone of the cmocka stable branch
* Fix: Don't categorize ZT0 as a SME matrix operand.
* Remove unused code.
* Add flag to distinguish Vn and Qn registers.
* Add all registers to detail struct, even if emitted in the asm text
* Fix: Increment op count after each list member is added.
* Remove implicit write to NZCV for MSR Imm instructions.
* Handle several alias operands.
* Add details for zero alias with za0.h
* Add SME tile to write list if written
* Add write access flags to operands which are zeroed.
* Add SME tests of #2285
* Fix tests with latest syntax changes.
* Fix segfault if memory operand is only a label without register.
* Fix python bindings
* Attempt to fix clang-tidy warning for some configurations.
* Add missing test file (accidentally blocked by gitignore.)
* Print clang-tidy version before linting.
* Update differ save file
* Formatting
* Use clang-tidy-15 if possible.
* Remove search patterns for MC tests, since they need to be reworked anyways.
* Enum to upper case change
* Add information to read the OSS fuzz result.
* Fix special case of SVE2 operands. Apparently ZT0 registers can get an index attached, which is BOUND to it. We have no "index for reg" field. So it is simply saved as an immediate.
* Handle LLVM expressions without asserts.
* Ensure choices are always saved.
* OP_GROUP enums can't be all upper case because they contain type information.
* Fix compatibility header patching
* Update saved_choices.json
* Allow mode == None in test_corpus

Rot127 mentioned this issue Mar 9, 2024

AArch64 missing details tasks #2196

Open

5 tasks

Rot127 added this to the v6 milestone Mar 19, 2024

Rot127 added bug Something is not working as it should AArch64 Arch labels Mar 19, 2024

Rot127 added this to Capstone V6 Plan Mar 20, 2024

Rot127 moved this to Todo in Capstone V6 Plan Mar 20, 2024

Rot127 mentioned this issue Mar 25, 2024

AArch64 update to LLVM 18 #2298

Merged

6 tasks

Rot127 added a commit to Rot127/capstone that referenced this issue May 29, 2024

Add SME tests of capstone-engine#2285

a89fad0

Rot127 moved this from Todo to In Progress in Capstone V6 Plan May 29, 2024

Rot127 added a commit to Rot127/capstone that referenced this issue Jun 4, 2024

Add SME tests of capstone-engine#2285

018c13b

Rot127 added a commit to Rot127/capstone that referenced this issue Jun 19, 2024

Add SME tests of capstone-engine#2285

422fd45

Rot127 added a commit to Rot127/capstone that referenced this issue Jun 24, 2024

Add SME tests of capstone-engine#2285

0a10168

Rot127 added a commit to Rot127/capstone that referenced this issue Jun 26, 2024

Add SME tests of capstone-engine#2285

41602a1

Rot127 added a commit to Rot127/capstone that referenced this issue Jul 4, 2024

Add SME tests of capstone-engine#2285

5393e0e

kabeor closed this as completed in #2298 Jul 8, 2024

github-project-automation bot moved this from In Progress to Done in Capstone V6 Plan Jul 8, 2024
