i#2626 AArch64 encoder: Add isz operand and vector ADD to encoder. #3016

Merged
merged 5 commits into from
May 24, 2018
20 changes: 20 additions & 0 deletions core/arch/aarch64/codec.c
@@ -1966,6 +1966,26 @@ encode_opnd_s10(uint enc, int opcode, byte *pc, opnd_t opnd, OUT uint *enc_out)
    return encode_opnd_vector_reg(10, 2, opnd, enc_out);
}

/* isz: Vector element width for SIMD instructions. */

static inline bool
decode_opnd_isz(uint enc, int opcode, byte *pc, OUT opnd_t *opnd)
{
    uint bits = enc >> 22 & 3;
    *opnd = opnd_create_immed_int(bits, OPSZ_2b);
    return true;
}

static inline bool
encode_opnd_isz(uint enc, int opcode, byte *pc, opnd_t opnd, OUT uint *enc_out)
{
    ptr_int_t val = opnd_get_immed_int(opnd);
    if (val < 0 || val > 3)
        return false;
    *enc_out = val << 22;
    return true;
}

/* shift3: shift type for ADD/SUB: LSL, LSR or ASR */

static inline bool
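
The isz operand above occupies bits 23:22 of the instruction word. A minimal stand-alone sketch of the same bit manipulation follows. The helper names are hypothetical and it deliberately avoids opnd_t and the rest of the DynamoRIO API, so it illustrates the round trip rather than the merged implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-alone helpers, not DynamoRIO API. They mirror the
 * shifts and range check in decode_opnd_isz/encode_opnd_isz. */

/* Extract the 2-bit size field from bits 23:22 of an instruction word. */
uint32_t
isz_from_encoding(uint32_t enc)
{
    return (enc >> 22) & 3;
}

/* Place a size value into bits 23:22. Values that do not fit in 2 bits
 * are rejected, as in the encoder above. */
bool
isz_to_encoding(int64_t val, uint32_t *enc_out)
{
    if (val < 0 || val > 3)
        return false;
    *enc_out = (uint32_t)val << 22;
    return true;
}
```

Applied to the test encodings added in this change, isz_from_encoding(0x4e2c856a) gives 0 (the .16b form) and isz_from_encoding(0x4ee9852d) gives 3 (the .2d form).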
5 changes: 5 additions & 0 deletions core/arch/aarch64/codec.txt
@@ -131,6 +131,7 @@
---------?x---------x----------- vindex_SD # Index for vector with single or double
# elements, depending on bit 22 (sz)
?--------xx--------------------- imm16sh # shift for MOVK/... (immediate); checks 31
--------xx---------------------- isz # element size of a vector register (8<<isz)
--------xx---------------------- shift3 # shift type for add/sub (shifted register)
--------xx---------------------- shift4 # shift type for logical (shifted register)
??---?--xxxxxxxxxxxxxxxxxxx----- memlit # load literal, gets size from 31:30 and 26
@@ -957,6 +958,10 @@ x101101011000000000101xxxxxxxxxx cls wx0 : wx5
1101101011000000000011xxxxxxxxxx rev x0 : x5

# Data Processing - Scalar Floating-Point and Advanced SIMD

# ADD
Contributor

Is there a rule for how the instructions are ordered in this file? (I think I was following the "Index by Encoding" in our internal web pages at some point...) If it's feasible, it might be good to follow some canonical ordering and mark omissions with a comment. (But perhaps it isn't feasible.)

Contributor Author

The NEON patterns should currently follow alphabetical order (as on the "A64 -- SIMD and Floating-point Instructions (alphabetic order)" index page of the public XML ISA spec). That's how the generator script happens to process them, but IMO it also makes the file easier to read. On second thought, it might be easier to extend the generator script to work from the index page in the future.

0x001110xx1xxxxx100001xxxxxxxxxx add dq0 : dq5 dq16 isz

# FMOV (general) GPR to FP reg
0001111011100111000000xxxxxxxxxx fmov h0 : w5 # Armv8.2
0001111000100111000000xxxxxxxxxx fmov s0 : w5
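
Each pattern line in this file is a 32-character string, most significant bit first, where '0' and '1' are fixed bits and 'x' marks bits left to the operands. The sketch below shows one way to read such a string as a mask/value pair and test an encoding against it. The function names are hypothetical and this is not how the project's generator script is implemented; it only illustrates what the new ADD line accepts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a 32-character pattern of '0', '1' and 'x'
 * into a mask of the fixed bits and the values those bits must have.
 * Returns false if the string is not exactly 32 valid characters. */
bool
pattern_to_mask(const char *pattern, uint32_t *mask, uint32_t *value)
{
    uint32_t m = 0, v = 0;
    size_t i;
    for (i = 0; i < 32; i++) {
        char c = pattern[i];
        if (c != '0' && c != '1' && c != 'x')
            return false; /* Also stops at an early end of string. */
        m = (m << 1) | (uint32_t)(c != 'x');
        v = (v << 1) | (uint32_t)(c == '1');
    }
    if (pattern[32] != '\0')
        return false;
    *mask = m;
    *value = v;
    return true;
}

/* Hypothetical helper: true if enc agrees with every fixed bit of pattern. */
bool
encoding_matches(const char *pattern, uint32_t enc)
{
    uint32_t mask, value;
    if (!pattern_to_mask(pattern, &mask, &value))
        return false;
    return (enc & mask) == value;
}
```

For the ADD pattern 0x001110xx1xxxxx100001xxxxxxxxxx this yields mask 0xbf20fc00 and value 0x0e208400, so the test encoding 4e2c856a matches while the fmov encoding 1e27012a does not.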
24 changes: 24 additions & 0 deletions core/arch/aarch64/instr_create.h
@@ -41,6 +41,30 @@
/* DR_API EXPORT TOFILE dr_ir_macros_aarch64.h */
/* DR_API EXPORT BEGIN */

/**
* Used in an additional immediate source operand to a vector operation, denotes
* 8 bit vector element width. See \ref sec_IR_AArch64.
*/
#define ISZ_BYTE 0

/**
* Used in an additional immediate source operand to a vector operation, denotes
* 16 bit vector element width. See \ref sec_IR_AArch64.
*/
#define ISZ_HALF 1

/**
* Used in an additional immediate source operand to a vector operation, denotes
* 32 bit vector element width. See \ref sec_IR_AArch64.
*/
#define ISZ_SINGLE 2

/**
* Used in an additional immediate source operand to a vector operation, denotes
* 64 bit vector element width. See \ref sec_IR_AArch64.
*/
#define ISZ_DOUBLE 3

/**
* Used in an additional immediate source operand to a vector operation, denotes
* half-precision floating point vector elements. See \ref sec_IR_AArch64.
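
The ISZ_* values relate to the element width through the "8<<isz" note in codec.txt. A small sketch of that arithmetic follows, with the constants copied from the definitions above. The two helper functions are hypothetical names introduced for illustration and are not part of the DynamoRIO API.

```c
/* Values copied from the ISZ_* definitions in this diff. */
#define ISZ_BYTE 0
#define ISZ_HALF 1
#define ISZ_SINGLE 2
#define ISZ_DOUBLE 3

/* Hypothetical helper: element width in bits, following the "8<<isz"
 * note in codec.txt. Returns 0 for an out-of-range value. */
unsigned int
isz_element_bits(unsigned int isz)
{
    return isz <= ISZ_DOUBLE ? 8u << isz : 0;
}

/* Hypothetical helper: number of elements in a vector register of
 * reg_bits bits (64 for a D register, 128 for a Q register). */
unsigned int
isz_lane_count(unsigned int reg_bits, unsigned int isz)
{
    unsigned int width = isz_element_bits(isz);
    return width == 0 ? 0 : reg_bits / width;
}
```

For example, ISZ_BYTE on a 128-bit register gives 16 lanes of 8 bits (the .16b arrangement), and ISZ_DOUBLE on the same register gives 2 lanes of 64 bits (.2d).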
10 changes: 10 additions & 0 deletions suite/tests/api/dis-a64.txt
@@ -1561,6 +1561,16 @@ fd3fffff : str d31, [sp,#32760] : str %d31 -> +0x7ff8(%sp)[8byte]
fd481041 : ldr d1, [x2,#4128] : ldr +0x1020(%x2)[8byte] -> %d1
fd7fffff : ldr d31, [sp,#32760] : ldr +0x7ff8(%sp)[8byte] -> %d31


# ADD (vector)
4e2c856a : add v10.16b, v11.16b, v12.16b : add %q11 %q12 $0x00 -> %q10
Contributor

The script dis-a64.pl contains a format specification that would attempt to make these colons line up. In fact, I think at some point dis-a64.txt could survive reformatting by dis-a64.pl. Perhaps worth thinking about getting that to work again, but not as part of this commit.

Contributor Author

Right, I've aligned the ADD lines. It should be easy to update the generator script to align on a per-opcode basis.

0e2584a5 : add v5.8b, v5.8b, v5.8b       : add %d5 %d5 $0x00 -> %d5
4e7f87c3 : add v3.8h, v30.8h, v31.8h     : add %q30 %q31 $0x01 -> %q3
0e7f87c3 : add v3.4h, v30.4h, v31.4h     : add %d30 %d31 $0x01 -> %d3
4ebd8633 : add v19.4s, v17.4s, v29.4s    : add %q17 %q29 $0x02 -> %q19
0ebd8633 : add v19.2s, v17.2s, v29.2s    : add %d17 %d29 $0x02 -> %d19
4ee9852d : add v13.2d, v9.2d, v9.2d      : add %q9 %q9 $0x03 -> %q13

# FMOV (general) GPR to FP reg
1ee70220 : fmov h0, w17 : fmov %w17 -> %h0
1e27012a : fmov s10, w9 : fmov %w9 -> %s10
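
The ADD (vector) test lines above can be checked by hand by pulling the fields out of the hex word: bit 30 selects a 64-bit or 128-bit register, bits 23:22 are the isz operand, and the register numbers sit in bits 20:16, 9:5 and 4:0. The sketch below formats those fields in the assembler syntax of the middle column. The function name is hypothetical and this is not DynamoRIO's disassembler. It assumes the word already matches the ADD (vector) pattern and makes no attempt to reject combinations that do not appear in the test list.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of DynamoRIO: format an ADD (vector)
 * encoding as "add vD.T, vN.T, vM.T". Assumes enc already matches the
 * ADD (vector) pattern. Returns the value returned by snprintf. */
int
format_add_vector(uint32_t enc, char *buf, size_t len)
{
    static const char suffix[] = { 'b', 'h', 's', 'd' };
    unsigned int q = (enc >> 30) & 1;    /* 0: 64-bit D reg, 1: 128-bit Q reg */
    unsigned int size = (enc >> 22) & 3; /* the isz operand */
    unsigned int rm = (enc >> 16) & 31;  /* second source register */
    unsigned int rn = (enc >> 5) & 31;   /* first source register */
    unsigned int rd = enc & 31;          /* destination register */
    unsigned int lanes = (q ? 128u : 64u) / (8u << size);
    return snprintf(buf, len, "add v%u.%u%c, v%u.%u%c, v%u.%u%c",
                    rd, lanes, suffix[size],
                    rn, lanes, suffix[size],
                    rm, lanes, suffix[size]);
}
```

Called with 0x4e2c856a, this writes "add v10.16b, v11.16b, v12.16b" into the buffer, matching the first test line.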