-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#3700: Add IR for AArch64's SIMD LD1/ST1 instructions #3710
Conversation
These are the multiple single element structure load/stores from/to one vector register variants: ST1 { <Vt>.<T> }, [<Xn|SP>] LD1 { <Vt>.<T> }, [<Xn|SP>]
run arm tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some style nits. I did not look in the manual to check the encoding bits. Are there tests, either automated or manual pre-commit, to ensure this is the right encoding by comparing to some other decoder?
core/arch/aarch64/instr_create.h
Outdated
|
||
/** | ||
* Creates an Advanced SIMD (NEON) LD1 instruction to load multiple | ||
* single element structures to one vector register, e.g. LD1 {V0.4H},[X0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing trailing .
core/arch/aarch64/instr_create.h
Outdated
|
||
/** | ||
* Creates an Advanced SIMD (NEON) ST1 instruction to store multiple | ||
* single element structures from one vector register, e.g. ST1 {V1.2S},[X1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing trailing .
core/arch/aarch64/instr_create.h
Outdated
/* TODO: Remaining advanced SIMD (NEON) memory instructions: | ||
#define INSTR_CREATE_ld2/3/4_multi_2/3/4() | ||
#define INSTR_CREATE_ld1/2/3/4_single() | ||
and st1 equivalents including post-index variants */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Add the issue number, so TODO i#2626:
.
Also we prefer */
on its own line.
Great, thanks Assad! Did you use the scripts to auto-generate them? |
INSTR_CREATE_st1_multi_<n>() where <n> is 1, 2, 3 or 4 | ||
|
||
ST1 { <Vt>.<T> }, [<Xn|SP>] | ||
ST1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, only the 1 operand version is tested, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Hopefully more will follow as time allows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...good spot. Perhaps I should remove those comments as they could be misleading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not. The header comments are useful placeholders and each test call is explicitly commented to make it clear it's the 1 operand version being run.
INSTR_CREATE_ld1_multi_<n>() where <n> is 1, 2, 3 or 4 | ||
|
||
LD1 { <Vt>.<T> }, [<Xn|SP>] | ||
LD1 { <Vt>.<T>, <Vt2>.<T> }, [<Xn|SP>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, only the 1 operand version is tested, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Hopefully more will follow as time allows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...good spot. Perhaps I should remove those comments as they could be misleading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not. The header comments are useful placeholders and each test call is explicitly commented to make it clear it's the 1 operand version being run.
@@ -756,7 +756,7 @@ x0110111xxxxxxxxxxxxxxxxxxxxxxxx tbnz tbz | |||
0x001100000000000010xxxxxxxxxxxx st1 memvm : vmsz vt0 vt1 vt2 vt3 | |||
0x001100000000000100xxxxxxxxxxxx st3 memvm : vmsz vt0 vt1 vt2 | |||
0x001100000000000110xxxxxxxxxxxx st1 memvm : vmsz vt0 vt1 vt2 | |||
0x001100000000000111xxxxxxxxxxxx st1 memvm : vmsz vt0 | |||
0x001100000000000111xxxxxxxxxxxx st1 memvm : vt0 vmsz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we flip the position of the vmsz operand here to be more in line with the XXX_sz operands, which come last?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! Is there a reason you did not change it for ld1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! Is there a reason you did not change it for ld1?
ld1 doesn't need to change as its fields are already correctly positioned w.r.t. encoding string:
0x001100010000000111xxxxxxxxxxxx ld1 vt0 : memvm vmsz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right!
In this case I've tested the instruction macros in a client which copies vectors between buffers. It's an application specific client. At some point I'll port it to a client targeted at testing which is safe to be upstreamed as part of the test suite. It can be the basis of further AArch64 SIMD work. |
No, unfortunately. There's still some issues and work required and resourcing is . . . variable ;-) |
@@ -74,7 +74,7 @@ | |||
0b9f13ff : add wzr, wzr, wzr, asr #4 : add %wzr %wzr asr $0x04 -> %wzr | |||
0c0007ff : st4 {v31.4h, v0.4h, v1.4h, v2.4h}, [sp]: st4 $0x01 %d31 %d0 %d1 %d2 -> (%sp)[32byte] | |||
0c0067ff : st1 {v31.4h, v0.4h, v1.4h}, [sp]: st1 $0x01 %d31 %d0 %d1 -> (%sp)[24byte] | |||
0c0077ff : st1 {v31.4h}, [sp] : st1 $0x01 %d31 -> (%sp)[8byte] | |||
0c0077ff : st1 {v31.4h}, [sp] : st1 %d31 $0x01 -> (%sp)[8byte] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add tests for all sizes? same for ld1.
@@ -335,6 +335,12 @@ reg_is_gpr(reg_id_t reg) | |||
return (DR_REG_X0 <= reg && reg <= DR_REG_WSP); | |||
} | |||
|
|||
bool | |||
reg_is_simd(reg_id_t reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added reg_is_simd()
in order for AArch64 to conform to the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Would it be possible to add a test for it to https://github.com/DynamoRIO/dynamorio/blob/master/suite/tests/api/opnd-a64.c ?
@@ -756,7 +756,7 @@ x0110111xxxxxxxxxxxxxxxxxxxxxxxx tbnz tbz | |||
0x001100000000000010xxxxxxxxxxxx st1 memvm : vmsz vt0 vt1 vt2 vt3 | |||
0x001100000000000100xxxxxxxxxxxx st3 memvm : vmsz vt0 vt1 vt2 | |||
0x001100000000000110xxxxxxxxxxxx st1 memvm : vmsz vt0 vt1 vt2 | |||
0x001100000000000111xxxxxxxxxxxx st1 memvm : vmsz vt0 | |||
0x001100000000000111xxxxxxxxxxxx st1 memvm : vt0 vmsz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! Is there a reason you did not change it for ld1?
@@ -756,7 +756,7 @@ x0110111xxxxxxxxxxxxxxxxxxxxxxxx tbnz tbz | |||
0x001100000000000010xxxxxxxxxxxx st1 memvm : vmsz vt0 vt1 vt2 vt3 | |||
0x001100000000000100xxxxxxxxxxxx st3 memvm : vmsz vt0 vt1 vt2 | |||
0x001100000000000110xxxxxxxxxxxx st1 memvm : vmsz vt0 vt1 vt2 | |||
0x001100000000000111xxxxxxxxxxxx st1 memvm : vmsz vt0 | |||
0x001100000000000111xxxxxxxxxxxx st1 memvm : vt0 vmsz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right!
@@ -335,6 +335,12 @@ reg_is_gpr(reg_id_t reg) | |||
return (DR_REG_X0 <= reg && reg <= DR_REG_WSP); | |||
} | |||
|
|||
bool | |||
reg_is_simd(reg_id_t reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Would it be possible to add a test for it to https://github.com/DynamoRIO/dynamorio/blob/master/suite/tests/api/opnd-a64.c ?
bool | ||
reg_is_simd(reg_id_t reg) | ||
{ | ||
return (DR_REG_Q0 <= reg && reg <= DR_REG_B31); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit surprised that S
, H
and B
registers are included here. I thought the only vector registers used by instructions are the 128 bit Q
, as in v8.2d, v8.4s, v8.8h
and 64 bit D
as in v8.2s
, v8.4h
. Am I missing something and are there encodings that use S
, H
or B
registers as vector register?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My (simple) understanding is that all reg_is_simd()
is saying is whether a given register is SIMD or not, and B
/H
/S
/D
are components of a SIMD Q
register. That's independent of any encoding issues. Have I missed something?
Remember the 'Fixes #NNN' or 'Issue: #NNN' for future commits: this one had neither and so has no link to the issue. Can #3700 be closed now? |
Gah! My bad.
Not yet. There are still variants of the SIMD ld/st which need to be implemented and the test coverage increased in general. |
If you've run the devsetup script, it sets a commit message template which makes it easier to remember by pre-populating each commit message with the suggested form. |
Minor conflict in core/arch/aarch64/codec.txt due to PR: #3710
This is an internal implementation of a feature we would like to enable in upstream DynamoRIO. IMO proposing such a change upstream is more efficient if a working example is available. Once this is merged an IP clean proposal patch can be created which represents the interface we would like. This patch required an upstream change: #3710 Change-Id: Ifbb14e21083dc68d50609401032ddc2d40c9f7fc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/193795 Reviewed-by: Al Grant <al.grant@arm.com> Tested-by: mdc-bbot
These are the multiple single element structure load/stores
from/to one vector register variants: