riscv: P extension intrinsics for packed SIMD (part 1) #1332
Some notes about the
Put them in a
Are these values produced in 2 registers? Define both registers as 32-bit outputs and then reconstruct the 64-bit value in Rust code.
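A minimal sketch of the reconstruction step suggested above, assuming the two halves have already been written to two separate 32-bit `out(reg)` operands. The helper name `join_halves` and the low/high ordering are hypothetical, chosen for illustration only; the actual register-pair layout must be taken from the P extension spec.

```rust
/// Hypothetical helper: rebuilds a 64-bit value from two 32-bit halves,
/// as they would be returned by two separate 32-bit asm outputs.
/// Assumption: `lo` holds bits 0..=31 and `hi` holds bits 32..=63.
pub fn join_halves(lo: u32, hi: u32) -> u64 {
    // Widen both halves first so the shift cannot overflow a 32-bit type.
    ((hi as u64) << 32) | (lo as u64)
}

/// Hypothetical inverse, useful when a 64-bit input has to be passed
/// to the instruction as two 32-bit `in(reg)` operands.
pub fn split_halves(value: u64) -> (u32, u32) {
    // The `as u32` casts truncate to the low 32 bits.
    (value as u32, (value >> 32) as u32)
}
```

Keeping the join in plain Rust means the asm block only needs ordinary `reg` operands, and the compiler is free to optimize the shift and OR away when the halves already sit in the right registers.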
We should probably be careful about this until llvm/llvm-project#57550 is fixed. I think it should be fine to use, though, so long as it's not used twice in the same asm block.
I had a look at the spec and it seems some RV32 instructions work with register pairs. This is not currently supported by
Thanks! I'll apply your suggestions about
46a9fcd to 7e892cc
Hello! I applied the following suggestions:
I squashed everything into one commit and force-pushed it for this pull request. r? @Amanieu
a7cc2c2 to e6bc261
pub fn add16(a: usize, b: usize) -> usize {
    let value: usize;
    unsafe {
        asm!(".insn r 0x77, 0x0, 0x20, {}, {}, {}", out(reg) value, in(reg) a, in(reg) b, options(pure, nomem, nostack))
-        asm!(".insn r 0x77, 0x0, 0x20, {}, {}, {}", out(reg) value, in(reg) a, in(reg) b, options(pure, nomem, nostack))
+        asm!(".insn r 0x77, 0x0, 0x20, {}, {}, {}", lateout(reg) value, in(reg) a, in(reg) b, options(pure, nomem, nostack))
This needs to be `lateout` so the register allocator can reuse the register used for `a` or `b` for the output.
Thanks. Should all the instructions here use 'lateout' as well? :)
Yes, unless there is a restriction in the ISA that prevents reusing an input register for the output. I don't think that is the case here (I know a few ARM instructions have this restriction).
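For reference, a sketch of how `add16` might look with the `lateout` suggestion from this thread applied. The `.insn` encoding is copied from the diff above. The non-RISC-V fallback is an assumption added only so the sketch can be compiled and checked elsewhere: it models `add16` as a lane-wise wrapping add over 16-bit lanes, which should be verified against the P extension spec. On RISC-V hardware without the P extension, the `.insn` path would trap as an illegal instruction.

```rust
/// Packed 16-bit add, using the encoding quoted in this review.
#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
pub fn add16(a: usize, b: usize) -> usize {
    let value: usize;
    unsafe {
        core::arch::asm!(
            ".insn r 0x77, 0x0, 0x20, {}, {}, {}",
            // `lateout`: the output is written only after all inputs are
            // read, so it may share a register with `a` or `b`.
            lateout(reg) value,
            in(reg) a,
            in(reg) b,
            options(pure, nomem, nostack)
        );
    }
    value
}

/// Assumed software model for other targets (not part of the PR):
/// adds each 16-bit lane independently, wrapping on overflow.
#[cfg(not(any(target_arch = "riscv32", target_arch = "riscv64")))]
pub fn add16(a: usize, b: usize) -> usize {
    let lanes = core::mem::size_of::<usize>() / 2;
    let mut out = 0usize;
    for i in 0..lanes {
        let shift = i * 16;
        // `as u16` truncates, which selects the lane at `shift`.
        let x = (a >> shift) as u16;
        let y = (b >> shift) as u16;
        out |= (x.wrapping_add(y) as usize) << shift;
    }
    out
}
```

With plain `out(reg)`, the allocator has to assume the output may be written before the inputs are consumed, so it reserves a third register; `lateout` removes that constraint for three-operand instructions like this one.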
d923277 to 5fea831
Implemented with inline assembly for now, using `pure, nomem, nostack` for all packed SIMD arithmetic instructions. Uses `inlateout` when an instruction requires the same register for input and output, and `lateout` for all output registers. This commit also rearranges the shared RISC-V architecture module to improve the documentation. It also includes a doc test fix, gates sm3/4, and uses explicit sm3/4 instructions under the rustc target feature.
65559c6 to f3ab77a
This pull request includes intrinsics for the RISC-V Packed SIMD extension. It includes:
The following instructions require work in further pull requests:
I implemented all intrinsic functions from the P extension specification. This extension is useful because some RISC-V chips have already taped out with the P extension. All intrinsics are implemented as inline assembly, since LLVM does not have these intrinsic functions yet; we can switch to LLVM intrinsics if they appear in the future. This is the first pull request; further work will be introduced in other pull requests.
Questions remaining before further pull requests:
`riscv_shared` and `riscv64` by now.
Please review and give suggestions if modifications are needed, thanks!
r? @Amanieu