-
Notifications
You must be signed in to change notification settings - Fork 287
add vldx neon instructions #1200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
r? @Amanieu (rust-highfive has picked a reviewer for you, use r? to override) |
crates/stdarch-gen/src/main.rs
Outdated
"unadjusted" | ||
} else { | ||
"C" | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we would want to always use "unadjusted"
when linking with LLVM intrinsics, unless I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if these links depend on "C" Abi, especially other architectures, like x86. In view of the fact that most of the structs are marked by #[repr(simd)]
, the difference in Abi should only affect the structs containing multiple vectors IMO. Let's try it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using "unadjusted"
looks fine, at least for arm/aarch64. It also slightly simplifies the compilation process, so I think it makes sense.
A few assert_instr
s of some instructions using simd_extract
need to be correct, I think this may be due to some updates to the nightly compiler
@@ -2867,7 +3117,7 @@ pub unsafe fn vpmax_f32(a: float32x2_t, b: float32x2_t) -> float32x2_t { | |||
#[target_feature(enable = "neon")] | |||
#[cfg_attr(target_arch = "arm", target_feature(enable = "v7"))] | |||
#[rustc_legacy_const_generics(1)] | |||
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("vmov.32", IMM5 = 1))] | |||
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("vmov", IMM5 = 1))] | |||
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(mov, IMM5 = 1))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for these we should just use assert_instr(nop, ...)
. nop
is handled specially and accepts any instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. We should not care about the specific implementation of simd_extract
here. So I replaced with "nop"
for vget
instructions.
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("nop", IMM5 = 1))] | ||
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(nop, IMM5 = 1))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("nop", IMM5 = 1))] | |
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(nop, IMM5 = 1))] | |
#[cfg_attr(test, assert_instr(nop, IMM5 = 1))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have made the corresponding changes.
I think I can submit a new PR later to correct all assert_instr
in arm/aarch64 mod.
This PR adds vld1 neon intruscions which read multiple vectors at once.
Fixes #1143