Skip to content

add vldx neon instructions #1200

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 24, 2021
Merged

add vldx neon instructions #1200

merged 7 commits into from
Aug 24, 2021

Conversation

SparrowLii
Copy link
Member

This PR adds vld1 neon intruscions which read multiple vectors at once.
Fixes #1143

@rust-highfive
Copy link

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

"unadjusted"
} else {
"C"
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would want to always use "unadjusted" when linking with LLVM intrinsics, unless I'm missing something?

Copy link
Member Author

@SparrowLii SparrowLii Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if these links depend on "C" Abi, especially other architectures, like x86. In view of the fact that most of the structs are marked by #[repr(simd)], the difference in Abi should only affect the structs containing multiple vectors IMO. Let's try it.

Copy link
Member Author

@SparrowLii SparrowLii Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using "unadjusted" looks fine, at least for arm/aarch64. It also slightly simplifies the compilation process, so I think it makes sense.
A few assert_instrs of some instructions using simd_extract need to be correct, I think this may be due to some updates to the nightly compiler

@@ -2867,7 +3117,7 @@ pub unsafe fn vpmax_f32(a: float32x2_t, b: float32x2_t) -> float32x2_t {
#[target_feature(enable = "neon")]
#[cfg_attr(target_arch = "arm", target_feature(enable = "v7"))]
#[rustc_legacy_const_generics(1)]
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("vmov.32", IMM5 = 1))]
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("vmov", IMM5 = 1))]
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(mov, IMM5 = 1))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for these we should just use assert_instr(nop, ...). nop is handled specially and accepts any instruction.

Copy link
Member Author

@SparrowLii SparrowLii Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense. We should not care about the specific implementation of simd_extract here. So I replaced with "nop" for vget instructions.

Comment on lines 3120 to 3121
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("nop", IMM5 = 1))]
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(nop, IMM5 = 1))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("nop", IMM5 = 1))]
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(nop, IMM5 = 1))]
#[cfg_attr(test, assert_instr(nop, IMM5 = 1))]

Copy link
Member Author

@SparrowLii SparrowLii Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have made the corresponding changes.
I think I can submit a new PR later to correct all assert_instr in arm/aarch64 mod.

@Amanieu Amanieu merged commit 6ddda5c into rust-lang:master Aug 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't get the correct return type of llvm link
3 participants