LLVM ERROR when test shift and insert methods in aarch64 #1017

Closed
SparrowLii opened this issue Feb 26, 2021 · 7 comments · Fixed by #1027

Comments

@SparrowLii
Member

SparrowLii commented Feb 26, 2021

When I run the following test from crates/core_arch/src/arm/neon/shift_and_insert_tests.rs#L33 on my aarch64 machine:

macro_rules! test_vsli {
    ($test_id:ident, $t:ty => $fn_id:ident ([$($a:expr),*], [$($b:expr),*], $n:expr)) => {
        #[simd_test(enable = "neon")]
        #[allow(unused_assignments)]
        unsafe fn $test_id() {
            let a = [$($a as $t),*];
            let b = [$($b as $t),*];
            let n_bit_mask: $t = (1 << $n) - 1;
            let e = [$(($a as $t & n_bit_mask) | ($b as $t << $n)),*];
            let r = $fn_id(transmute(a), transmute(b), $n);
            let mut d = e;
            d = transmute(r);
            assert_eq!(d, e);
        }
    }
}
test_vsli!(test_vsli_n_s8, i8 => vsli_n_s8([3, -44, 127, -56, 0, 24, -97, 10], [-128, -14, 125, -77, 27, 8, -1, 110], 5));

It fails with the following error:

LLVM ERROR: Cannot select: 0xffff800fb768: v8i8 = AArch64ISD::VSLI 0xffff800fb698, 0xffff80100118, 0xffff800fb5c8, crates/core_arch/src/aarch64/neon/mod.rs:2324:5
0xffff800fb698: v8i8,ch = load<(dereferenceable load 8 from %ir.1)> 0xffff80008188, 0xffff80100458, undef:i64, crates/core_arch/src/aarch64/neon/mod.rs:2324:16
0xffff80100458: i64,ch = CopyFromReg 0xffff80008188, Register:i64 %3, crates/core_arch/src/aarch64/neon/mod.rs:2324:16
0xffff800fff78: i64 = Register %3
0xffff801001e8: i64 = undef
0xffff80100118: v8i8,ch = load<(dereferenceable load 8 from %ir.2)> 0xffff80008188, 0xffff80100798, undef:i64, crates/core_arch/src/aarch64/neon/mod.rs:2324:19
0xffff80100798: i64,ch = CopyFromReg 0xffff80008188, Register:i64 %4, crates/core_arch/src/aarch64/neon/mod.rs:2324:19
0xffff80100180: i64 = Register %4
0xffff801001e8: i64 = undef
0xffff800fb5c8: i32,ch = load<(dereferenceable load 4 from %ir.10)> 0xffff80008188, FrameIndex:i64<5>, undef:i64, crates/core_arch/src/aarch64/neon/mod.rs:2324:22
0xffff800fb838: i64 = FrameIndex<5>
0xffff801001e8: i64 = undef
In function: _ZN9core_arch9core_arch7aarch644neon9vsli_n_s817ha2edb5572e851af2E
error: could not compile `core_arch`

The other methods in the test file fail with the same error. I am not sure whether this is caused by LLVM or by something that went wrong elsewhere.
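
For reference, the expected array `e` in the macro above is the per-lane result of a shift-left-and-insert: the low `n` bits come from `a` and the remaining bits come from `b` shifted left by `n`. A minimal scalar sketch of one 8-bit lane, with no NEON involved (the name `sli_lane_i8` is made up for illustration and is not part of stdarch):

```rust
/// Scalar model of one 8-bit lane of shift-left-and-insert.
/// `sli_lane_i8` is a hypothetical helper, not a stdarch function.
fn sli_lane_i8(a: i8, b: i8, n: u32) -> i8 {
    assert!(n <= 7, "must have 0 <= n <= 7, but n = {}", n);
    // Work on the raw bit patterns so the shifts are purely logical.
    let (a, b) = (a as u8, b as u8);
    // Low n bits set, e.g. n = 5 gives 0b0001_1111.
    let mask = ((1u16 << n) - 1) as u8;
    // Keep the low n bits of `a`; fill the remaining bits with `b << n`.
    ((a & mask) | (b << n)) as i8
}
```

With the first two lanes of the test input and `n = 5`, `sli_lane_i8(3, -128, 5)` gives `3` and `sli_lane_i8(-44, -14, 5)` gives `84`, which matches the formula used to build `e`.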

@Amanieu
Member

Amanieu commented Feb 26, 2021

It is most likely a bug in stdarch. Compare the LLVM IR generated by rustc with the IR generated by Clang for the equivalent C code.

@Amanieu
Member

Amanieu commented Feb 26, 2021

Oh, actually I know what the problem is. The aarch64 intrinsics in stdarch currently only work in release builds because they are missing a constify_*! macro to handle the constant arguments.

This is in the process of being fixed as we are transitioning to using const generics in stdarch.

@SparrowLii
Member Author

Thanks for the explanation :) Is there a link to the related work in progress? I want to see if I can help.

@Amanieu
Member

Amanieu commented Feb 26, 2021

The issue is #248; it has been open since 2017. At the moment we use constify_*! macros in stdarch to work around this, but this has only been done for x86, not for other architectures.

Initial work on solving this issue properly has started in rust-lang/rust#82447.
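
As a rough illustration of the constify_*! workaround mentioned above: the macro expands to a `match` with one arm per possible immediate, so each arm passes a literal to the intrinsic even though `n` is only known at run time. The sketch below is self-contained and uses made-up names (`sli_imm`, `constify_imm3_sketch`, `sli_runtime`); `sli_imm` is a scalar stand-in for an intrinsic that needs a constant shift amount, not the real LLVM binding.

```rust
/// Stand-in for an intrinsic whose shift amount must be a compile-time
/// constant. Hypothetical: the real `vsli_n_s8_` is an LLVM intrinsic binding.
fn sli_imm<const N: u32>(a: u8, b: u8) -> u8 {
    let mask = ((1u16 << N) - 1) as u8;
    (a & mask) | (b << N)
}

/// Sketch of a constify-style macro for a 3-bit immediate: one match arm per
/// possible value, so every `$expand!` invocation receives a literal.
macro_rules! constify_imm3_sketch {
    ($imm:expr, $expand:ident) => {
        match ($imm) & 0b111 {
            0 => $expand!(0),
            1 => $expand!(1),
            2 => $expand!(2),
            3 => $expand!(3),
            4 => $expand!(4),
            5 => $expand!(5),
            6 => $expand!(6),
            _ => $expand!(7),
        }
    };
}

/// Takes a runtime `n` and dispatches to the matching constant instantiation.
fn sli_runtime(a: u8, b: u8, n: u32) -> u8 {
    assert!(n <= 7, "must have 0 <= n <= 7, but n = {}", n);
    // Local macro that captures `a` and `b`, in the style used in stdarch.
    macro_rules! call {
        ($imm3:expr) => {
            sli_imm::<{ $imm3 }>(a, b)
        };
    }
    constify_imm3_sketch!(n, call)
}
```

Note that all eight arms are compiled regardless of which value `n` takes at run time, so every literal in the table has to be a valid immediate for the intrinsic being called.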

@SparrowLii
Member Author

Thanks for the link. I can't guarantee that I will be able to fix it, but I will try my best :)

@SparrowLii
Member Author

SparrowLii commented Feb 27, 2021

[Edit] I just found out that vsri cannot accept n == 0, so I think I need to write a corresponding constify_* macro for it.

I just ran into some very weird problems.
For the vsli_n_s8 method in aarch64/neon/mod.rs, I rewrote it as follows:

pub unsafe fn vsli_n_s8(a: int8x8_t, b: int8x8_t, n: i32) -> int8x8_t {
    assert!(0 <= n && n <= 7, "must have 0 ≤ n ≤ 7, but n = {}", n);
    macro_rules! call {
        ($imm3:expr) => {
            vsli_n_s8_(a, b, $imm3)
        };
    }
    let r = constify_imm3!(n, call);
    r
}

It compiles and passes the tests, as do the other vsli methods.
But for the vsri_n_s8 method, no matter whether I write it this way

pub unsafe fn vsri_n_s8(a: int8x8_t, b: int8x8_t, n: i32) -> int8x8_t {
    assert!(1 <= n && n <= 8, "must have 1 ≤ n ≤ 8, but n = {}", n);
    macro_rules! call {
        ($imm3:expr) => {
            vsri_n_s8_(a, b, $imm3)
        };
    }
    if n == 8 {
        return vsri_n_s8_(a, b, 8)
    }
    let r = constify_imm3!(n, call);
    r
}

or this way

pub unsafe fn vsri_n_s8(a: int8x8_t, b: int8x8_t, n: i32) -> int8x8_t {
    assert!(1 <= n && n <= 8, "must have 1 ≤ n ≤ 8, but n = {}", n);
    match (n) & 0b111 {
        0 => vsri_n_s8_(a, b, 0),
        1 => vsri_n_s8_(a, b, 1),
        2 => vsri_n_s8_(a, b, 2),
        3 => vsri_n_s8_(a, b, 3),
        4 => vsri_n_s8_(a, b, 4),
        5 => vsri_n_s8_(a, b, 5),
        6 => vsri_n_s8_(a, b, 6),
        7 => vsri_n_s8_(a, b, 7),
        _ => vsri_n_s8_(a, b, 8),
    }
}

It still produces an LLVM ERROR:

LLVM ERROR: Cannot select: 0xffff7c10d2b0: v8i8 = AArch64ISD::VSRI 0xffff7c10c5b0, 0xffff7c111648, Constant:i32<0>, crates/core_arch/src/aarch64/neon/mod.rs:2637:18
0xffff7c10c5b0: v8i8,ch = load<(dereferenceable load 8 from %ir.1)> 0xffff7c025728, 0xffff7c10c548, undef:i64, crates/core_arch/src/aarch64/neon/mod.rs:2637:29
0xffff7c10c548: i64,ch = CopyFromReg 0xffff7c025728, Register:i64 %3, crates/core_arch/src/aarch64/neon/mod.rs:2637:29
0xffff7c10c820: i64 = Register %3
0xffff7c111238: i64 = undef
0xffff7c111648: v8i8,ch = load<(dereferenceable load 8 from %ir.2)> 0xffff7c025728, 0xffff7c10ce38, undef:i64, crates/core_arch/src/aarch64/neon/mod.rs:2637:32
0xffff7c10ce38: i64,ch = CopyFromReg 0xffff7c025728, Register:i64 %4, crates/core_arch/src/aarch64/neon/mod.rs:2637:32
0xffff7c111098: i64 = Register %4
0xffff7c111238: i64 = undef
0xffff7c10d110: i32 = Constant<0>
In function: _ZN9core_arch9core_arch7aarch644neon9vsri_n_s817h6e253b4f8a7dc1d7E
error: could not compile `core_arch`

But if I change the first arm of the match statement to call vsli_n_s8_ instead:

pub unsafe fn vsri_n_s8(a: int8x8_t, b: int8x8_t, n: i32) -> int8x8_t {
    assert!(1 <= n && n <= 8, "must have 1 ≤ n ≤ 8, but n = {}", n);
    match (n) & 0b111 {
        0 => vsli_n_s8_(a, b, 0),
        1 => vsri_n_s8_(a, b, 1),
        2 => vsri_n_s8_(a, b, 2),
        3 => vsri_n_s8_(a, b, 3),
        4 => vsri_n_s8_(a, b, 4),
        5 => vsri_n_s8_(a, b, 5),
        6 => vsri_n_s8_(a, b, 6),
        7 => vsri_n_s8_(a, b, 7),
        _ => vsri_n_s8_(a, b, 8),
    }
}

It compiles successfully and passes the test (the test value of n is 5).
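
One possible reading of the behaviour above, not confirmed in this thread: `match (n) & 0b111` masks `n` before matching, so `n = 8` becomes `0` and lands in the `0` arm, while the `_` arm can never be reached. The compiler still has to generate code for every arm, so the `vsri_n_s8_(a, b, 0)` call is emitted even though the test uses `n = 5`, and the error output does show a `Constant:i32<0>` operand. The sketch below matches on `n` directly so that only 1 through 8 are instantiated. `sri_imm` and `sri_runtime` are hypothetical names, and the scalar shift-right-and-insert model is an assumption about the instruction's behaviour, not code taken from stdarch.

```rust
/// Scalar stand-in for a shift-right-and-insert intrinsic on one 8-bit lane.
/// Hypothetical model: the top N bits of `a` are kept, the rest is `b >> N`.
fn sri_imm<const N: u32>(a: u8, b: u8) -> u8 {
    // Widen to u16 so that N = 8 is a valid shift amount.
    let inserted = ((b as u16) >> N) as u8;
    let keep = !((0xFFu16 >> N) as u8);
    (a & keep) | inserted
}

/// Shows what the mask in `match (n) & 0b111` does to each input.
fn masked_index(n: u32) -> u32 {
    n & 0b111
}

/// Dispatch on `n` itself, so no arm ever passes 0 to the stand-in.
fn sri_runtime(a: u8, b: u8, n: u32) -> u8 {
    assert!((1..=8).contains(&n), "must have 1 <= n <= 8, but n = {}", n);
    match n {
        1 => sri_imm::<1>(a, b),
        2 => sri_imm::<2>(a, b),
        3 => sri_imm::<3>(a, b),
        4 => sri_imm::<4>(a, b),
        5 => sri_imm::<5>(a, b),
        6 => sri_imm::<6>(a, b),
        7 => sri_imm::<7>(a, b),
        // The assert above guarantees that this arm is n == 8.
        _ => sri_imm::<8>(a, b),
    }
}
```

Here `masked_index(8)` is `0`, which is why the masked version ends up needing a `0` arm at all.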

@Amanieu
Member

Amanieu commented Feb 27, 2021

Now that #1018 and #1019 have landed you can try refactoring these functions to use const generics instead.
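
A rough sketch of what the const-generic shape could look like, using a scalar stand-in rather than the real intrinsic. The names `sli_const` and `AssertImm3` are made up for illustration; the helpers that stdarch actually uses for range checks may differ.

```rust
/// Hypothetical compile-time range check for a 3-bit immediate.
struct AssertImm3<const N: i32>;

impl<const N: i32> AssertImm3<N> {
    /// Evaluating this constant fails the build when N is out of range.
    const OK: () = assert!(N >= 0 && N <= 7, "N must be in 0..=7");
}

/// Scalar stand-in for a shift-left-and-insert with a const-generic shift.
/// The shift amount is part of the type, so no runtime dispatch is needed.
fn sli_const<const N: i32>(a: u8, b: u8) -> u8 {
    // Force evaluation of the range check for this particular N.
    let () = AssertImm3::<N>::OK;
    let mask = ((1u16 << N) - 1) as u8;
    (a & mask) | (b << N)
}
```

A caller would then write `sli_const::<5>(a, b)` instead of passing `5` as a third argument, and an out-of-range value such as `sli_const::<9>(a, b)` would be rejected at compile time rather than by a runtime `assert!`.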
