Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cranelift: SIMD icmp fails to compile on aarch64 backend #3335

Closed
afonso360 opened this issue Sep 11, 2021 · 6 comments · Fixed by #5530
Closed

Cranelift: SIMD icmp fails to compile on aarch64 backend #3335

afonso360 opened this issue Sep 11, 2021 · 6 comments · Fixed by #5530
Labels
bug Incorrect behavior in the current implementation that needs fixing cranelift:area:aarch64 Issues related to AArch64 backend. cranelift Issues related to the Cranelift code generator

Comments

@afonso360
Copy link
Contributor

Hey,

It looks like SIMD icmp is failing to compile on aarch64. We have a similar situation on #3334 for x64, but for different reasons. The test below is only for icmp eq, but this fails for any condition code.

.clif Test Case

test run
target aarch64

function %simd_icmp_eq_i8(i8x16, i8x16) -> b8x16 {
block0(v0: i8x16, v1: i8x16):
    v2 = icmp eq v0, v1
    return v2
}
; run: %simd_icmp_eq_i8([1 0 -1 1 1 1 1 1 1 1 1 1 1 1 1 1], [1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0]) == [true true true false false false false false false false false false false false false false]

function %simd_icmp_eq_i16(i16x8, i16x8) -> b16x8 {
block0(v0: i16x8, v1: i16x8):
    v2 = icmp eq v0, v1
    return v2
}
; run: %simd_icmp_eq_i16([1 0 -1 1 1 1 1 1], [1 0 -1 0 0 0 0 0]) == [true true true false false false false false]

function %simd_icmp_eq_i32(i32x4, i32x4) -> b32x4 {
block0(v0: i32x4, v1: i32x4):
    v2 = icmp eq v0, v1
    return v2
}
; run: %simd_icmp_eq_i32([1 0 -1 1], [1 0 -1 0]) == [true true true false]

function %simd_icmp_eq_i64(i64x2, i64x2) -> b64x2 {
block0(v0: i64x2, v1: i64x2):
    v2 = icmp eq v0, v1
    return v2
}
; run: %simd_icmp_eq_i64([10 0], [1 0]) == [false true]
; run: %simd_icmp_eq_i64([-1 1], [-1 0]) == [true false]

Steps to Reproduce

clif-util test ./the-above.clif

Expected Results

The test to pass

Actual Results

Running `qemu-aarch64 -L /usr/aarch64-linux-gnu -E LD_LIBRARY_PATH=/usr/aarch64-linux-gnu/lib /mnt/c/Users/Afonso/CLionProjects/wasmtime/target/aarch64-unknown-linux-gnu/debug/clif-util test ./filetests/filetests/runtes
ts/simd-icmp-eq.clif`
thread 'worker #0' panicked at 'assertion failed: `(left == right)`
  left: `V128`,
 right: `I64`', cranelift/codegen/src/isa/aarch64/inst/emit.rs:100:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
FAIL ./filetests/filetests/runtests/simd-icmp-eq.clif: panicked in worker #0: assertion failed: `(left == right)`
  left: `V128`,
 right: `I64`
1 tests
Error: 1 failure

Versions and Environment

Cranelift version or commit: main
Operating system: Linux
Architecture: aarch64 QEMU

Extra Info

A icmp SIMD test suite was added in #3332

@afonso360 afonso360 added bug Incorrect behavior in the current implementation that needs fixing cranelift Issues related to the Cranelift code generator labels Sep 11, 2021
@akirilov-arm akirilov-arm added the cranelift:area:aarch64 Issues related to AArch64 backend. label Sep 13, 2021
@akirilov-arm
Copy link
Contributor

This issue looks weird because clif-util compile --target arm64 with the same test case succeeds and the generated code looks fine. Is the problem in some layer above the backend that assumes that the Icmp result is a scalar integer?

@bjorn3
Copy link
Contributor

bjorn3 commented Sep 13, 2021

Could this issue be during compilation the generated run wrapper instead of the test function itself?

@afonso360
Copy link
Contributor Author

afonso360 commented Sep 13, 2021

Yes, it looks like this is an issue with the generated trampoline. It looks like in the run test machinery we compile the source function ok, but then fail when compiling the trampoline.

We're generating the following trampoline, which fails to compile (this is for x86_64, but i suspect the issue is the same for aarch64):

function u0:0(i64, i64) windows_fastcall {
    sig0 = (i64x2, i64x2) -> b64x2 windows_fastcall

block0(v0: i64, v1: i64):
    v2 = load.i64x2 notrap aligned v1
    v3 = load.i64x2 notrap aligned v1+16
    v4 = call_indirect sig0, v0(v2, v3)
    v5 = bint.i64 v4
    store notrap aligned v5, v1
    return
}

I'm going to dig a bit further into this.

@akirilov-arm
Copy link
Contributor

The Bint operation doesn't make sense because Bint can't convert from vector to scalar - the backends are right to complain.

@afonso360
Copy link
Contributor Author

The Bint operation doesn't make sense because Bint can't convert from vector to scalar - the backends are right to complain.

That's true, however this should probably be a verifier error. bint vector to scalar shouldn't be an accepted instruction.

Setting the correct type for bint doesen't fix the issue, since it isn't implemented for aarch64 (for simd values), but we can work around this by raw_bitcast'ing it into the appropriate integer type.

I'll open a PR with this. Thanks for looking into this, I didn't realize it wasn't a backend issue.

@akirilov-arm
Copy link
Contributor

... we can work around this by raw_bitcast'ing it into the appropriate integer type.

No, that's wrong (but it's probably going to work) - I don't think we agreed in #3205 to change the representation of vector booleans. This would be an appropriate work-around for Bmask (but then we have #1429).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior in the current implementation that needs fixing cranelift:area:aarch64 Issues related to AArch64 backend. cranelift Issues related to the Cranelift code generator
Projects
None yet
3 participants