[LoongArch] Miscompilation #116008

Closed
dtcxzyw opened this issue Nov 13, 2024 · 0 comments · Fixed by #116075

dtcxzyw commented Nov 13, 2024

Reproducer: https://godbolt.org/z/G4c4Yvcrx

target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "loongarch64-unknown-linux-gnu"

@f = dso_local local_unnamed_addr global i32 5, align 4

; Function Attrs: nofree nounwind
define dso_local noundef signext i32 @main() {
entry:
  %1 = load i32, ptr @f, align 4
  %.fr = freeze i32 %1
  %10 = insertelement <4 x i32> poison, i32 %.fr, i64 0
  %11 = shufflevector <4 x i32> %10, <4 x i32> poison, <4 x i32> zeroinitializer
  %12 = icmp ugt <4 x i32> zeroinitializer, %11
  %13 = select <4 x i1> %12, <4 x i32> splat (i32 1), <4 x i32> %11
  %14 = icmp samesign ugt <4 x i32> %13, splat (i32 31)
  %15 = lshr <4 x i32> splat (i32 2147483647), %13
  %16 = icmp samesign ugt <4 x i32> %11, %15
  %17 = select <4 x i1> %14, <4 x i1> splat (i1 true), <4 x i1> %16
  %18 = select <4 x i1> %17, <4 x i32> zeroinitializer, <4 x i32> %13
  %19 = shl <4 x i32> %11, %18
  %20 = xor <4 x i32> %19, splat (i32 1)
  %21 = tail call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %20)
  ret i32 %21
}

> bin/lli test.ll
> echo $?
33

But the correct exit code is 161, as confirmed by running lli on x86 and by llubi (https://github.com/dtcxzyw/llvm-ub-aware-interpreter):

; llubi test.ll --verbose
Entering function main
  %0 = load i32, ptr @f, align 4 -> i32 5
  %.fr = freeze i32 %0 -> i32 5
  %1 = insertelement <4 x i32> poison, i32 %.fr, i64 0 -> { i32 5, poison, poison, poison }
  %2 = shufflevector <4 x i32> %1, <4 x i32> poison, <4 x i32> zeroinitializer -> { i32 5, i32 5, i32 5, i32 5 }
  %3 = icmp ugt <4 x i32> zeroinitializer, %2 -> { F, F, F, F }
  %4 = select <4 x i1> %3, <4 x i32> splat (i32 1), <4 x i32> %2 -> { i32 5, i32 5, i32 5, i32 5 }
  %5 = icmp samesign ugt <4 x i32> %4, splat (i32 31) -> { F, F, F, F }
  %6 = lshr <4 x i32> splat (i32 2147483647), %4 -> { i32 67108863, i32 67108863, i32 67108863, i32 67108863 }
  %7 = icmp samesign ugt <4 x i32> %2, %6 -> { F, F, F, F }
  %8 = select <4 x i1> %5, <4 x i1> splat (i1 true), <4 x i1> %7 -> { F, F, F, F }
  %9 = select <4 x i1> %8, <4 x i32> zeroinitializer, <4 x i32> %4 -> { i32 5, i32 5, i32 5, i32 5 }
  %10 = shl <4 x i32> %2, %9 -> { i32 160, i32 160, i32 160, i32 160 }
  %11 = xor <4 x i32> %10, splat (i32 1) -> { i32 161, i32 161, i32 161, i32 161 }
  %12 = tail call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %11) -> i32 161
  ret i32 %12
Exiting function main
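
As a cross-check of the trace above, the per-lane arithmetic can be modelled in a few lines of Python. This is only a sketch: the helper name `lane` is hypothetical, poison and `samesign` are ignored, and it relies on all four lanes holding the same value, as they do in this reproducer.

```python
MASK = 0xFFFFFFFF  # i32 lanes wrap at 32 bits


def lane(f: int) -> int:
    """Evaluate one lane of the reproducer for the value loaded from @f."""
    x = f & MASK
    # icmp ugt 0, x is never true for unsigned values, so the select keeps x.
    amt = 1 if 0 > x else x
    too_big = amt > 31                       # icmp ugt amt, 31
    # lshr by 32 or more would be poison in IR; guard it in this model.
    limit = 0 if too_big else 0x7FFFFFFF >> amt
    overflow = too_big or x > limit          # select too_big, true, (x ugt limit)
    shift = 0 if overflow else amt           # select overflow, 0, amt
    return ((x << shift) & MASK) ^ 1         # shl, then xor with splat(1)


lanes = [lane(5)] * 4                        # the shufflevector splats lane 0
reduced = 0
for value in lanes:                          # llvm.vector.reduce.or
    reduced |= value
print(reduced)
```

With `@f` set to 5, the shift amount is 5 and the intermediate bound is 67108863, matching the llubi trace.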

llvm version: edfa75d
CPU: Loongson-3A5000

I will post a fix later.

dtcxzyw self-assigned this Nov 13, 2024
swatheesh-mcw pushed a commit to swatheesh-mcw/llvm-project that referenced this issue Nov 20, 2024
…hen there are predicate calls (llvm#116075)

On loongarch64 with the LSX extension, we select `VBITREV_W` for
`v4i32 (xor X, (shl splat(1), Y))`:

https://github.com/llvm/llvm-project/blob/8e6630391699116641cf390a10476295b7d4b95c/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td#L1583-L1584

And `vsplat_imm_eq_1` is defined as:

https://github.com/llvm/llvm-project/blob/8e6630391699116641cf390a10476295b7d4b95c/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td#L77-L87

For the `(bitconvert (v4i32 (build_vector)))` case, the pattern is
expected to be:
```
PATTERN: (xor:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, (shl:{ *:[v4i32] } (bitconvert:{ *:[v4i32] } (build_vector:{ *:[v4i32] }))<<P:Predicate_vsplat_imm_eq_1>>, v4i32:{ *:[v4i32] }:$vk))
RESULT:  (VBITREV_W:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, v4i32:{ *:[v4i32] }:$vk)
```

However, `simplifyTree` drops the `bitconvert` node and its predicates:

https://github.com/llvm/llvm-project/blob/8e6630391699116641cf390a10476295b7d4b95c/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp#L3036-L3062

LLVM then treats any v4i32 `build_vector` in that position as if it
satisfied `vsplat_imm_eq_1`, which causes a miscompilation:
```
PATTERN: (xor:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, (shl:{ *:[v4i32] } (build_vector:{ *:[v4i32] }), v4i32:{ *:[v4i32] }:$vk))
RESULT:  (VBITREV_W:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, v4i32:{ *:[v4i32] }:$vk)
```
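
The observed exit code of 33 is consistent with this mis-selection. The sketch below assumes that `VBITREV_W` flips bit `vk mod 32` in each lane of `vj`, which is what the `(xor X, (shl splat(1), Y))` pattern implies. The function names are hypothetical and only model the arithmetic of the reproducer from llvm#116008, where the shifted value is a splat of 5 rather than a splat of 1.

```python
MASK = 0xFFFFFFFF  # i32 lanes wrap at 32 bits


def vbitrev_w(vj, vk):
    """Assumed per-lane semantics: flip bit (vk mod 32) of vj."""
    return [(a ^ (1 << (b % 32))) & MASK for a, b in zip(vj, vk)]


def xor_shl(x, base, y):
    """What the IR actually asks for: x ^ (base << y), per lane."""
    return [(a ^ ((c << b) & MASK)) & MASK for a, c, b in zip(x, base, y)]


x = [1] * 4      # the xor operand, splat (i32 1)
base = [5] * 4   # the value being shifted, a splat of 5 (not of 1)
y = [5] * 4      # the shift amount

expected = xor_shl(x, base, y)   # semantics of the original IR
selected = vbitrev_w(x, y)       # what the unpredicated pattern computes
print(expected[0], selected[0])
```

The two results agree only when `base` is a splat of 1, which is exactly the condition the dropped predicate was supposed to enforce.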

This patch adds additional checks for predicates associated with the
trivial bitconvert node. Unused patterns in the LoongArch target are
also removed.
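
To illustrate the failure mode, here is a toy Python model of folding a trivial `bitconvert`. It is not the TableGen implementation: the `Node` class and both functions are hypothetical, and the checked variant assumes the added check simply refuses to fold a `bitconvert` that carries predicates.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str
    ty: str
    children: list = field(default_factory=list)
    predicates: list = field(default_factory=list)


def is_trivial_bitconvert(n):
    # A bitconvert whose operand already has the result type is a no-op.
    return n.op == "bitconvert" and len(n.children) == 1 and n.children[0].ty == n.ty


def simplify_buggy(n):
    n.children = [simplify_buggy(c) for c in n.children]
    if is_trivial_bitconvert(n):
        return n.children[0]  # predicates attached to n are silently lost
    return n


def simplify_checked(n):
    n.children = [simplify_checked(c) for c in n.children]
    if is_trivial_bitconvert(n) and not n.predicates:
        return n.children[0]  # fold only when no predicate would be dropped
    return n


def make_pattern():
    splat = Node("bitconvert", "v4i32", [Node("build_vector", "v4i32")],
                 ["vsplat_imm_eq_1"])
    shl = Node("shl", "v4i32", [splat, Node("$vk", "v4i32")])
    return Node("xor", "v4i32", [Node("$vj", "v4i32"), shl])


buggy_operand = simplify_buggy(make_pattern()).children[1].children[0]
checked_operand = simplify_checked(make_pattern()).children[1].children[0]
print(buggy_operand.op, buggy_operand.predicates)
print(checked_operand.op, checked_operand.predicates)
```

In the unchecked variant the `shl` operand becomes a bare `build_vector` with no predicate, mirroring the second PATTERN dump above; in the checked variant the predicated `bitconvert` survives.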

Fixes llvm#116008.
tru pushed a commit to llvmbot/llvm-project that referenced this issue Nov 25, 2024
…hen there are predicate calls (llvm#116075)

(cherry picked from commit c727b48)