AArch64 with global isel miscompile #78477
Comments
@llvm/issue-subscribers-backend-aarch64 Author: Tanmay Tirpankar (tanmaytirpankar)
The following LLVM IR:
```
define i32 @f(ptr %0) {
  %2 = load <2 x i32>, ptr %0, align 8
  store <4 x i32> zeroinitializer, ptr %0, align 16
  %3 = extractelement <2 x i32> %2, i64 0
  ret i32 %3
}
```
with `SDAG` lowers to
```
../../llvm-clone/llvm/build-release/bin/llc -march=aarch64 foo.ll -o foo1.ll
f:
ldr d0, [x0]
mov x8, x0
str xzr, [x0, #8]
str xzr, [x8]
fmov w0, s0
ret
```
but with `-global-isel` it lowers to
```
../../llvm-clone/llvm/build-release/bin/llc -march=aarch64 -global-isel foo.ll -o foo1.ll
f:
mov x8, x0
str xzr, [x0]
and x0, xzr, #0xffffffff
str xzr, [x8, #8]
ret
```
The `SDAG` version loads the value at the address in `x0` into `d0` before zeroing the memory, then moves the low 32 bits into `w0`. The `-global-isel` version never loads the original value: it zeroes the memory and sets `x0` to zero (`and x0, xzr, #0xffffffff`), so it always returns 0. The final value of `w0` therefore differs between the two versions, and so does the return value.
cc @regehr
I tried
|
I tried
|
Looks like the (G_EXTRACT_VECTOR_ELT (G_LOAD)) -> G_LOAD combine inserts the scalar load at the extract element and not where the G_LOAD was, so the load is no longer before the store.
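To make the ordering problem concrete, here is a rough C model of the two behaviours. It is only a sketch, not the actual combine or generated code: the function names `f_load_then_store` and `f_store_then_load` are made up for illustration, and `memcpy`/`memset` stand in for the vector load and the 16-byte zero store.

```c
#include <stdint.h>
#include <string.h>

/* Models the original IR: load <2 x i32>, then store 16 zero bytes,
   then extract element 0 of the value loaded earlier. */
int32_t f_load_then_store(void *p) {
    int32_t v[2];
    memcpy(v, p, sizeof v);   /* load happens before the store */
    memset(p, 0, 16);         /* store <4 x i32> zeroinitializer */
    return v[0];              /* extractelement ..., i64 0 */
}

/* Models the effect described above: the narrowed scalar load is
   placed at the extract, i.e. after the store (hypothetical model). */
int32_t f_store_then_load(void *p) {
    int32_t v;
    memset(p, 0, 16);         /* store now runs first */
    memcpy(&v, p, sizeof v);  /* scalar load reads the zeroed memory */
    return v;
}
```

With a buffer whose first element is 42, the first function returns 42 and the second returns 0, which matches the difference in `w0` between the two `llc` outputs.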
The |
G_EXTRACT_VECTOR_ELT into G_LOAD. Fixes llvm#78477