-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RISCV64 miscompile at -O2/-O1 #80052
Comments
Found using 3a92b20. I just kicked off a fresh build of LLVM to confirm it's still on trunk. |
@llvm/issue-subscribers-backend-risc-v Author: Patrick O'Neill (patrick-rivos)
Testcase:
```c
int printf(const char *, ...);
int a[10000];
int b, c, d, e, f;
char g, h;
int j[100][100];
int m(long n) {
b = ((b >> 4) ^ (b ^ n));
b = ((b >> 4)) ^ a[(b ^ 0x0f) & 0x0F];
b = ((b >> 4)) ^ a[(b ^ 0x0f) & 0x0F];
b = ((b >> 4)) ^ a[(b ^ 0x0f) & 0x0F];
printf("%s %lX\n", &g, b ^ 0xFFFFFFFFUL);
return 0;
}
int main() {
int i = 0;
for (; c < 256; c++)
a[c] = c;
volatile int k = 3;
h = 0;
for (; (h <= 4); h += 1) {
int *l = &d;
*l = k;
k = 0;
}
for (; i < 6; i++)
for (; e < 1; e++) {
m(j[i][e]);
if (f)
printf("%d%d\n", i, e);
}
return m(d);
}
```
Commands:
```bash
> /scratch/tc-testing/llvm-jan-24/build/bin/clang -O2 red.c -o user-config-O2.out
> QEMU_CPU=rv64 /scratch/tc-testing/llvm-jan-24/build/bin/qemu-riscv64 user-config-O2.out
FFFFFFF0
FFFFFFFC
> /scratch/tc-testing/llvm-jan-24/build/bin/clang -O0 red.c -o user-config-O0.out
> QEMU_CPU=rv64 /scratch/tc-testing/llvm-jan-24/build/bin/qemu-riscv64 user-config-O0.out
FFFFFFF0
FFFFFFFF
```
This raw testcase differs at -O2, but after feeding it through bugpoint, I got a .bc file that differs at -01.
```bash
> /scratch/tc-testing/llvm-jan-24/build/bin/clang red.c -O2 -o user-config.bc -emit-llvm -c
> /scratch/tc-testing/llvm-jan-24/build/build-llvm-linux/bin/bugpoint user-config.bc --run-llc
> /scratch/tc-testing/llvm-jan-24/build/build-llvm-linux/bin/clang bugpoint-reduced-named-md.bc -O0 -o bugpoint-O0.out
> QEMU_CPU=rv64 /scratch/tc-testing/llvm-jan-24/build/bin/qemu-riscv64 bugpoint-O0.out
FFFFFFF0
FFFFFFFF
> /scratch/tc-testing/llvm-jan-24/build/build-llvm-linux/bin/clang bugpoint-reduced-named-md.bc -O1 -o bugpoint-O1.out
> QEMU_CPU=rv64 /scratch/tc-testing/llvm-jan-24/build/bin/qemu-riscv64 bugpoint-O1.out
FFFFFFF0
FFFFFFFC
```
[bugpoint.zip](https://github.com/llvm/llvm-project/files/14103541/bugpoint.zip)
|
opt-bisect-limit points at |
I think this is related to using X0 as a sink for a volatile load that we don't need the result from. We have |
AArch64 won't replace a destination with the zero register if the instruction uses a FrameIndex. I think that was to fix a different issue, but maybe it hides this issue for them. I commented out that code for AArch64, but still couldn't hit an issue because this same test case doesn't generate any spills for AArch64 so the stack coloring pass ended early without trying to remove dead stores. |
Here's the preprocessed unreduced testcase. It has a lot more going on so it might generate a spill for this case on AArch64. |
The two best fixes seem to be not deleting volatile stores or teaching the pass about constant physical registers. |
The stack slot coloring pass is concerned with optimizing spill slots. If any change is a pass is made over the function to remove stack stores that use the same register and stack slot as an immediately preceding load. The register check is too simple for constant registers like AArch64 and RISC-V's zero register. This register can be used as the result of a load if we want to discard the result, but still have the memory access performed. Like for a volatile or atomic load. If the code sees a load from the zero register followed by a store of the zero register at the same stack slot, the pass mistakenly believes the store isn't needed. Since the main stack coloring optimization is only concerned with spill slots, it seems reasonable that RemoveDeadStores should only be concerned with spills. Since we never generate a reload of x0, this avoids the issue seen by RISC-V. Test case concept is adapted from pr30821.mir from X86. That test had to be updated to mark the stack slot as a spill slot. Fixes llvm#80052.
…0242) The stack slot coloring pass is concerned with optimizing spill slots. If any change is a pass is made over the function to remove stack stores that use the same register and stack slot as an immediately preceding load. The register check is too simple for constant registers like AArch64 and RISC-V's zero register. This register can be used as the result of a load if we want to discard the result, but still have the memory access performed. Like for a volatile or atomic load. If the code sees a load from the zero register followed by a store of the zero register at the same stack slot, the pass mistakenly believes the store isn't needed. Since the main stack coloring optimization is only concerned with spill slots, it seems reasonable that RemoveDeadStores should only be concerned with spills. Since we never generate a reload of x0, this avoids the issue seen by RISC-V. Test case concept is adapted from pr30821.mir from X86. That test had to be updated to mark the stack slot as a spill slot. Fixes #80052.
…vm#80242) The stack slot coloring pass is concerned with optimizing spill slots. If any change is a pass is made over the function to remove stack stores that use the same register and stack slot as an immediately preceding load. The register check is too simple for constant registers like AArch64 and RISC-V's zero register. This register can be used as the result of a load if we want to discard the result, but still have the memory access performed. Like for a volatile or atomic load. If the code sees a load from the zero register followed by a store of the zero register at the same stack slot, the pass mistakenly believes the store isn't needed. Since the main stack coloring optimization is only concerned with spill slots, it seems reasonable that RemoveDeadStores should only be concerned with spills. Since we never generate a reload of x0, this avoids the issue seen by RISC-V. Test case concept is adapted from pr30821.mir from X86. That test had to be updated to mark the stack slot as a spill slot. Fixes llvm#80052.
Testcase:
Commands:
This raw testcase differs at -O2, but after feeding it through bugpoint, I got a .bc file that differs at -01.
bugpoint.zip
The text was updated successfully, but these errors were encountered: