[RegAllocFast] Ensure live-in vregs get reloaded after INLINEASM_BR spills #131350

Merged

Conversation

antoniofrighetto
Contributor

We have already ensured in 9cec2b2 that INLINEASM_BR output operands get spilled onto the stack, both in the fallthrough path and in the indirect targets. Since reloads of live-in values into physical registers are only emitted after all MIR instructions (and operands) have been visited, make sure such loads are placed at the start of the block, but after any prologue instructions or INLINEASM_BR spills; otherwise stale values may be read from the stack.

Fixes: #74483, #110251.
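
Conceptually, the begin-of-block insertion point for these reloads has to skip both prologue instructions and the INLINEASM_BR spill stores. The following is only a minimal sketch of that rule, not the patch itself (the actual change is in RegAllocFastImpl::getMBBBeginInsertionPoint in the diff below); the free function isInlineAsmBrSpill here is a hypothetical stand-in for the member helper this change introduces:

```cpp
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
using namespace llvm;

// Hypothetical stand-in for the helper added by this patch: returns true for
// a store to a spill slot of a block live-in physreg inside an inlineasm_br
// indirect-target block.
bool isInlineAsmBrSpill(const MachineInstr &MI);

// Sketch: where reloads of live-in values should be inserted in a block.
MachineBasicBlock::iterator
findReloadInsertionPoint(MachineBasicBlock &MBB, const TargetInstrInfo &TII) {
  MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end();
  for (; I != E; ++I) {
    // Reloads must stay below the block prologue...
    if (TII.isBasicBlockPrologue(*I))
      continue;
    // ...and below the spills of INLINEASM_BR outputs in an indirect-target
    // block; placing them any earlier would read a stack slot that has not
    // been written yet, i.e. a stale value.
    if (isInlineAsmBrSpill(*I))
      continue;
    break;
  }
  return I;
}
```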

@llvmbot
Member

llvmbot commented Mar 14, 2025

@llvm/pr-subscribers-backend-x86

Author: Antonio Frighetto (antoniofrighetto)

Changes

We have already ensured in 9cec2b2 that INLINEASM_BR output operands get spilled onto the stack, both in the fallthrough path and in the indirect targets. Since reloads of live-in values into physical registers are only emitted after all MIR instructions (and operands) have been visited, make sure such loads are placed at the start of the block, but after any prologue instructions or INLINEASM_BR spills; otherwise stale values may be read from the stack.

Fixes: #74483, #110251.


Full diff: https://github.com/llvm/llvm-project/pull/131350.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/RegAllocFast.cpp (+20-2)
  • (added) llvm/test/CodeGen/X86/regallocfast-callbr-asm-spills-after-reload.mir (+68)
diff --git a/llvm/lib/CodeGen/RegAllocFast.cpp b/llvm/lib/CodeGen/RegAllocFast.cpp
index fb960711d4ae0..7ba362d82f26e 100644
--- a/llvm/lib/CodeGen/RegAllocFast.cpp
+++ b/llvm/lib/CodeGen/RegAllocFast.cpp
@@ -391,6 +391,8 @@ class RegAllocFastImpl {
   bool mayLiveOut(Register VirtReg);
   bool mayLiveIn(Register VirtReg);
 
+  bool isInlineAsmBrSpill(const MachineInstr &MI) const;
+
   void dumpState() const;
 };
 
@@ -491,6 +493,20 @@ static bool dominates(InstrPosIndexes &PosIndexes, const MachineInstr &A,
   return IndexA < IndexB;
 }
 
+bool RegAllocFastImpl::isInlineAsmBrSpill(const MachineInstr &MI) const {
+  int FI;
+  auto *MBB = MI.getParent();
+  if (MBB->isInlineAsmBrIndirectTarget() && TII->isStoreToStackSlot(MI, FI) &&
+      MFI->isSpillSlotObjectIndex(FI)) {
+    for (const auto &Op : MI.operands())
+      if (Op.isReg() && any_of(MBB->liveins(), [&](const auto &RegP) {
+            return Op.getReg() == RegP.PhysReg;
+          }))
+        return true;
+  }
+  return false;
+}
+
 /// Returns false if \p VirtReg is known to not live out of the current block.
 bool RegAllocFastImpl::mayLiveOut(Register VirtReg) {
   if (MayLiveAcrossBlocks.test(VirtReg.virtRegIndex())) {
@@ -648,8 +664,8 @@ MachineBasicBlock::iterator RegAllocFastImpl::getMBBBeginInsertionPoint(
       continue;
     }
 
-    // Most reloads should be inserted after prolog instructions.
-    if (!TII->isBasicBlockPrologue(*I))
+    // Skip prologues and inlineasm_br spills to place reloads afterwards.
+    if (!TII->isBasicBlockPrologue(*I) && !isInlineAsmBrSpill(*I))
       break;
 
     // However if a prolog instruction reads a register that needs to be
@@ -736,6 +752,8 @@ bool RegAllocFastImpl::displacePhysReg(MachineInstr &MI, MCRegister PhysReg) {
       assert(LRI != LiveVirtRegs.end() && "datastructures in sync");
       MachineBasicBlock::iterator ReloadBefore =
           std::next((MachineBasicBlock::iterator)MI.getIterator());
+      while (isInlineAsmBrSpill(*ReloadBefore))
+        ++ReloadBefore;
       reload(ReloadBefore, VirtReg, LRI->PhysReg);
 
       setPhysRegState(LRI->PhysReg, regFree);
diff --git a/llvm/test/CodeGen/X86/regallocfast-callbr-asm-spills-after-reload.mir b/llvm/test/CodeGen/X86/regallocfast-callbr-asm-spills-after-reload.mir
new file mode 100644
index 0000000000000..41aca5524b590
--- /dev/null
+++ b/llvm/test/CodeGen/X86/regallocfast-callbr-asm-spills-after-reload.mir
@@ -0,0 +1,68 @@
+# NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=x86_64-- -run-pass=regallocfast -o - %s | FileCheck %s
+# RUN: llc -mtriple=x86_64-- -passes=regallocfast -o - %s | FileCheck %s
+
+...
+---
+name:            callbr-asm-spills-after-reload
+alignment:       16
+tracksRegLiveness: true
+registers:
+  - { id: 0, class: gr64, preferred-register: '', flags: [  ] }
+  - { id: 1, class: gr32, preferred-register: '', flags: [  ] }
+  - { id: 2, class: gr64, preferred-register: '', flags: [  ] }
+liveins:
+  - { reg: '$rdi', virtual-reg: '%2' }
+frameInfo:
+  isFrameAddressTaken: false
+  stackSize:       0
+  offsetAdjustment: 0
+  maxAlignment:    8
+  adjustsStack:    false
+  hasCalls:        false
+fixedStack:      []
+stack:
+  - { id: 0, type: default, offset: 0, size: 8, alignment: 8,
+      stack-id: default, callee-saved-register: '', callee-saved-restored: true }
+body:             |
+  bb.0.entry:
+    successors: %bb.1(0x40000000), %bb.3(0x40000000)
+    liveins: $rdi
+
+    %2:gr64 = COPY $rdi
+    %3:gr64 = COPY killed %2
+    MOV64mr %stack.0, 1, $noreg, 0, $noreg, %3 :: (store (s64))
+    %0:gr64 = MOV64rm %stack.0, 1, $noreg, 0, $noreg :: (dereferenceable load (s64))
+    %6:gr32 = MOV32rm %0, 1, $noreg, 0, $noreg :: (load (s32))
+    %5:gr32_norex2 = COPY %6
+    %4:gr32_norex2 = COPY %5
+    INLINEASM_BR &"  subl $$11, $0;  cmpl $$11, $0; je ${2:l};", 0 /* attdialect */, 2686986 /* regdef:GR32_NOREX2 */, def %4, 2147483657 /* reguse tiedto:$0 */, %4(tied-def 3), 13 /* imm */, %bb.3, 12 /* clobber */, implicit-def early-clobber $df, 12 /* clobber */, implicit-def early-clobber $fpsw, 12 /* clobber */, implicit-def early-clobber $eflags
+    %1:gr32 = COPY %4
+    JMP_1 %bb.1
+
+  bb.1:
+    successors: %bb.2(0x80000000)
+
+    ; CHECK:      $rax = MOV64rm %stack.3, 1, $noreg, 0, $noreg :: (load (s64) from %stack.3)
+    ; CHECK-NEXT: $ecx = MOV32rm %stack.1, 1, $noreg, 0, $noreg :: (load (s32) from %stack.1)
+    ; CHECK-NEXT: MOV32mr renamable $rax, 1, $noreg, 0, $noreg, renamable $ecx :: (store (s32))
+    MOV32mr %0, 1, $noreg, 0, $noreg, %1 :: (store (s32))
+
+  bb.2:
+    RET64
+
+  bb.3 (machine-block-address-taken, inlineasm-br-indirect-target):
+    successors: %bb.2(0x80000000)
+
+    ; FIXME: This is a miscompilation, as, despite spilling the value modified by the inlineasm_br,
+    ; the reload emitted still reads from an uninitialized stack slot.
+    ; CHECK:      MOV32mr %stack.2, 1, $noreg, 0, $noreg, $eax :: (store (s32) into %stack.2)
+    ; CHECK-NEXT: $ecx = MOV32rm %stack.2, 1, $noreg, 0, $noreg :: (load (s32) from %stack.2)
+    ; CHECK-NEXT: $rax = MOV64rm %stack.3, 1, $noreg, 0, $noreg :: (load (s64) from %stack.3)
+    ; CHECK-NEXT: MOV32mr renamable $rax, 1, $noreg, 0, $noreg, killed renamable $ecx :: (store (s32))
+    ; CHECK-NEXT: JMP_1 %bb.2
+    %7:gr32 = COPY %4
+    MOV32mr %0, 1, $noreg, 0, $noreg, %7 :: (store (s32))
+    JMP_1 %bb.2
+
+...

      MFI->isSpillSlotObjectIndex(FI)) {
    for (const auto &Op : MI.operands())
      if (Op.isReg() && any_of(MBB->liveins(), [&](const auto &RegP) {
            return Op.getReg() == RegP.PhysReg;
Collaborator

Can't we use MBB->isLiveIn(Op.getReg()) here instead of any_of?
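
For reference, a rough sketch of what the suggested simplification could look like, keeping the same logic as the hunk quoted above but replacing the any_of over MBB->liveins() with MachineBasicBlock::isLiveIn (whether the final revision landed in exactly this form is not shown in this thread):

```cpp
bool RegAllocFastImpl::isInlineAsmBrSpill(const MachineInstr &MI) const {
  int FI;
  auto *MBB = MI.getParent();
  if (MBB->isInlineAsmBrIndirectTarget() && TII->isStoreToStackSlot(MI, FI) &&
      MFI->isSpillSlotObjectIndex(FI))
    for (const auto &Op : MI.operands())
      // MachineBasicBlock::isLiveIn already walks the live-in list, so the
      // explicit any_of over MBB->liveins() is unnecessary.
      if (Op.isReg() && MBB->isLiveIn(Op.getReg()))
        return true;
  return false;
}
```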

@@ -391,6 +391,8 @@ class RegAllocFastImpl {
  bool mayLiveOut(Register VirtReg);
  bool mayLiveIn(Register VirtReg);

  bool isInlineAsmBrSpill(const MachineInstr &MI) const;
Collaborator

Two things:

  1. It needs a comment, because at first I wondered "what is an inlineasm branch spill?"
  2. The name is a misnomer because, unless I'm mistaken, the instruction may not be spilling a value coming from an inlineasm branch. I would go with mayBeSpillFromInlineAsmBr.

Contributor Author

I had been pondering a few names, and eventually shortened it too much. Updated, thanks!

[RegAllocFast] Ensure live-in vregs get reloaded after INLINEASM_BR spills

We have already ensured in 9cec2b2
that `INLINEASM_BR` output operands get spilled onto the stack, both
in the fallthrough path and in the indirect targets. Since reloads of
live-in values into physical registers contextually happen after all
MIR instructions (and ops) have been visited, make sure such loads are
placed at the start of the block, but after prologues or `INLINEASM_BR`
spills, as otherwise this may cause stale values to be read from the
stack.

Fixes: llvm#74483, llvm#110251.
@antoniofrighetto force-pushed the feature/loads-spills-inlineasmbr branch from dba3d6e to ade2276 on March 24, 2025 at 08:20
@antoniofrighetto merged commit ade2276 into llvm:main on Mar 24, 2025
6 of 10 checks passed
@nickdesaulniers
Member

BIG BIG thanks for fixing this! Sorry I wasn't around for reviews...life has been hectic for me lately with the job change (and kid)!

@antoniofrighetto
Contributor Author

> BIG BIG thanks for fixing this! Sorry I wasn't around for reviews...life has been hectic for me lately with the job change (and kid)!

Thanks, and glad to hear that! :)
