Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RemoveDIs] Fix spliceDebugInfo splice-to-end edge case #105671

Merged
merged 2 commits into from
Aug 29, 2024

Conversation

OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Aug 22, 2024

Fix #105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.


Note, this is based on #105670 - please ignore the first commit in this PR (I'll rebase once that one goes in). (rebased)

@OCHyams OCHyams requested review from jmorse and SLTozer August 22, 2024 14:31
@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-ir

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

Fix #105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.


Note, this is based on #105670 - please ignore the first commit in this PR (I'll rebase once that one goes in).


Full diff: https://github.com/llvm/llvm-project/pull/105671.diff

2 Files Affected:

  • (modified) llvm/lib/IR/BasicBlock.cpp (+20-16)
  • (modified) llvm/unittests/IR/BasicBlockDbgInfoTest.cpp (+56-2)
diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index cf05b11c53963c..39cefa5280c62f 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -961,9 +961,13 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
   // Detach the marker at Dest -- this lets us move the "====" DbgRecords
   // around.
   DbgMarker *DestMarker = nullptr;
-  if (Dest != end()) {
-    if ((DestMarker = getMarker(Dest)))
+  if ((DestMarker = getMarker(Dest))) {
+    if (Dest == end()) {
+      assert(DestMarker == getTrailingDbgRecords());
+      deleteTrailingDbgRecords();
+    } else {
       DestMarker->removeFromParent();
+    }
   }
 
   // If we're moving the tail range of DbgRecords (":::"), absorb them into the
@@ -971,8 +975,16 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
   if (ReadFromTail && Src->getMarker(Last)) {
     DbgMarker *FromLast = Src->getMarker(Last);
     if (LastIsEnd) {
-      Dest->adoptDbgRecords(Src, Last, true);
-      // adoptDbgRecords will release any trailers.
+      if (Dest == end()) {
+        // Abosrb the trailing markers from Src.
+        assert(FromLast == Src->getTrailingDbgRecords());
+        createMarker(Dest)->absorbDebugValues(*FromLast, true);
+        FromLast->eraseFromParent();
+        Src->deleteTrailingDbgRecords();
+      } else {
+        // adoptDbgRecords will release any trailers.
+        Dest->adoptDbgRecords(Src, Last, true);
+      }
       assert(!Src->getTrailingDbgRecords());
     } else {
       // FIXME: can we use adoptDbgRecords here to reduce allocations?
@@ -1005,22 +1017,14 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
     } else {
       // Insert them right at the start of the range we moved, ahead of First
       // and the "++++" DbgRecords.
+      // This also covers the rare circumstance where we insert at end(), and we
+      // did not generate the iterator with begin() / getFirstInsertionPt(),
+      // meaning any trailing debug-info at the end of the block would
+      // "normally" have been pushed in front of "First". We move it there now.
       DbgMarker *FirstMarker = createMarker(First);
       FirstMarker->absorbDebugValues(*DestMarker, true);
     }
     DestMarker->eraseFromParent();
-  } else if (Dest == end() && !InsertAtHead) {
-    // In the rare circumstance where we insert at end(), and we did not
-    // generate the iterator with begin() / getFirstInsertionPt(), it means
-    // any trailing debug-info at the end of the block would "normally" have
-    // been pushed in front of "First". Move it there now.
-    DbgMarker *TrailingDbgRecords = getTrailingDbgRecords();
-    if (TrailingDbgRecords) {
-      DbgMarker *FirstMarker = createMarker(First);
-      FirstMarker->absorbDebugValues(*TrailingDbgRecords, true);
-      TrailingDbgRecords->eraseFromParent();
-      deleteTrailingDbgRecords();
-    }
   }
 }
 
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
index 91a0745a0cc76e..5d62ce6f2c875d 100644
--- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
+++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
@@ -141,11 +141,11 @@ TEST(BasicBlockDbgInfoTest, SplitBasicBlockBefore) {
   Function *F = M->getFunction("func");
 
   BasicBlock &BB = F->getEntryBlock();
-  auto I = std::prev(BB.end(), 2);
+  auto I = std::prev(BB.end(), 2); // store i32 2, ptr %1.
   BB.splitBasicBlockBefore(I, "before");
 
   BasicBlock &BBBefore = F->getEntryBlock();
-  auto I2 = std::prev(BBBefore.end(), 2);
+  auto I2 = std::prev(BBBefore.end()); // br label %1 (new).
   ASSERT_TRUE(I2->hasDbgRecords());
 }
 
@@ -1525,4 +1525,58 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) {
   EXPECT_FALSE(Ret->hasDbgRecords());
 }
 
+TEST(BasicBlockDbgInfoTest, DbgKnownSentinelCrash) {
+  return;
+  LLVMContext C;
+  std::unique_ptr<Module> M = parseIR(C, R"(
+    define i16 @foo(i16 %a) !dbg !6 {
+    entry:
+      %b = add i16 %a, 0
+        #dbg_value(i16 %b, !9, !DIExpression(), !11)
+      ret i16 0, !dbg !11
+    }
+
+    !llvm.dbg.cu = !{!0}
+    !llvm.module.flags = !{!5}
+
+    !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
+    !1 = !DIFile(filename: "t.ll", directory: "/")
+    !2 = !{}
+    !5 = !{i32 2, !"Debug Info Version", i32 3}
+    !6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !8)
+    !7 = !DISubroutineType(types: !2)
+    !8 = !{!9}
+    !9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
+    !10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned)
+    !11 = !DILocation(line: 1, column: 1, scope: !6)
+)");
+  ASSERT_TRUE(M);
+
+  Function *F = M->getFunction("foo");
+  BasicBlock &BB = F->getEntryBlock();
+  // Start with no trailing records.
+  ASSERT_FALSE(BB.getTrailingDbgRecords());
+
+  BasicBlock::iterator Ret = std::prev(BB.end());
+  BasicBlock::iterator B = std::prev(Ret);
+
+  // Delete terminator which has debug records: we now get trailing records.
+  Ret->eraseFromParent();
+  EXPECT_TRUE(BB.getTrailingDbgRecords());
+
+  BasicBlock *NewBB = BasicBlock::Create(C, "NewBB", F);
+  NewBB->splice(NewBB->end(), &BB, B, BB.end());
+
+  // The trailing records should've been absorbed into NewBB.
+  EXPECT_FALSE(BB.getTrailingDbgRecords());
+  EXPECT_TRUE(NewBB->getTrailingDbgRecords());
+  if (NewBB->getTrailingDbgRecords())
+    EXPECT_EQ(
+        llvm::range_size(NewBB->getTrailingDbgRecords()->getDbgRecordRange()),
+        1);
+
+  // Drop the trailing records now, to prevent a cleanup assertion.
+  NewBB->deleteTrailingDbgRecords();
+}
+
 } // End anonymous namespace.

Copy link
Contributor

@kartcq kartcq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this @OCHyams

Could you please clarify couple of things here.

  1. In the splice function of test
    NewBB->splice(NewBB->end(), &BB, B, BB.end());
    The B here refers to first instruction
    %b = add i16 %a, 0
    Is that what we are testing? Shouldn't we split at the return instruction ?

  2. Consider a scenario in a pass we want to split from return instruction. And the debug record just above belongs to that return instruction. When we try to splice here, can we expect the DebugInfo to tag along with the return instruction
    (or)
    Is it responsibility of pass owners using splice to set HeadBit of the instruction(in case if it has DebugRecords). So that the debug record will be tagged along with return instruction.

@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 22, 2024

Thanks for looking into this @OCHyams

Could you please clarify couple of things here.

  1. In the splice function of test
    NewBB->splice(NewBB->end(), &BB, B, BB.end());
    The B here refers to first instruction
    %b = add i16 %a, 0
    Is that what we are testing? Shouldn't we split at the return instruction ?

A few lines before the splice we delete the return instruction. That forces the debug records attached to it (which come "before" the instruction) to become "trailing" - meaning they come before end() (we can't attached the records to end() though, so they're associated with the basic block instead). The splice from B to end() includes exactly B and the one trailing debug record.

  1. Consider a scenario in a pass we want to split from return instruction. And the debug record just above belongs to that return instruction. When we try to splice here, can we expect the DebugInfo to tag along with the return instruction
    (or)
    Is it responsibility of pass owners using splice to set HeadBit of the instruction(in case if it has DebugRecords). So that the debug record will be tagged along with return instruction.

Hmm good question. If you're splitting before the ret then I believe the debug records attached to (positioned before) the ret will be separated from it into the new split block, unless HeadBit is set on ret's iterator before the split (as that would then indicate the split point comes "before" the debug intrinsics). HeadBit will be set automatically you've obtained the iterator to ret by calling BasicBlock::begin() or BasicBlock::getFirstNonPHIIt() (e.g., you call begin and the ret is the only instruction), otherwise I believe you're right that you'd need to set it yourself if you're sure you want those records to move.

In most cases that heuristic (set HeadBit from those functions) has worked out, but there are a couple of places in LLVM that explicitly set HeadBit - I can see one such example is actually a block split in CodeGenPrepare.

Copy link
Contributor

@jonathonpenix jonathonpenix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @OCHyams for fixing this!

@@ -1525,4 +1525,58 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) {
EXPECT_FALSE(Ret->hasDbgRecords());
}

TEST(BasicBlockDbgInfoTest, DbgKnownSentinelCrash) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the early return here be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yep, will do

Fix llvm#105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.
@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 29, 2024

rebased / ping

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for fixing this!

@OCHyams OCHyams merged commit 43661a1 into llvm:main Aug 29, 2024
5 of 8 checks passed
@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 30, 2024

/cherry-pick 43661a1

@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

/cherry-pick 43661a1

Error: Command failed due to missing milestone.

@OCHyams OCHyams added this to the LLVM 19.X Release milestone Aug 30, 2024
@OCHyams
Copy link
Contributor Author

OCHyams commented Aug 30, 2024

/cherry-pick 43661a1

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Aug 30, 2024
Fix llvm#105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.

(cherry picked from commit 43661a1)
@llvmbot
Copy link
Member

llvmbot commented Aug 30, 2024

/pull-request #106691

@vitalybuka
Copy link
Collaborator

There is a memory leak,
not sure if in test on patched code https://lab.llvm.org/buildbot/#/builders/169/builds/2622/steps/9/logs/stdio

@vitalybuka
Copy link
Collaborator

Assuming that the crash less desirable than a leak I'll suppress a leak in the test and reopen the issue to avoid revert.

OCHyams added a commit to OCHyams/llvm-project that referenced this pull request Sep 2, 2024
Fix llvm#105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.

(cherry picked from commit 43661a1)
tru pushed a commit to OCHyams/llvm-project that referenced this pull request Sep 10, 2024
Fix llvm#105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.

(cherry picked from commit 43661a1)
@OCHyams OCHyams deleted the fix-105571-2 branch October 2, 2024 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

[DebugInfo] Assertion failure `!NodePtr->isKnownSentinel()' when fixing up debug info
7 participants