Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo][RemoveDIs] Find types hidden in DbgRecords #106547

Merged
merged 10 commits into from
Sep 2, 2024

Conversation

jmorse
Copy link
Member

@jmorse jmorse commented Aug 29, 2024

When serialising to textual IR, there can be constant Values referred to by DbgRecords that don't appear anywhere else, and have types hidden even deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.

When serialising to textual IR, there can be constant Values referred to by
DbgRecords that don't appear anywhere else, and have types hidden even
deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.
@llvmbot
Copy link
Member

llvmbot commented Aug 29, 2024

@llvm/pr-subscribers-llvm-ir

Author: Jeremy Morse (jmorse)

Changes

When serialising to textual IR, there can be constant Values referred to by DbgRecords that don't appear anywhere else, and have types hidden even deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.


Full diff: https://github.com/llvm/llvm-project/pull/106547.diff

2 Files Affected:

  • (modified) llvm/lib/IR/TypeFinder.cpp (+11)
  • (added) llvm/test/DebugInfo/type-finder-w-dbg-records.ll (+43)
diff --git a/llvm/lib/IR/TypeFinder.cpp b/llvm/lib/IR/TypeFinder.cpp
index 003155a4af4877..d2ffe1ba77b4e3 100644
--- a/llvm/lib/IR/TypeFinder.cpp
+++ b/llvm/lib/IR/TypeFinder.cpp
@@ -88,6 +88,17 @@ void TypeFinder::run(const Module &M, bool onlyNamed) {
         for (const auto &MD : MDForInst)
           incorporateMDNode(MD.second);
         MDForInst.clear();
+
+        // Incorporate types hiding in variable-location information.
+        for (const auto &Dbg : I.getDbgRecordRange()) {
+          // Pick out records that have Values.
+          if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind)
+            continue;
+          const DbgVariableRecord &DVI = static_cast<const DbgVariableRecord &>(Dbg);
+          for (Value *V : DVI.location_ops()) {
+            incorporateValue(V);
+          }
+        }
       }
   }
 
diff --git a/llvm/test/DebugInfo/type-finder-w-dbg-records.ll b/llvm/test/DebugInfo/type-finder-w-dbg-records.ll
new file mode 100644
index 00000000000000..d2daabc40a3ac3
--- /dev/null
+++ b/llvm/test/DebugInfo/type-finder-w-dbg-records.ll
@@ -0,0 +1,43 @@
+; RUN: opt --passes=verify %s -o - -S | FileCheck %s
+
+;; Test that the type definitions are discovered when serialising to LLVM-IR,
+;; even if they're only present inside a DbgRecord, and thus not normally
+;; visible.
+
+; CHECK: %union.anon = type { %struct.a }
+; CHECK: %struct.a = type { i32 }
+
+; ModuleID = 'bbi-98372.ll'
+source_filename = "bbi-98372.ll"
+
+%union.anon = type { %struct.a }
+%struct.a = type { i32 }
+
+@d = global [1 x { i16, i16 }] [{ i16, i16 } { i16 0, i16 undef }], align 1
+
+define void @e() {
+entry:
+    #dbg_value(ptr getelementptr inbounds ([1 x %union.anon], ptr @d, i32 0, i32 3), !7, !DIExpression(), !14)
+  ret void, !dbg !15
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5}
+!llvm.ident = !{!6}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "foo.c", directory: "/bar")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 1}
+!5 = !{i32 7, !"frame-pointer", i32 2}
+!6 = !{!"clang"}
+!7 = !DILocalVariable(name: "f", scope: !8, file: !1, line: 8, type: !12)
+!8 = distinct !DISubprogram(name: "e", scope: !1, file: !1, line: 8, type: !9, scopeLine: 8, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !11)
+!9 = !DISubroutineType(types: !10)
+!10 = !{null}
+!11 = !{!7}
+!12 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !13, size: 16)
+!13 = !DIBasicType(name: "int", size: 16, encoding: DW_ATE_signed)
+!14 = !DILocation(line: 0, scope: !8)
+!15 = !DILocation(line: 8, column: 28, scope: !8)

@llvmbot
Copy link
Member

llvmbot commented Aug 29, 2024

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

When serialising to textual IR, there can be constant Values referred to by DbgRecords that don't appear anywhere else, and have types hidden even deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.


Full diff: https://github.com/llvm/llvm-project/pull/106547.diff

2 Files Affected:

  • (modified) llvm/lib/IR/TypeFinder.cpp (+11)
  • (added) llvm/test/DebugInfo/type-finder-w-dbg-records.ll (+43)
diff --git a/llvm/lib/IR/TypeFinder.cpp b/llvm/lib/IR/TypeFinder.cpp
index 003155a4af4877..d2ffe1ba77b4e3 100644
--- a/llvm/lib/IR/TypeFinder.cpp
+++ b/llvm/lib/IR/TypeFinder.cpp
@@ -88,6 +88,17 @@ void TypeFinder::run(const Module &M, bool onlyNamed) {
         for (const auto &MD : MDForInst)
           incorporateMDNode(MD.second);
         MDForInst.clear();
+
+        // Incorporate types hiding in variable-location information.
+        for (const auto &Dbg : I.getDbgRecordRange()) {
+          // Pick out records that have Values.
+          if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind)
+            continue;
+          const DbgVariableRecord &DVI = static_cast<const DbgVariableRecord &>(Dbg);
+          for (Value *V : DVI.location_ops()) {
+            incorporateValue(V);
+          }
+        }
       }
   }
 
diff --git a/llvm/test/DebugInfo/type-finder-w-dbg-records.ll b/llvm/test/DebugInfo/type-finder-w-dbg-records.ll
new file mode 100644
index 00000000000000..d2daabc40a3ac3
--- /dev/null
+++ b/llvm/test/DebugInfo/type-finder-w-dbg-records.ll
@@ -0,0 +1,43 @@
+; RUN: opt --passes=verify %s -o - -S | FileCheck %s
+
+;; Test that the type definitions are discovered when serialising to LLVM-IR,
+;; even if they're only present inside a DbgRecord, and thus not normally
+;; visible.
+
+; CHECK: %union.anon = type { %struct.a }
+; CHECK: %struct.a = type { i32 }
+
+; ModuleID = 'bbi-98372.ll'
+source_filename = "bbi-98372.ll"
+
+%union.anon = type { %struct.a }
+%struct.a = type { i32 }
+
+@d = global [1 x { i16, i16 }] [{ i16, i16 } { i16 0, i16 undef }], align 1
+
+define void @e() {
+entry:
+    #dbg_value(ptr getelementptr inbounds ([1 x %union.anon], ptr @d, i32 0, i32 3), !7, !DIExpression(), !14)
+  ret void, !dbg !15
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5}
+!llvm.ident = !{!6}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "foo.c", directory: "/bar")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 1}
+!5 = !{i32 7, !"frame-pointer", i32 2}
+!6 = !{!"clang"}
+!7 = !DILocalVariable(name: "f", scope: !8, file: !1, line: 8, type: !12)
+!8 = distinct !DISubprogram(name: "e", scope: !1, file: !1, line: 8, type: !9, scopeLine: 8, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !11)
+!9 = !DISubroutineType(types: !10)
+!10 = !{null}
+!11 = !{!7}
+!12 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !13, size: 16)
+!13 = !DIBasicType(name: "int", size: 16, encoding: DW_ATE_signed)
+!14 = !DILocation(line: 0, scope: !8)
+!15 = !DILocation(line: 8, column: 28, scope: !8)

Copy link

github-actions bot commented Aug 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 94 to 100
// Pick out records that have Values.
if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind)
continue;
const DbgVariableRecord &DVI = static_cast<const DbgVariableRecord &>(Dbg);
for (Value *V : DVI.location_ops()) {
incorporateValue(V);
}
Copy link
Contributor

@OCHyams OCHyams Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Pick out records that have Values.
if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind)
continue;
const DbgVariableRecord &DVI = static_cast<const DbgVariableRecord &>(Dbg);
for (Value *V : DVI.location_ops()) {
incorporateValue(V);
}
if (const DbgVariableRecord*DVI = dyn_cast<DbgVariableRecord>(&Dbg)) {
for (Value *V : DVI->location_ops())
incorporateValue(V);
if (Value *Addr = DVI->getAddress(); Addr && DVI->isDbgAssign())
incorperateValue(Addr);
}

Better to use the dyn/isa casting rather than checking the subclass discriminator directly (if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind))?

I think we also need to add dbg_assign address component values too (there's no operand iterator that includes that sadly).

I've incorporated both those fixes in the suggestion.


define void @e() {
entry:
#dbg_value(ptr getelementptr inbounds ([1 x %union.anon], ptr @d, i32 0, i32 3), !7, !DIExpression(), !14)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly worth adding dbg_assign and dbg_declare for completeness? (dbg_assign more important here imo, given the extra fields)

Comment on lines 95 to 98
if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind)
continue;
const DbgVariableRecord &DVI =
static_cast<const DbgVariableRecord &>(Dbg);
Copy link
Contributor

@nikic nikic Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer dyn_cast?

Suggested change
if (Dbg.getRecordKind() != DbgRecord::Kind::ValueKind)
continue;
const DbgVariableRecord &DVI =
static_cast<const DbgVariableRecord &>(Dbg);
if (auto *DVI = dyn_cast<DbgVariableRecord>(Dbg)) {

@jmorse
Copy link
Member Author

jmorse commented Aug 29, 2024

Switched to the dyn_cast approach and added a #dbg_assign test. We'll want an iterator in the future that does this for us (give us all the values in some DbgRecord), but not today.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want an iterator in the future that does this for us (give us all the values in some DbgRecord), but not today.

+1
I think DbgRecords currently reflect the (also poor) interface we use in intrinsics, but it's probably it's more painful for records because a number of sites that would iterate over intrinsic arguments (and thus find all values) now need to do the kind of thing you've done here.

LGTM. @nikic do you have any further comments?

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM apart from formatting...

@jmorse
Copy link
Member Author

jmorse commented Aug 30, 2024

@OCHyams see latest commit, some twiddling is required as the Address field of a #dbg_assign might have gone to null. I can't remember the exact sequence of events that causes this, but there's precedent with DbgVariableRecord::location_ops. Possibly this is because the code+comments have been brought over from the intrinsic-format dbg.assign implementation, but with debug-records the DbgRecord is the ValueAsMetadata?

Either way, could you confirm this is alright.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Comment on lines 99 to 100
if (Value *Addr = DVI->getAddress(); Addr && DVI->isDbgAssign())
incorporateValue(Addr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the latest change to check allow for DIArgList in getAddress: it's not permitted for dbg_declares or dbg_assigns address components to use DIArgList. getAddress returns the first DebugValue operand unless it's a dbg_assign - then it returns the second. The only reason we're seeing an DIArgList pop up in getAddress is because its being called before checking if DVI->isDbgAssign(): some dbg_value with a DIArgList is having getAddress called on it. I thought this would be fine when I wrote the suggestion above but clearly it breaks assumptions.

I think it's probably better to restructure this to:

if (DVI->isDbgAssign())) {
  if (Value *Addr = DVI->getAddress())
    incorporateValue(Addr);
}

And revert the most recent getAddress change- wdyt? Sorry for the additional hassle.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, sorry for the additional faff

@jmorse jmorse merged commit 25f87f2 into llvm:main Sep 2, 2024
6 of 8 checks passed
@mikaelholmen
Copy link
Collaborator

Thanks @jmorse !

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Sep 3, 2024
When serialising to textual IR, there can be constant Values referred to
by DbgRecords that don't appear anywhere else, and have types hidden
even deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.

(cherry picked from commit 25f87f2)
tru pushed a commit to llvmbot/llvm-project that referenced this pull request Sep 3, 2024
When serialising to textual IR, there can be constant Values referred to
by DbgRecords that don't appear anywhere else, and have types hidden
even deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.

(cherry picked from commit 25f87f2)
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Mar 26, 2025
When serialising to textual IR, there can be constant Values referred to
by DbgRecords that don't appear anywhere else, and have types hidden
even deeper in side them. Enumerate these when enumerating all types.

Test by Mikael Holmén.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants