-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of .debug_names lookups when DW_IDX_parent attributes are used #91808
Conversation
@llvm/pr-subscribers-lldb Author: Greg Clayton (clayborg) ChangesWhen a .debug_names table has entries that use the DW_IDX_parent attributes, we can end up with entries in the .debug_names table that are not full definitions. This is because a class that is foward declared, can contain types. For example:
"std::ios_base" is forward declared and it contains typedefs whose entries in the .debug_names table will point to the DIE at offset 0x0090cdd5. These entries cause our type lookups to try and parse a TON of forward declarations and waste time and resources. This fix makes sure when/if we find an entry in the .debug_names table, we don't process it if it has a DW_AT_declaration(true) attribute. Full diff: https://github.com/llvm/llvm-project/pull/91808.diff 1 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 4da0d56fdcacb..f54e8071d97b6 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -83,6 +83,10 @@ bool DebugNamesDWARFIndex::ProcessEntry(
DWARFDIE die = dwarf.GetDIE(*ref);
if (!die)
return true;
+ // Watch out for forward declarations that appear in the .debug_names tables
+ // only due to being there for a DW_IDX_parent.
+ // if (die.GetAttributeValueAsUnsigned(DW_AT_declaration, 0))
+ // return true;
return callback(die);
}
|
…utes are used. When a .debug_names table has entries that use the DW_IDX_parent attributes, we can end up with entries in the .debug_names table that are not full definitions. This is because a class that is foward declared, can contain types. For example: 0x0090cdbf: DW_TAG_compile_unit DW_AT_producer ("clang version 15.0.7") DW_AT_language (DW_LANG_C_plus_plus_14) DW_AT_name ("UniqueInstance.cpp") 0x0090cdc7: DW_TAG_namespace DW_AT_name ("std") 0x0090cdd5: DW_TAG_class_type DW_AT_name ("ios_base") DW_AT_declaration (true) 0x0090cdd7: DW_TAG_class_type DW_AT_name ("Init") DW_AT_declaration (true) 0x0090cdda: DW_TAG_typedef DW_AT_type (0x0090ce4e "std::_Ios_Seekdir") DW_AT_name ("seekdir") DW_AT_decl_file (0x11) DW_AT_decl_line (479) 0x0090cde4: DW_TAG_typedef DW_AT_type (0x0090ce45 "std::_Ios_Openmode") DW_AT_name ("openmode") DW_AT_decl_file (0x11) DW_AT_decl_line (447) 0x0090cdee: DW_TAG_typedef DW_AT_type (0x0090ce3c "std::_Ios_Iostate") DW_AT_name ("iostate") DW_AT_decl_file (0x11) DW_AT_decl_line (416) 0x0090cdf8: NULL "std::ios_base" is forward declared and it contains typedefs whose entries in the .debug_names table will point to the DIE at offset 0x0090cdd5. These entries cause our type lookups to try and parse a TON of forward declarations and waste time and resources. This fix makes sure when/if we find an entry in the .debug_names table, we don't process it if it has a DW_AT_declaration(true) attribute.
0b4bc90
to
0cc1be6
Compare
The change/explanation looks intuitive, but I remember having seen DIE entry with |
we should probably fix the underlying issue instead: #77696 |
No, it is ok for |
This is one fix that is needed. Quoted from an e-mail chain:
|
We still have binaries that are floating around for now that contain this issue and this was causing crashes. So it would be nice to fix this in LLDB for now and back this out after we have a stable and trustable .debug_names section? |
That's ok by me, but I still don't understand how the problem being solved here is related to IDX_parent. As I said in the email you quoted, we don't add links to parents that are forward declarations. When the parent chain of a DIE is built, if we see an Note that only |
Ok, minor correction: if there is no complete parent chain, it just defaults to the older implementation (which calls ProcessEntry). But my point still stands: I don't believe this is related to the presence of IDX_parent, if anything, the presence of complete IDX_parents prunes the number of calls to ProcessEntry (which was the point of the implementation in the first place)
|
What's an actual test case for this issue? The example given above doesn't look like it'd produce entries for the declaration of ios_base? Like here's something that looks equivalent, but is complete, and doesn't have a DW_IDX_parent on the nested typedef entry in the index: https://godbolt.org/z/efjbze3x1 it'd be good to have a reproducer for this to motivate the discussion... |
& FWIW, I think it is valid to include these declarations as entries, though not as named/index entries, per the spec:
So I think you could have a parent that's a declaration, represented as a nameless index entry. That'd allow faster comparisons when examining the entry for the named child that had such a parent - because you could potentially check that named entry's fully qualified name directly from the named entry and the parents - without needing to parse all the DIEs in the CU to walk and find the start of the parents. But I don't believe clang is producing such DWARF today. |
What is "nameless index entry"? I don't quite understand from the spec. |
I will see what I can do to repro with a minimal test case. But I do believe that this PR should go in as right now we have binaries with thousands of entries that are being found from the top down from a name entry and causes tons of bogus type resolving lookups to happen. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think #77696 is justification enough to add this check, but I don't think we should be implicating DW_IDX_parent_entries until we know how those came about. Both me and David have tried to produce entries like you describe and failed (in fact I was not able to produce a DW_IDX_parent_entries referring to any kind of class type whatsoever (only to namespace entries)).
Even when this check goes in, I think it would be useful to get to the bottom of the DW_IDX_parent_entries issue, if nothing else, then because it may affect the decision of when/if to remove this workaround.
// Watch out for forward declarations that appear in the .debug_names tables | ||
// only due to being there for a DW_IDX_parent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Watch out for forward declarations that appear in the .debug_names tables | |
// only due to being there for a DW_IDX_parent. | |
// Clang erroneously emits index entries for declaration DIEs in case when the | |
// definition is in a type unit (llvm.org/pr77696). Weed those out. |
My interpretation of this paragraph is that we're allowed to insert extra "floating" entries into the debug_names table. Such entries must not be reachable from the hash or name tables, but they can be referred to via DW_IDX_parent_entry attributes of other (reachable) entries. Using Greg's example from #91808 (comment), I believe we could add a debug_names entry for |
This is pretty much my point, the workaround is fine, but I don't think the source-code comment in this PR is. The suggestion provided LGTM. My guess is that the examples Greg is encountering are just hitting the slowness of DWARF parsing through other code paths (anything that causes too many calls to "ProcessEntries", and IDX_parent addressed one of those, @Michael137 was just asking me the other day if we can address some of the other queries as well).
Yup, this is very doable!
When I compile the following with
|
I do get it. It was my bad, I had a leftover -fdebug-types-section flag in my test cmd from when I was experimenting with the debug types section. I guess that's another thing to chalk up to #77696 |
Using example above, with a fix by @dwblaikie I see:
Since clang no longer emits entry for:
Is this the right behavior?
It doesn't emit entry for this because there is no name attribute
|
I would say "yes", because the spec is pretty explicit about excluding DW_AT_declaration entries. I can see a case being made that DW_AT_signature should be treated the same way as DW_AT_specification and DW_AT_abstract_origin (i.e., transparently), but that's definitely not what the spec says right now.
This gets a bit fuzzy, I think. The spec appears to allow this behavior (In such a (IANAL YMMV) |
But what does the
We can discuss this, but I think the point is going to be moot given what I mentioned above. The debug_names section is reflecting the state of In the case where the declaration is there, debug_names will have correct info for
Why do you say this is stretching the meaning of parent? This looks fine to me, but it seems impossible to emit such debug_names section if the compiler is no longer emitting the declaration with the type signature. (we'd need to check if the emitter code has some way of finding the definition, but also how could it possibly know there is a type between any two Parent-Child nodes? It really feels like we can't just elide the definition). |
For "parent point to the definition in the type unit". That definition will be in a different type unit, correct? Is that allowed? For one entry DW_IDX_type_unit will have one index, and it's parent will have another index. There is also DWP scenario. Where we don't know which TU will be picked. So chain might be pointing to "discarded" TU from a different CU. So in this particular case what would the nameless entry look like? |
For clang I used example here with TOT clang. Which included this change: bd5c636
clang++ main.cpp -fuse-ld=lld -g2 -O0 -fdebug-types-section -gpubnames -o main.exe |
To elaborate, I suggested Zequan does this, because I think there's consensus that this is a good way to filter out (incorrect) declaration dies from the index, and it's a better (faster) fix than what Zequan had in his PR. It's still worthwhile to figure out where those entries are coming from. We know of one case with type units and that has now been fixed (thanks to David), if there are more, we should try to understand where they are coming from). |
…ing when parsing declaration DIEs. (llvm#92328) This reapplies llvm@9a7262c (llvm#90663) and added llvm#91808 as a fix. It was causing tests on macos to fail because `SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map owned by this symol file. When there were two symbol files, two different maps were created for caching from compiler type to DIE even if they are for the same module. The solution is to do the same as `SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: inquery SymbolFileDWARFDebugMap first to get the shared underlying SymbolFile so the map is shared among multiple SymbolFileDWARF.
Sounds good! |
thanks for well-written summary @dwblaikie ! |
Yes, although it wasn't clear to me how lldb would handle it. Thanks for elaborating. :) |
@felipepiovezan
This seems like a bug no?
In TU 0
I think it should be pointing to:
|
So I am trying to lookup So when I lookup The way I saw this was to enable DWARF logging in LLDB and if I did that I saw thousands of Thanks for the detailed explanations above everyone. I would vote to get this in to reduce churn when doing type lookups, even if we do fix the compiler, linker, and or other tools to not produce them in the future, we still have such binaries floating around from past compilers, linkers and other tools. |
Still would be nice to have a small repro to make sure clang, and BOLT, now does the right thing. |
One easy question would be: do you/your users use -fdebug-types-section? If so, that'd probably explain what you were seeing & you could add some test coverage for that wherever you like (in lldb, presumably, maybe in bolt too). But if you/they don't, then it's unclear where this bad index came from - and it'd be good to know more about the DWARF so we could figure out where else there might be problems like this... |
Yes -fdebug-types-section is the default for us. |
declaration. Summary: Previously when an entry was skipped in parent chain a child will point to the next valid entry in the chain. After discussion in llvm#91808 this is not very useful. Changed implemenation so that all the children of the entry that is skipped won't have DW_IDX_parent.
declaration. Summary: Previously when an entry was skipped in parent chain a child will point to the next valid entry in the chain. After discussion in llvm#91808 this is not very useful. Changed implemenation so that all the children of the entry that is skipped won't have DW_IDX_parent.
You are right. The fact that they have the same relative offset tells me that some part of the code is failing to account for TUs. I just checked the printing code in the hope that it was a mistake while dumping, but it doesn't seem to be... |
yeah not print tools, at assembly it points to wrong thing. :( Do you think you would be able to fix? :D |
I did some looking around on the CU/TU collisison and found/filed this: #93886 - should be more explicit/clearer |
This fix was committed as part of: commit 51dd4ea
|
PSA: I've reverted Zequan's patch due to other bugs, so this change is now uncommitted again. Since this could go in as a separate change, I'm going to recreate a patch for it tomorrow (running out of time today), including the fix from #94400, if someone doesn't beat me to it. I'm sorry about all the back and forth. |
This makes sure we try to process declaration DIEs that are erroneously present in the index. Until bd5c636, clang was emitting index entries for declaration DIEs with DW_AT_signature attributes. This makes sure to avoid returning those DIEs as the definitions of a type, but also makes sure to pass through DIEs referring to static constexpr member variables, which is a (probably nonconforming) extension used by dsymutil. It adds test cases for both of the scenarios. It is essentially a recommit of llvm#91808.
This makes sure we try to process declaration DIEs that are erroneously present in the index. Until bd5c636, clang was emitting index entries for declaration DIEs with DW_AT_signature attributes. This makes sure to avoid returning those DIEs as the definitions of a type, but also makes sure to pass through DIEs referring to static constexpr member variables, which is a (probably nonconforming) extension used by dsymutil. It adds test cases for both of the scenarios. It is essentially a recommit of #91808.
When a .debug_names table has entries that use the DW_IDX_parent attributes, we can end up with entries in the .debug_names table that are not full definitions. This is because a class that is foward declared, can contain types. For example:
"std::ios_base" is forward declared and it contains typedefs whose entries in the .debug_names table will point to the DIE at offset 0x0090cdd5. These entries cause our type lookups to try and parse a TON of forward declarations and waste time and resources.
This fix makes sure when/if we find an entry in the .debug_names table, we don't process it if it has a DW_AT_declaration(true) attribute.