debuginfo: How to (ideally) represent reference and pointer types in DWARF #37504
Comments
No.
I like both solution (1) and (2), no clear preference. Beware that traits and slices aren't the only things that can be DSTd, you also have |
This seems reasonable to me. |
C seems to use a |
For fat pointers I was thinking On the whole I'd rather the output be much more explicit. That is, instead of determining whether something is a slice by checking its name, introduce
Thin pointers, probably not. Fat pointers, sure. Ideally it should be possible for gdb to create a trait object; though of course we're still a ways away from that. |
What does "DSTd" mean? |
I assume "turned into a dynamically sized type". |
Aha, thanks. For dynamically sized types, we should just use the standard DWARF stuff. If the length is known then |
What are semantics of |
DWARF doesn't go into great detail here, but basically the answer is "just like C++". However, in DWARF it is also normal to reuse tags for different things depending on the CU's language; and for Rust I think the obvious answer is that a |
That's a good point. In a way a regular slice is just a special case of a struct with a trailing So, we have this:
I think it would be nice if Also note there is also a pointer variant for each of the above. If we have different tags for each of those, we would get quite a few tags:
This seems a bit excessive to me. We could also just have a |
To give an example of what the
In this particular form, we would still have to look at the target type to find out if we have a regular slice or a struct/enum with a trailing |
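To make the "struct with a trailing unsized field" case concrete, here is a minimal Rust sketch. The `Packet` type and the `describe` helper are hypothetical names made up for illustration; the point is only that a reference to such a struct is a fat pointer carrying the same kind of length metadata as a plain slice reference.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical example type: a sized header followed by a trailing field
// that is allowed to be unsized.
struct Packet<T: ?Sized> {
    header: u32,
    payload: T,
}

/// Reads the header and the payload length through a fat reference.
fn describe(p: &Packet<[u8]>) -> (u32, usize) {
    (p.header, p.payload.len())
}

fn main() {
    let sized = Packet { header: 7, payload: [1u8, 2, 3] };

    // Unsizing coercion: &Packet<[u8; 3]> becomes &Packet<[u8]>.
    let unsized_ref: &Packet<[u8]> = &sized;

    // The reference is as wide as a slice reference (data pointer + length).
    assert_eq!(size_of::<&Packet<[u8]>>(), size_of::<&[u8]>());
    assert_eq!(describe(unsized_ref), (7, 3));

    // The size of the pointee is only known at run time, via the metadata.
    println!("pointee occupies {} bytes", size_of_val(unsized_ref));
}
```

A debugger that only recognizes plain slices would miss this case, which is one argument for describing the metadata explicitly rather than by type name.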
Keep in mind that every new syntax you invent means new things you have to teach tools. GDB and LLDB aren't the only debuginfo consumers out there. So I would strongly lean towards semantics that a C++ tool would understand, even if it seems less ideal to your own taste. At least think about how an uninformed tool might interpret your proposed scheme, compared to the status quo. |
@cuviper Do you have examples of tools that use type information (as opposed to just line-tables)? |
One thing that might be problematic about using DWARF expressions for getting the element count or vtable address is optimizations that can pick apart fat pointers, like SROA. For those it might be better to have plain member DIEs? Though I'm not exactly sure how a debugger would handle this: if the value of a variable is calculated via a number of DW_OP_bit_piece operations, will the debugger reconstruct the value before evaluating an expression that takes the value as input? |
And my involvement on those means I could also work on adding new Rust semantics to them. I honestly haven't looked closely yet how well they interpret Rust's current DWARF output. I just hope more generally that tools could Just Work as much as possible. :) |
Wouldn't it have to reconstruct it? I don't see what else would make any sense. |
@cuviper Yes, that's a good point. We'll want to strike a balance between not doing everything differently from everybody else and doing things in a way that is a good fit for Rust. Regarding |
The counterpoint here is stuff like the existing representation of Rust enum types, which requires significant decoding in the debugger. In fact some new cases were just implemented this week. This is one reason I think it's better to just add new tags, along with helper attributes to describe things more precisely. I do agree that reusing existing tags makes sense when possible. Maybe this part of the discussion would be improved if it were more specific. For instance, how would you propose handling the cases under discussion here?
Yes, gdb does this already. I implemented it (:-) when gcc added debuginfo for SRA. |
Yeah, both ADTs and DSTs are Rust-specific types that don't have a C++ analogue, and pretending to be a C++ type for the sake of tooling will probably mean that the tools won't display the right thing anyway. |
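One way to see why Rust enums need decoding that a C++-oriented tool would not expect: `Option<&T>` stores no separate discriminant and encodes `None` as the null pointer. This sketch is only an illustration of that one layout rule; it is not necessarily among the cases referred to above, and `option_ref_is_pointer_sized` is a hypothetical helper name.

```rust
use std::mem::size_of;

/// True when Option<&u8> needs no extra space for a discriminant.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

fn main() {
    // The discriminant is folded into the pointer value itself,
    // so a debugger has to know this rule to tell Some from None.
    assert!(option_ref_is_pointer_sized());

    let x = 5u8;
    let some: Option<&u8> = Some(&x);
    let none: Option<&u8> = None;
    println!("some: {:?}, none: {:?}", some, none);
}
```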
I think we can achieve a lot of backwards compatibility (or easy portability) if we use the standard tags and attributes (like DW_TAG_member, DW_AT_byte_size, etc.) like everyone else does.
Can you elaborate on what you mean exactly? |
I think the request to be more specific was aimed at me. :) And... I'll have to think on it. But it sounds like we're all agreeing not to stray too far. It looks like the proposal for thin
On this point in particular, I don't think thin pointers need it, as @tromey said. I think it would be very helpful for fat pointers though, if nothing else just to raise a flag to the tools that it's abnormal. |
@tromey I found a message on gdb-patches which describes an Ada "unconstrained array" fat pointer. It's not the same layout as a Rust slice, but I think the same concepts could apply. What do you think of that representation? So a similar Rust
I suspect this will look more familiar to tools that already know VLAs. In any case, I think |
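For reference, the two pieces of information any slice encoding has to expose are the data pointer and the element count. A minimal Rust sketch of taking a slice apart and rebuilding it from those two values follows; `split` and `rebuild` are hypothetical helper names, and the sketch deliberately uses the safe accessors rather than assuming a particular in-memory field order.

```rust
/// Extracts the two values a slice fat pointer carries.
fn split(s: &[u32]) -> (*const u32, usize) {
    (s.as_ptr(), s.len())
}

/// Rebuilds a slice from a data pointer and an element count.
///
/// Safety: `ptr` must point to `len` initialized `u32` values that stay
/// valid and unmodified for the lifetime `'a`.
unsafe fn rebuild<'a>(ptr: *const u32, len: usize) -> &'a [u32] {
    unsafe { std::slice::from_raw_parts(ptr, len) }
}

fn main() {
    let data = vec![10u32, 20, 30];
    let (ptr, len) = split(&data);

    // This is conceptually what a debugger does when it constructs or
    // displays a slice from its data pointer and length.
    let again = unsafe { rebuild(ptr, len) };
    assert_eq!(again, &data[..]);
    println!("{} elements: {:?}", len, again);
}
```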
For this particular representation, I think the issue is that there's no obvious way to dynamically construct an instance. However, that's a reasonable thing to want to do. In fact right now gdb does it, though by baking in some knowledge of the Rust ABI -- but avoiding this is one of my goals. (Another important goal being winding up with something we can document and attempt to get into DWARF 6.) I've been giving this topic some thought tonight and I have a number of issues to raise, which in my mind generally point to the usefulness of adding new tags where needed; though naturally I value your insights. This is a bit unsorted it turns out. Maybe this isn't an ideal forum for this sort of discussion.
I think the current approach could be described as "keep it close-ish to C++ and hope the tools are ok". I found this pretty inadequate for gdb, and I suspect for lldb in the end the only answer will be a more full port. There are just too many differences and they are accumulating. |
I think it's also important to note that we won't be able to come up with an encoding that is just understood by existing tools. The current approach has the goal of not crashing existing tools while providing enough information for pretty printers to have some minimal functionality. I think we have reached the limits of this approach and we'll need to make breaking changes going forward anyway. I think we should just choose clean encodings that don't do anything fancy. That should help existing tools to add support with minimal effort. |
I've been thinking more about this, and have come around to see it's not so horrible for Rust to invent new syntax (tags/attrs) for things that are truly unique. However, I think we should avoid overloading existing constructs in surprising ways. Namely, (I don't know if the standard says anything about this, but I like having CAPS prefixes on non-standard extensions.) |
@cuviper It looks like DWARF information emitted by Rust still doesn't give any hints to distinguish between
Is this the only relevant issue / discussion or has there been some progress tracked elsewhere perhaps? |
I guess I can just use prefix of |
That's what gdb does and what I plan to do in lldb, at least in the short run. Longer term I think we should use DWARF tags to differentiate, as discussed here. |
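A name-based heuristic of this kind could look roughly like the sketch below. Everything in it is an assumption for illustration: the `PointerKind` enum, the `classify_by_name` function, and the specific prefixes (`&`, `&mut `, `*const `, `*mut `, `dyn `) are not taken from gdb or lldb, and are not necessarily the prefix meant above.

```rust
#[derive(Debug, PartialEq)]
enum PointerKind {
    Slice,
    TraitObject,
    Other,
}

/// True for names shaped like `[T]`, false for arrays like `[T; N]`.
fn is_slice_name(rest: &str) -> bool {
    if !(rest.starts_with('[') && rest.ends_with(']')) {
        return false;
    }
    let mut depth = 0;
    for c in rest.chars() {
        match c {
            '[' => depth += 1,
            ']' => depth -= 1,
            // A `;` directly inside the outer brackets means `[T; N]`.
            ';' if depth == 1 => return false,
            _ => {}
        }
    }
    true
}

/// Guesses the pointer kind from a type name alone (assumed name format).
fn classify_by_name(type_name: &str) -> PointerKind {
    // Longer prefixes first so that "&mut " wins over "&".
    let prefixes = ["&mut ", "&", "*const ", "*mut "];
    let rest = prefixes.iter().find_map(|p| type_name.strip_prefix(*p));
    match rest {
        Some(r) if is_slice_name(r) => PointerKind::Slice,
        Some(r) if r.starts_with("dyn ") => PointerKind::TraitObject,
        _ => PointerKind::Other,
    }
}

fn main() {
    for name in ["&[u8]", "&mut [u8]", "&[u8; 3]", "&dyn Debug", "*const i32"] {
        println!("{name} -> {:?}", classify_by_name(name));
    }
}
```

The amount of string parsing needed even for this small sketch is a fair argument for the longer-term plan of distinguishing these types by DWARF tags instead.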
@tromey wrote (#37504 (comment)):
@cuviper wrote (#37504 (comment)):
Three years later, I was looking through the DWARF5 spec in case there's anything potentially useful, and came across this Fortran example (page 320, "Figure D.13"):
There is also an earlier example in Appendix D that might be simpler in terms of Fortran features it describes, but is longer so I'm not going to paste it here. Overall, it looks like DWARF is designed to support fully dynamic multidimensional arrays and slices, which is more powerful than Rust needs. Given the DWARF5 spec, its examples, and the comments from years ago in this thread, I believe we may have a path forward if we choose to go down that route. The main problem I see, for handling slices like this (assuming LLVM and debuggers support the necessary features), is that And for |
This is a very old and long thread and it's been a while since I looked at details, but I'd like to point out that
is not entirely true, or at least, not any different from the situation in C / C++. Aside from references, Rust also has mutable and immutable variables, parameters and so on, just like C / C++ does. So when one says that Rust doesn't have And yet, even though C / C++ has comparable type semantics, it already has an established DWARF representation for these different types - by using the earlier mentioned "constifying newtypes". One thing that was brought up and still remains true is that for Rust such a representation is potentially more wasteful, because immutable types in Rust are much more common than in C / C++ due to the flipped defaults. This might still be true, but on the other hand the DWARF representation is fairly compact, and it would be worth measuring first whether introducing a new attribute really saves any noticeable amount of space compared to a separate type ref (which is essentially just a type tag + a reference to the inner type). For now, it would be great to unblock this issue and implement at least the suboptimal-but-already-supported-in-most-tools representation for immutable vs mutable references, and then we can iterate on it in future PRs. |
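For readers following the mutability argument, the language-level facts can be checked with a short Rust sketch: the mutability of a binding does not change its type, while `&T` and `&mut T` are distinct types. `type_id_of` is a hypothetical helper, and the sketch takes no side on which DWARF encoding is preferable.

```rust
use std::any::TypeId;

/// Returns the TypeId of the value behind a reference.
fn type_id_of<T: 'static>(_: &T) -> TypeId {
    TypeId::of::<T>()
}

fn main() {
    let a: i32 = 1;
    let mut b: i32 = 2;
    b += 1;

    // `mut` on a binding is a property of the variable, not of its type:
    // both variables have type i32.
    assert_eq!(type_id_of(&a), type_id_of(&b));

    // Reference mutability, by contrast, is part of the type.
    assert_ne!(
        TypeId::of::<&'static i32>(),
        TypeId::of::<&'static mut i32>()
    );
    println!("a = {a}, b = {b}");
}
```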
@rustbot label -C-tracking-issue |
Currently, we represent thin references and pointers with DW_TAG_pointer_type DIEs and fat pointers (slices and trait objects) as DW_TAG_structure_type DIEs with fields representing payload and metadata pointers. This is not ideal, and with debuggers knowing about Rust, we can do better. The question is what exactly we want the representation for these kinds of types to look like.

Some things seem pretty straightforward to me:
- References should be DW_TAG_reference_type DIEs.
- Raw pointers should be DW_TAG_pointer_type DIEs.

But beyond that, there are some decisions to be made:
(1) How do we represent mutability?
The C++ version of DWARF represents a const pointer like const char * with three separate type entries: a DW_TAG_pointer_type, whose target is a DW_TAG_const_type, whose target is the DW_TAG_base_type for char.

I think this is a bit verbose and I'm not sure it is entirely appropriate for Rust. Do we really have const and mut types? That is, does Rust have the concept of a mut i32 at the type level, for example? I mean there are mutable and immutable slots/memory locations and we have "mutable" and "shared" references, but those two things seem kind of different to me.

As an alternative to using DW_TAG_const_type for representing mutability, we could re-use the DW_AT_mutable attribute that is already defined in DWARF. In C++ DWARF it is used for mutable fields. We could use it for reference type and local variable DIEs:

(2) How to represent fat pointers?
The pointer types in C/C++ DWARF don't have DW_TAG_member sub-DIEs, since they are always just values. Fat pointers in Rust are different: they have one field that is a pointer to the data, and another field that holds additional information, either the size of a slice or the pointer to a vtable. These need to be described somehow.

I see a few options:
- A DW_TAG_pointer_type or DW_TAG_reference_type DIE with two fields that are described by DW_TAG_member sub-DIEs, both having the DW_AT_artificial attribute. @tromey once suggested for slices that the field entries have no name and the debugger determines which is which by the type (the size is always an integer type, the data is always a pointer type). This could also be extended for trait objects, since the data pointer will always be a pointer to a trait and the vtable-pointer will always be something else.
- A DW_TAG_slice_type DIE that follows the encoding above and borrows some other attributes for trait objects: a DW_AT_vtable_elem_location attribute holds the offset of the vtable field within the fat-pointer value, and a DW_AT_object_pointer attribute does the same for the data pointer. This is distinctly not how these attributes are used in a C++ context but it would be a nice fit, I think.
- DW_AT_object_pointer indicating data pointer field

Another question is: Should fat pointers (and thin pointers too, maybe) have a DW_AT_byte_size attribute that specifies their size explicitly?

cc @tromey, @Manishearth
See also #33073
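As background for the byte-size question, the sizes involved can be observed directly in Rust. This is a minimal sketch with a hypothetical helper `pointer_sizes`; it only shows that slice and trait-object references are twice as wide as thin references, which is what an explicit DW_AT_byte_size would communicate to a tool.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the sizes of a thin reference, a slice reference,
/// and a trait-object reference, in bytes.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),        // thin: data pointer only
        size_of::<&[u8]>(),      // fat: data pointer + length
        size_of::<&dyn Debug>(), // fat: data pointer + vtable pointer
    )
}

fn main() {
    let (thin, slice, object) = pointer_sizes();
    assert_eq!(thin, size_of::<usize>());
    assert_eq!(slice, 2 * thin);
    assert_eq!(object, 2 * thin);
    println!("thin = {thin}, slice = {slice}, trait object = {object}");
}
```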