-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Rust attribute to support embedding debugger visualizers #3191
Conversation
I can totally see people wanting to derive natvis files using (proc) macros. |
I'm a bit wary of adding super platform-specific stuff to Cargo.toml (and this is coming from a windows user...) I think it would be useful to look at prior art on other platforms, to see if there's some common denominator there. It looks like natvis is also supported with GDB in vscode, although I don't know if there's any way to embed natvis files into the debug info in that case. |
Idk, quite often it’s one way for windows and one way for unix systems.
What matters is being able to have a better debug experience. If that means
we have to do two things to cover all platforms that’s not too arduous.
…On Sat, 6 Nov 2021 at 02:11, Diggory Blake ***@***.***> wrote:
I'm a bit wary of adding super platform-specific stuff to Cargo.toml (and
this is coming from a windows user...)
I think it would be useful to look at prior art on other platforms, to see
if there's some common denominator there.
It looks like natvis is also supported with GDB in vscode, although I
don't know if there's any way to embed natvis files into the debug info in
that case.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#3191 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAGEJCFCMCBXSK6HOSL3HZDUKSFE5ANCNFSM5HJ5SMPQ>
.
|
I agree with this as well. Other platforms do not have a way to embed this kind of debug information automatically. VSCode uses MIEngine which only has a subset of the Natvis framework ported to it. This applies natvis to types within the debugger but the natvis is not embedded into the debug info. This means having to find the natvis for a specific version of a crate while debugging to get the correct behavior. This would put onus on the Rust developer who is consuming said crate to find the correct version of natvis for the version of the crate they are using. I believe having this solution for Windows is a great first step since these platforms are extremely different in how they support debugger visualizations. |
So, I think it is a great idea to support natvis in an ergonomic way. Even though natvis is specific to certain platforms/debuggers and so is intrinsically not cross-platform, I think it is important for this RFC to consider other platforms so that we are sure that it is possible for crates to support multiple formats for different platforms, ideally using a single mechanism or very similar mechanisms (not the same visualisation format, but the same mechanism for Cargo to discover the visualisations and notify them to rustc, maybe). On a lower level, how much does rustc need to do? Is it literally just passing filenames to the linker? Or does it need to do some symbol (de-)mangling? Integration with debuginfo support? How exactly will natvis files be saved in the metadata? Finally, I much prefer the auto-discovery alternative. It seems much more inline with Cargo's 'convention over configuration' philosophy - we do not list source files, for example. |
Hi, @nrc. I'm one of the co-authors of this RFC. Can you be more specific about this request? The other visualization formats supported by other debuggers are quite different in how they are implemented, packaged, and used. This RFC doesn't prevent implementing support for those debugger visualizations. In fact, it proposes a filesystem schema specifically for avoiding collisions, by using If there is something more to do here, we are certainly open to that. To me, it appears that the main requirement is not to interfere with implementing support for other debug formats, and I think this proposal meets that requirement. Because different debugger visualizers are so different in how they work, I don't see much of an opportunity to "use a single mechanism or very similar mechanism". Again, I'm open to that, but I just don't see any reusability here.
Yes, that's pretty much all there is to it. The only complexity here lies in the requirement to propagate these filenames across the dependency graph. That is, when you invoke the linker, you need to pass the linker options for all crates in the dependency graph, not just the current crate. In my original prototype implementation, I did this purely in Cargo, because Cargo knows the entire dependency graph and so it could walk it and find all of the NatVis files. However, when I discussed this design with some folks on the Cargo channel on Zulip, they felt that this was not a good approach because it would not work correctly for build system that were directly driving We also favor auto-discovery. However, like many options in Cargo, I think the developer should still be able to override auto-discovery. |
Hi, so I'm personally not familiar with how any debuggers support visualisations, so it would be useful for me to have a short survey of how that works so that I can evaluate the similarities/differences myself. I assume that is also the case for others who want to evaluate this RFC too. If they are so different, then that would be enough. Since I don't know the other mechanisms, I'm making a complete strawman here, but suppose that GDB takes a python file as a visualisation input, then that suggests that the Cargo manifest syntax should be something like
(again, totally making up the keywords as I go along). In that case, I would suggest this RFC propose this syntax (without the gdb case, leaving that for future work), rather than having
I would say that not only must we not interfere, we should leave scope for supporting other formats in a clean way.
I'm not sure if it makes sense, but I hope the hypothetical example above makes clear the extent of reusability I'm imagining, it's not much of an iteration over the current state of the RFC.
That sounds fine. I think it would be nice to specify this in the RFC in a bit of detail so that we can evaluate the scope of changes being proposed
Do you have an example of something you are following with this design? My intuition is that natvis files should be treated like source files, for which there is no mechanism to list them (not sure about overriding auto-discovery for source files, there is a mechanism in Rust source, but I'm not aware of a mechanism in Cargo). Are there typically many natvis files in a project, or usually just one (or one per target)? My preference would be not to keep metadata in the src directory and thus have a separate directory for such files, but I guess I would feel less strongly if it were just one file. I'm not sure of any precedent for keeping metadata in src vs in a separate directory, but it would be good to know about any if there is any. |
This RFC initially suggested having the
I've explained at a fairly high level in the reference level explanation the scope of changes this RFC introduces in terms of new toml syntax and rustc flags as well as needing to make changes to the crate metadata. I will go into more detail in this section to show what specific changes need to be made and what fields need to be added to a CrateMetadata object.
There is typically one Natvis file per library i.e. one file for a static lib and a separate for a shared DLL. This helps to break down where Natvis visualizations are defined for types. In the case of a Rust crate, you could imagine each crate having a single Natvis file that defines the debugger visualizations for types within that crate.
There is no reason as to why this would need to be kept in the src directory. Natvis files can be stored anywhere but they must be published as part of the crate. This way other crates that consume said crate would be able to pass the |
@nrc I like your suggested TOML structure, of:
About the question of where to put the files, and how many files there are. For typical C++ libraries, there is often just a single NatVis file, but it certainly does happen that a library is large enough to warrant breaking NatVis into more than one file, for the benefit of developers organizing things. So my expectation is that your typical crate will have just one, or a small number, of NatVis files. When I wrote the first draft of this spec, I thought just having a Or, we could simply say that the scenario of having different NatVis for different units is overkill, and that the same set of NatVis files should be used for all units in a given crate. That way, we just put them into
I think that's a reasonable request, mainly because it will illustrate just how different they are. For GDB, it's a set of Python scripts (I think? might be wrong). For others (WinDbg), it's a native, compiled DLL that gets loaded into a debugger. WinDbg also supports JavaScript-based extension, too. Some of these can be linked into target binaries, or into their debug symbols (PDBs), but others have no convenient way to package the debugger extensions in a way that is discoverable by the debugger -- you just have to know how to find it. |
It sounds like it is worth digging in to this requirement. Is there an equivalent thing for C++/VS projects? I know some projects have extensive support for testing infrastructure, and that might warrant separate debug metadata. However, one could also use a separate crate in workspace for this. I can't imagine want separate visualisations for most tests? I can imagine a crate has both a library and binary targets and the latter has some data types (with visualisations) which are not in the library. Could this be handled with a single natvis file? Relatedly, if a crate has datatypes which are present or not depending on |
No, because in VS a single project produces a single output (EXE, DLL, LIB). There's no concept of "units". On balance, I think we should start with the simpler model: For a given crate, the user defines a set of NatVis files. Those NatVis files are compiled into the primary crate unit (exe or rlib), and into the unit test crate unit. For benchmarks, integration tests, and bins, we don't include the NatVis files, because those compilations already reference the primary crate, which already contains the NatVis files. I think this is best because:
I think, for this V1, the NatVis file would contain definitions for the union of all features. As long as features are additive (and they are supposed to be), that should work fine. It means that some definitions may be ignored or useless when their corresponding features are not activated. I think that's OK. When I originally drafted this RFC, I included two means to package NatVis into crates. The first is the simple "whole file" method, which is what is described here. The second was to allow users to include NatVis fragments directly into Rust source code, such as: natvis!(r#"
<Type Name="my_crate::MyFancyType">
<DisplayString> ... </DisplayString>
<Expand> ... </Expand>
...
</Type>
"#); Or, to provide it as an attribute on a type, so that the compiler could generate exactly the right type string, for matching descriptions to mangled types. This would free the developer from needing to write the #[natvis(xml = r#"
<DisplayString> ... </DisplayString>
<Expand> ... </Expand>
")]
pub struct MyFancyType { ... } Rustc would then find all of these and assemble them into a single XML document, and bundle that into the PDB. If we allow users to embed NatVis fragments directly into source code, then we can allow So our plan is, stabilize the V1 support, and then to submit PRs to various crates, so that we can improve the debug experience of those crates. Then, when the value of this becomes apparent, we (with the community) can concurrently be working on spec'ing V2, with direct source embedding. |
I posted one comment, but otherwise this looks good to me. I think it's completely reasonable to do this in two phases, with the first requiring separate files and the second allowing inline attributes that get extracted. |
Hi @joshtriplett thanks for taking the time to review this RFC. The goal of this RFC is to see stabilization of this feature at some point in time. As such, I was wondering if the flags I have suggested in the RFC are appropriate. There's a couple of ways this could be achieved.
Thoughts? |
@ridwanabdillahi I personally would propose starting with |
@joshtriplett that sounds good to me. I'll update the RFC to reflect as such. |
I believe I've responded to all of the comments, is there anything else that needs to be addressed in this RFC? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thanks for addressing all my earlier comments. I have a few more inline, but nothing huge.
The one 'big' thing is that I still think the "Auto-discover Natvis XML files" option is more idiomatic for Cargo then explicitly listing all the natvis files, but we can leave that up to Cargo team review.
In response to #3191 (comment) (and just general pondering). I wonder if it is worth thinking a bit about integration with Cargo features at least (and perhaps other sources of cfg input such as the target platform). I don't want to get too much into discussion of what v2 would look like, but I have a strong preference to avoid inlining the natvis info and keeping it in separate files. Given this possibility, I want to make sure there is scope in the proposal to extend it to including different natvis depending on features, etc. We don't have to go into too much detail, obviously. Could we imagine extending the Cargo manifest entry to include features, etc. in the same way as dependencies? I think it should be easy enough some how, and perhaps this is a good justification for keeping the explicit filename list rather than auto-discovery? |
One more thought, on toolchains which don't support adding natvis into the PDB during linking, should Cargo or rustc emit all the specified natvis files into a location in the target directory so that they can be easily accessed by the user? |
…ichaelwoerister Add support for embedding pretty printers via `#[debugger_visualizer]` attribute Initial support for [RFC 3191](rust-lang/rfcs#3191) in PR rust-lang#91779 was scoped to supporting embedding NatVis files using a new attribute. This PR implements the pretty printer support as stated in the RFC mentioned above. This change includes embedding pretty printers in the `.debug_gdb_scripts` just as the pretty printers for rustc are embedded today. Also added additional tests for embedded pretty printers. Additionally cleaned up error checking so all error checking is done up front regardless of the current target. RFC: rust-lang/rfcs#3191
…ichaelwoerister Add support for embedding pretty printers via `#[debugger_visualizer]` attribute Initial support for [RFC 3191](rust-lang/rfcs#3191) in PR rust-lang#91779 was scoped to supporting embedding NatVis files using a new attribute. This PR implements the pretty printer support as stated in the RFC mentioned above. This change includes embedding pretty printers in the `.debug_gdb_scripts` just as the pretty printers for rustc are embedded today. Also added additional tests for embedded pretty printers. Additionally cleaned up error checking so all error checking is done up front regardless of the current target. RFC: rust-lang/rfcs#3191
I've just become aware of this RFC and skimmed at least all of comments here. One thing that I don't see discussed in the thread (although it is mentioned in the RFC text itself) is why, specifically, we can't use
The idea of just using use regex::Regex;
fn find_c_defines(input: &str) {
let rx = Regex::new(r#"^#define\s+(\w+)\s+([0-9]+)\s*(//(.*))?"#).unwrap();
for captures in rx.captures_iter(input) {
println!("{:?}", captures);
}
}
fn main() {
find_c_defines("#define SOME_CONSTANT 42");
} It outputs:
And this works because the derived Looking at the natvis example in the RFC:
This looks like some kind of interpreter is needed for this anyway? How does this work? And why can't So I guess my question boils down to: is supporting |
We've actually prototyped the MIR interpreter approach, all the way through calling Debug imple. It's compelling, it's a great experience. But it's several orders of magnitude more complexity to implement and deploy. There are some nontrivial things to figure out, such as how do you store the MIR after compilation, how do you interpret it against the state of a running program, or even against a crash dump. How do you manage the problem of multiple compiler versions, etc. We'd like to do that work, and do it the right way (open collaboration, RFC, etc.). But meanwhile, in parallel, we felt that NatVis could provide a lot of value, relatively quickly, and with very low complexity cost and low resource cost. We can view them as complementary, overlapping approaches. We can get immediate benefits from NatVis, while working toward the interpreter approach. The interpreter approach also enables some really wild capabilities, such as writing debug extensions directly in source code. Not just Debug impls, but arbitrary debugger commands / extensions, using a dyn trait obj that represents the hosting debugger environment. |
The RFC lays out a few different approaches to the "debugger visualizer" problem but I think they can be mostly summed up in three categories:
Having said all of that, this RFC being accepted does not mean we close the door to options 2 or 3. For Rust to integrate well with the underlying platform, we need to have a solid implementation of option 1, which this RFC provides. It would be great to go above and beyond that as well! |
I see, okay. I think that makes sense. For context, I didn't even know (2) and (3) were distinct choices. So there is a lot that I don't know here. For (1), it sounds like what you're saying is that the natvis file works because the debugger has its own interpreter. To what extent does that work exactly? What would a natvis file for a I think I'm just having a hard time seeing how option (1) materially improves things if folks don't put in the work to maintain these debugging files alongside their crates. Have library authors been consulted about the overall user experience here? (I'm still pretty unclear on it myself. It's hard to even know what's possible exactly.) To be clear, I see what y'all are saying. And the progression here makes sense. And I also agree that having a way to expose the underlying debugger's functionality is also a good idea. And I love the idea of making the debugging UX better. Maybe I'm just jumping in too early here before the docs for how to effectively use this feature are written. :) |
Those are all fair questions! 🙂
I'm not quite sure what you're asking here but from what I've seen, most debuggers offer some kind of "object model" based on the debug info present for a specific type. Since this is based on debuginfo, field privacy doesn't matter since the evaluator/interpreter goes through the debugger itself rather than the Rust compiler/type system.
I thought I'd seen one somewhere in the RFC but I can't find it now. I don't know if @ridwanabdillahi has a suggested implementation of this available or not.
In theory yes, but it has the problems I mentioned above. In optimized programs, the symbol might not exist because it was optimized away (even if used, it might be inlined into all the callsites and then the specific symbol is removed). There's also the problem of ABI but that is solvable.
I think there are reasonable paths to improve things in this regard. Using the features described in this RFC, we could pretty easily imagine some combination of:
It's good feedback and we should make sure we incorporate this into the documentation for the feature! |
@wesleywiser wrote in #3191 (comment):
Perhaps there is a fourth way? I was experimenting with code similar to this: fn show_debug(dyn_debug: &dyn std::fmt::Debug) {
println!("{dyn_debug:?}\n");
} then doing let trait_object: &dyn std::fmt::Debug = &"example value"; you can't do this in gdb: I tried to work around this limitation of gdb by writing a helper function to do this for me, however I was not successful. I can't use a generic function because it might be not monomorphised for the type of the value I want to look at. I experimented with I think gdb should be able to cast to a trait object. Then we might have a simple solution for debugger visualisations with I propose to go upstream and ask for the feature of casting to trait objects. I didn't look at Windows debuggers, but I can imagine that they already have this covered. |
An What follows are my ideas/dreams for the far future of rust debugging. Notably, they have yet to be implemented so it's unclear to me what exactly is required from a technical perspective. It also builds upon scenarios 2/3 from #3191 (comment) Using I'd propose adding a new Trait that can specify exactly this behavior. This would allow users to write rust code once that could then be provided to debuggers. When the trait is not implemented it simply uses the default behavior, similar to how Debug impls should be handled then. This trait would work similarly to Debug. There would be one method to implement and a mutable formatter
Some more thoughts about this
|
Hello! FWIW, if that could simplify things (for |
This RFC adds support for a new Rust attribute that will embed a debugger visualizer into a PDB/ELF.
Internals thread
Rendered