-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support demangling the new Rust mangling scheme (v0). #23
Conversation
Introduce Rust symbol mangling scheme. This is an implementation of a "feature-complete" Rust mangling scheme, in the vein of rust-lang/rfcs#2603 - but with some differences, see rust-lang/rfcs#2603 (comment) for details. The demangling implementation PR is rust-lang/rustc-demangle#23 (this PR already uses it via a git dependency, to allow testing). Discussion of the *design* of the mangling scheme should still happen on the RFC, but this PR's specific implementation details can be reviewed in parallel. <hr/> *Notes for reviewers*: * only the last 4 commits are specific to this branch, if necessary I can open a separate PR for everything else (it was meant to be its own small refactoring, but it got a bit out of hand) * the "TEMPORARY" commit is only there because it does some extra validation (comparing the demangling from `rustc-demangle` to the compiler's pretty-printing, adjusted slightly to produce the same output), that I would like to try on crater * there is the question of whether we should turn on the new mangling now, wait for tools to support it (I'm working on that), and/or have it under a `-Z` flag for now r? @nikomatsakis / @michaelwoerister cc @rust-lang/compiler
This looks fantastic to me, thanks so much for doing this @eddyb! No need to get it working that far back, but works either way :) I'll give the new pieces a closer review as well when the Rust PR is closer to landing |
Introduce Rust symbol mangling scheme. This is an implementation of a "feature-complete" Rust mangling scheme, in the vein of rust-lang/rfcs#2603 - but with some differences, see rust-lang/rfcs#2603 (comment) for details. The demangling implementation PR is rust-lang/rustc-demangle#23 (this PR already uses it via a git dependency, to allow testing). Discussion of the *design* of the mangling scheme should still happen on the RFC, but this PR's specific implementation details can be reviewed in parallel. <hr/> *Notes for reviewers*: * only the last 6 commits are specific to this branch, if necessary I can open a separate PR for everything else (it was meant to be its own small refactoring, but it got a bit out of hand) * the "TEMPORARY" commit is only there because it does some extra validation (comparing the demangling from `rustc-demangle` to the compiler's pretty-printing, adjusted slightly to produce the same output), that I would like to try on crater * there is the question of whether we should turn on the new mangling now, wait for tools to support it (I'm working on that), and/or have it under a `-Z` flag for now r? @nikomatsakis / @michaelwoerister cc @rust-lang/compiler
@alexcrichton We want to land the PR soon, and I've updated the naming of the new mangling scheme from I don't think I want to change it any further now, so it should be fine to review now. |
The code all here looks great to me, thanks again for this @eddyb! My main comment would be that I don't really know how to verify the correctness of this or otherwise take a look at a reference somewhere. Is there reference documentation to put somewhere? Or otherwise could comments be inserted in the code namely around some Other than that I think this is good to merge and publish! |
@alexcrichton So rust-lang/rfcs#2603 (comment) has a grammar for this mangling scheme, which @michaelwoerister added to the RFC itself, and I could put comments on relevant functions containing the respective grammar rules, if you think it would help. |
As for correctness: this implementation was verified against an expected demangling (see the Also, a C implementation (which we might want to host in this repo, not sure) was also verified against the stc+rustc+Cargo dataset (1M symbols), and found to produce identical results. EDIT: none of this, of course, means the mangler+demangler obey the spec, only that if one of them does, the other one likely does too. |
Ok that all sounds good to me, thanks! |
Introduce Rust symbol mangling scheme. This is an implementation of a "feature-complete" Rust mangling scheme, in the vein of rust-lang/rfcs#2603 ~~- but with some differences, see rust-lang/rfcs#2603 (comment) for details~~ (@michaelwoerister integrated my proposed changes into the RFC itself). On nightly, you can now control the mangling scheme with `-Z symbol-mangling-version`, which can be: * `legacy`: the older mangling version, still the default currently * `v0`: the new RFC mangling version, as implemented by this PR To test the new mangling, set `RUSTFLAGS=-Zsymbol-mangling-version=v0` (or change [`rustflags` in `.cargo/config.toml`](https://doc.rust-lang.org/cargo/reference/config.html#configuration-keys)). Please note that only symbols from crates built with that flag will use the new mangling, and that tool support (e.g. debuggers) will be limited initially, and it may take a while for everything to be upstreamed. However, `RUST_BACKTRACE` should work out of the box with either mangling version. <hr/> The demangling implementation PR is rust-lang/rustc-demangle#23 ~~(this PR already uses it via a git dependency, to allow testing)~~. Discussion of the *design* of the mangling scheme should still happen on the RFC, but this PR's specific implementation details can be reviewed in parallel. *Notes for reviewers*: * ~~only the last 6 commits are specific to this branch, if necessary I can open a separate PR for everything else (it was meant to be its own small refactoring, but it got a bit out of hand)~~ ~~based on #58140~~ * the "harness" commit is only there because it does some extra validation (comparing the demangling from `rustc-demangle` to the compiler's pretty-printing, adjusted slightly to produce the same output), that I would like to try on crater * ~~there is the question of whether we should turn on the new mangling now, wait for tools to support it (I'm working on that), and/or have it under a `-Z` flag for now~~ (we're gating this on `-Z symbol-mangling-version=v0`, see above) r? @nikomatsakis / @michaelwoerister cc @rust-lang/compiler
This is the libiberty (mainly for binutils/gdb) counterpart of rust-lang/rustc-demangle#23. Relevant links for the new Rust mangling scheme (aka "v0"): * Rust RFC: rust-lang/rfcs#2603 * tracking issue: rust-lang/rust#60705 * implementation: rust-lang/rust#57967 This implementation includes full support for UTF-8 identifiers via punycode, so I've included a testcase for that as well. libiberty/ChangeLog: * rust-demangle.c (struct rust_demangler): Add skipping_printing and bound_lifetime_depth fields. (eat): Add (v0-only). (parse_integer_62): Add (v0-only). (parse_opt_integer_62): Add (v0-only). (parse_disambiguator): Add (v0-only). (struct rust_mangled_ident): Add punycode{,_len} fields. (parse_ident): Support v0 identifiers. (print_str): Respect skipping_printing. (print_uint64): Add (v0-only). (print_uint64_hex): Add (v0-only). (print_ident): Respect skipping_printing, Support v0 identifiers. (print_lifetime_from_index): Add (v0-only). (demangle_binder): Add (v0-only). (demangle_path): Add (v0-only). (demangle_generic_arg): Add (v0-only). (demangle_type): Add (v0-only). (demangle_path_maybe_open_generics): Add (v0-only). (demangle_dyn_trait): Add (v0-only). (demangle_const): Add (v0-only). (demangle_const_uint): Add (v0-only). (basic_type): Add (v0-only). (rust_demangle_callback): Support v0 symbols. * testsuite/rust-demangle-expected: Add v0 testcases.
This is the libiberty (mainly for binutils/gdb) counterpart of rust-lang/rustc-demangle#23. Relevant links for the new Rust mangling scheme (aka "v0"): * Rust RFC: rust-lang/rfcs#2603 * tracking issue: rust-lang/rust#60705 * implementation: rust-lang/rust#57967 This implementation includes full support for UTF-8 identifiers via punycode, so I've included a testcase for that as well. libiberty/ChangeLog 2021-01-16 Eduard-Mihai Burtescu <eddyb@lyken.rs> * rust-demangle.c (struct rust_demangler): Add skipping_printing and bound_lifetime_depth fields. (eat): Add (v0-only). (parse_integer_62): Add (v0-only). (parse_opt_integer_62): Add (v0-only). (parse_disambiguator): Add (v0-only). (struct rust_mangled_ident): Add punycode{,_len} fields. (parse_ident): Support v0 identifiers. (print_str): Respect skipping_printing. (print_uint64): Add (v0-only). (print_uint64_hex): Add (v0-only). (print_ident): Respect skipping_printing, Support v0 identifiers. (print_lifetime_from_index): Add (v0-only). (demangle_binder): Add (v0-only). (demangle_path): Add (v0-only). (demangle_generic_arg): Add (v0-only). (demangle_type): Add (v0-only). (demangle_path_maybe_open_generics): Add (v0-only). (demangle_dyn_trait): Add (v0-only). (demangle_const): Add (v0-only). (demangle_const_uint): Add (v0-only). (basic_type): Add (v0-only). (rust_demangle_callback): Support v0 symbols. * testsuite/rust-demangle-expected: Add v0 testcases.
This PR complements rust-lang/rust#57967 (ideally, they should be reviewed/merged in tandem).
See rust-lang/rfcs#2603 (comment) for details on the implemented mangling scheme.
Aside from testing on every symbol from a Rust stage1 build (of std, rustc and cargo), I've also confirmed this builds with Rust 1.6 (AFAICT the earliest stable version supporting
#![no_std]
).There are some debatable decisions, such as the value of
SMALL_PUNYCODE_LEN
, or how to print certain details, I'm not sure whether this PR or the RFC is a good place to discuss those.cc @nikomatsakis @michaelwoerister