proc_macro/bridge: stop using a remote object handle for proc_macro Punct and Group #98188
(rust-highfive has picked a reviewer for you, use r? to override)
r? @eddyb
☔ The latest upstream changes (presumably #98186) made this pull request unmergeable. Please resolve the merge conflicts.
Force-pushed from 2dd6d7d to 51b2749.
proc_macro/bridge: cache static spans in proc_macro's client thread-local state

This is the second part of rust-lang#86822, split off as requested in rust-lang#86822 (review). This patch removes the RPC calls required for the very common operations of `Span::call_site()`, `Span::def_site()` and `Span::mixed_site()`.

Some notes:

This part is one of the ones I don't love as a final solution from a design standpoint, because I don't like how the spans are serialized immediately at macro invocation. I think a more elegant solution might've been to reserve special IDs for `call_site`, `def_site`, and `mixed_site` at compile time (either starting at 1 or from `u32::MAX`) and making reading a Span handle automatically map these IDs to the relevant values, rather than doing extra serialization.

This would also have an advantage for potential future work to allow `proc_macro` to operate more independently from the compiler (e.g. to reduce the necessity of `proc-macro2`), as methods like `Span::call_site()` could be made to function without access to the compiler backend.

That was unfortunately tricky to do at the time, as this was the first part I wrote of the patches. After the later part (rust-lang#98188, rust-lang#98189), the other uses of `InternedStore` are removed meaning that a custom serialization strategy for `Span` is easier to implement.

If we want to go that path, we'll still need the majority of the work to split the bridge object and introduce the `Context` trait for free methods, and it will be easier to do after `Span` is the only user of `InternedStore` (after rust-lang#98189).
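To illustrate the caching idea described in this commit message, here is a minimal sketch of keeping the three static spans in client-side thread-local state so that reading them needs no round-trip. Everything in it is hypothetical: `Span`, `StaticSpans`, `SPANS`, and `run_with_spans` are illustrative names, and the real bridge code is organized differently.

```rust
use std::cell::Cell;

/// Hypothetical opaque span handle; not the real `proc_macro::Span`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Span(u32);

/// The three spans that are assumed to be sent once, at macro invocation.
#[derive(Clone, Copy)]
struct StaticSpans {
    call_site: Span,
    def_site: Span,
    mixed_site: Span,
}

thread_local! {
    /// Client-side cache; `None` while no macro invocation is running.
    static SPANS: Cell<Option<StaticSpans>> = Cell::new(None);
}

/// Installs `spans` for the duration of `f`, then restores the previous value.
/// (A real implementation would use a drop guard so a panic in `f` also restores it.)
fn run_with_spans<R>(spans: StaticSpans, f: impl FnOnce() -> R) -> R {
    let prev = SPANS.with(|s| s.replace(Some(spans)));
    let result = f();
    SPANS.with(|s| s.set(prev));
    result
}

/// Reads the cached spans without contacting the server.
fn cached() -> StaticSpans {
    SPANS
        .with(|s| s.get())
        .expect("static spans accessed outside of a macro invocation")
}

fn call_site() -> Span {
    cached().call_site
}

fn def_site() -> Span {
    cached().def_site
}

fn mixed_site() -> Span {
    cached().mixed_site
}
```

Inside `run_with_spans`, each of `call_site()`, `def_site()` and `mixed_site()` is a thread-local read rather than a message to the server, which is the saving the commit message describes.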
This greatly reduces round-trips to fetch relevant extra information about the token in proc macro code, and avoids RPC messages to create Punct tokens.
This greatly reduces round-trips to fetch relevant extra information about the token in proc macro code, and avoids RPC messages to create Group tokens.
Before I forget (please don't force-push while this is ongoing): @bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit f28dfdf with merge 0b6ba9689c4b846a039c5ba15f8fcde0251749ed...
LGTM, mostly nits as usual (reminder not to force push until perf run is finished, I think bors gets confused easily).
const LEGAL_CHARS: &[char] = &[
    '=', '<', '>', '!', '~', '+', '-', '*', '/', '%', '^', '&', '|', '@', '.', ',', ';',
    ':', '#', '$', '?', '\'',
];
if !LEGAL_CHARS.contains(&ch) {
    panic!("unsupported character `{:?}`", ch);
}
Shouldn't something like this be a `match`? That way, LLVM can optimize it to some arithmetic and bit twiddling?
(Oops, just checked and this is what it takes: https://godbolt.org/z/vP13eKjsW - I guess that's pretty nasty)
If you are going to use an array search, can you do an ASCII check first and make it a bytestring literal? So that it at least can hit some kind of `memchr` specialization or w/e.
Sure, that's an easy change. The current code is literally verbatim cut-pasted from the code which used to be in `proc_macro_server.rs`:
rust/compiler/rustc_expand/src/proc_macro_server.rs
Lines 301 to 307 in bd2e51a
const LEGAL_CHARS: &[char] = &[
    '=', '<', '>', '!', '~', '+', '-', '*', '/', '%', '^', '&', '|', '@', '.', ',', ';',
    ':', '#', '$', '?', '\'',
];
if !LEGAL_CHARS.contains(&ch) {
    panic!("unsupported character `{:?}`", ch)
}
FWIW the current code appears to actually compile to a somewhat efficient jump table (https://godbolt.org/z/P9d5366YT), and sloppily switching it to use `&[u8]` instead does appear to make it use memchr (https://godbolt.org/z/EfvaaMbbh)
The existing codegen seems like the nicest of the options (I wouldn't be too surprised if it performs better than the memchr variant), so I'm inclined to stick with it.
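For reference, the three variants discussed in this thread can be written side by side as below. This is only a sketch: the function names are made up for illustration, and it shows that the variants agree on their result, not which one generates the best code (the thread's godbolt links are the evidence for that).

```rust
/// The punctuation characters accepted by the check quoted in this thread.
const LEGAL_CHARS: &[char] = &[
    '=', '<', '>', '!', '~', '+', '-', '*', '/', '%', '^', '&', '|', '@', '.', ',', ';',
    ':', '#', '$', '?', '\'',
];

/// Variant 1: linear search over a `&[char]`, as in the quoted code.
fn is_legal_slice(ch: char) -> bool {
    LEGAL_CHARS.contains(&ch)
}

/// Variant 2: the same set expressed as a `match` (through `matches!`).
fn is_legal_match(ch: char) -> bool {
    matches!(
        ch,
        '=' | '<' | '>' | '!' | '~' | '+' | '-' | '*' | '/' | '%' | '^' | '&' | '|' | '@'
            | '.' | ',' | ';' | ':' | '#' | '$' | '?' | '\''
    )
}

/// Variant 3: ASCII check first, then a search over a byte-string literal.
fn is_legal_bytes(ch: char) -> bool {
    const LEGAL_BYTES: &[u8] = b"=<>!~+-*/%^&|@.,;:#$?'";
    ch.is_ascii() && LEGAL_BYTES.contains(&(ch as u8))
}
```

All three accept exactly the same 22 characters, so the choice between them is purely about generated code.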
Interpolated(nt) => {
    let stream = TokenStream::from_nonterminal_ast(&nt);
    if crate::base::nt_pretty_printing_compatibility_hack(&nt, rustc.sess()) {
        trees.extend(Self::from_internal((stream, rustc)));
If this `Self::from_internal` didn't go through a weird method trait, I would definitely suggest passing `&mut Vec` here (would help with any future experiments like using `SmallVec<[_; 32]>` instead of `Vec`, as there would be no expensive moves).
FWIW I'm not sure this codepath is worth optimizing given it's only for that pretty-printing compatibility hack, meaning it only impacts enums with the name `ProceduralMasqueradeDummyType`.
Fair, though in the `SmallVec<[_; 32]>` case, I'm worried the rarely-used codepath would still cause worse cache behavior for the whole function.
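The `&mut Vec` suggestion above can be illustrated with a small, self-contained sketch. The `Tree` type and both function names are hypothetical stand-ins, not the rustc token types or the real `from_internal` signature; the point is only the difference between returning a collection from a recursive call and appending into a caller-provided buffer.

```rust
/// Hypothetical stand-in for a token tree; not the real rustc type.
#[derive(Debug)]
enum Tree {
    Leaf(u32),
    Nested(Vec<Tree>),
}

/// Returns a fresh `Vec`, so each recursive call builds a temporary
/// collection that the caller then moves out of with `extend`.
fn flatten_returning(tree: &Tree) -> Vec<u32> {
    let mut out = Vec::new();
    match tree {
        Tree::Leaf(n) => out.push(*n),
        Tree::Nested(children) => {
            for child in children {
                out.extend(flatten_returning(child));
            }
        }
    }
    out
}

/// Appends into a caller-provided buffer; recursion reuses the same
/// buffer, so no intermediate collection is created or moved.
fn flatten_into(tree: &Tree, out: &mut Vec<u32>) {
    match tree {
        Tree::Leaf(n) => out.push(*n),
        Tree::Nested(children) => {
            for child in children {
                flatten_into(child, out);
            }
        }
    }
}
```

With an inline-storage container in place of `Vec`, the returned temporary in the first form could be large to move, which is the cost the out-parameter form sidesteps.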
☀️ Try build successful - checks-actions
Queued 0b6ba9689c4b846a039c5ba15f8fcde0251749ed with parent bd2e51a, future comparison URL.
Finished benchmarking commit (0b6ba9689c4b846a039c5ba15f8fcde0251749ed): comparison url.
Results: instruction count, max RSS (memory usage), cycles.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
r=me with at least the comment fixed (and the PR description updated)
longer names for RPC generics and reduced dependency on macros in the server.
@bors r+
📌 Commit 64a7d57 has been approved by
☀️ Test successful - checks-actions
Finished benchmarking commit (94e9374): comparison url.
Results: instruction count, max RSS (memory usage), cycles.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
@rustbot label: -perf-regression
Unbreak stage1 tests via ignore-stage1 in `proc-macro/invalid-punct-ident-1.rs`.

rust-lang#98188 broke `./x.py test --stage 1` (which I thought we ran in PR CI, cc `@rust-lang/infra`), i.e. the default `./x.py test` in dev checkouts, as the panic in `src/test/ui/proc-macro/invalid-punct-ident-1.rs` moved from the server (`rustc`) to the client (proc macro), and that means it's now affected by rust-lang#59998. I made the test look like `src/test/ui-fulldeps/issue-76270-panic-in-libproc-macro.rs` tho I'm a bit confused why that one is in `src/test/ui-fulldeps`, it should still work in `src/test/ui`, no? (cc `@Aaron1011`)
proc_macro: use crossbeam channels for the proc_macro cross-thread bridge This is done by having the crossbeam dependency inserted into the `proc_macro` server code from the server side, to avoid adding a dependency to `proc_macro`. In addition, this introduces a -Z command-line option which will switch rustc to run proc-macros using this cross-thread executor. With the changes to the bridge in rust-lang#98186, rust-lang#98187, rust-lang#98188 and rust-lang#98189, the performance of the executor should be much closer to same-thread execution. In local testing, the crossbeam executor was substantially more performant than either of the two existing `CrossThread` strategies, so they have been removed to keep things simple. r? `@eddyb`
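As a rough picture of what a channel-based cross-thread executor looks like, the sketch below runs a client closure on its own thread and answers each of its requests on the calling thread. It is an assumption-laden illustration: it uses `std::sync::mpsc` in place of crossbeam channels, messages are plain `Vec<u8>` buffers, and `run_cross_thread` is a made-up name rather than anything in the bridge.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs `client` on a new thread. Every request the client sends through
/// its `dispatch` callback is answered by `handle` on the calling thread.
/// NOTE: std mpsc channels stand in for crossbeam channels in this sketch.
fn run_cross_thread(
    handle: impl Fn(Vec<u8>) -> Vec<u8>,
    client: impl FnOnce(&dyn Fn(Vec<u8>) -> Vec<u8>) -> Vec<u8> + Send + 'static,
) -> Vec<u8> {
    let (req_tx, req_rx) = mpsc::channel::<Vec<u8>>();
    let (res_tx, res_rx) = mpsc::channel::<Vec<u8>>();

    let join_handle = thread::spawn(move || {
        // Client side of one round-trip: send a request, block on the reply.
        let dispatch = |msg: Vec<u8>| -> Vec<u8> {
            req_tx.send(msg).expect("server hung up");
            res_rx.recv().expect("server hung up")
        };
        client(&dispatch)
        // `req_tx` is dropped when this closure returns, ending the loop below.
    });

    // Server loop: runs until the client thread drops its request sender.
    for msg in req_rx {
        res_tx.send(handle(msg)).expect("client hung up");
    }

    join_handle.join().expect("client thread panicked")
}
```

Each `dispatch` call costs a full send/receive pair across threads, which is why the earlier patches that cut the number of round-trips matter for this execution strategy.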
This is the third part of #86822, split off as requested in #86822 (review). This patch transforms the `Punct` and `Group` types into structs serialized over IPC rather than handles, making them more efficient to create and manipulate from within proc-macros.
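A minimal sketch of what "a struct serialized over IPC rather than a handle" can mean for a `Punct`-like type is shown below. The field names, field types, and the 6-byte wire layout are all assumptions made for illustration; the actual bridge encoding in `proc_macro` is different.

```rust
/// Hypothetical plain-data `Punct`; fields and layout are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Punct {
    ch: u8,      // ASCII punctuation character
    joint: bool, // whether another punctuation character follows immediately
    span: u32,   // the span is still an opaque handle in this sketch
}

impl Punct {
    /// Writes the whole value into the message buffer.
    fn encode(&self, buf: &mut Vec<u8>) {
        buf.push(self.ch);
        buf.push(self.joint as u8);
        buf.extend_from_slice(&self.span.to_le_bytes());
    }

    /// Reads one value back and advances the slice past it.
    /// Returns `None` if fewer than 6 bytes remain.
    fn decode(buf: &mut &[u8]) -> Option<Punct> {
        if buf.len() < 6 {
            return None;
        }
        let (head, rest) = buf.split_at(6);
        let span = u32::from_le_bytes([head[2], head[3], head[4], head[5]]);
        *buf = rest;
        Some(Punct { ch: head[0], joint: head[1] != 0, span })
    }
}
```

Because the client holds the data itself, creating a value or reading `ch` and `joint` is a local operation; with a handle, each of those would be a request to the server.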