proc_macro: Avoid cached TokenStream more often #49852
Conversation
This commit adds even more pessimization around using the cached `TokenStream` inside of an AST node. As a reminder, the `proc_macro` API requires taking an arbitrary AST node and transforming it back into a `TokenStream` to hand off to a procedural macro. Such functionality isn't actually implemented in rustc today, so the way `proc_macro` works today is that it stringifies an AST node and then reparses it for a list of tokens. This strategy unfortunately loses all span information, so we try to avoid it whenever possible.

As implemented in rust-lang#43230, some AST nodes have a `TokenStream` cache representing the tokens they were originally parsed from. This `TokenStream` cache, however, has turned out to not always reflect the current state of the item when it's being tokenized. For example, `#[cfg]` processing or macro expansion could modify the state of an item. Consequently we've seen a number of bugs (rust-lang#48644 and rust-lang#49846) related to using this stale cache.

This commit tweaks the usage of the cached `TokenStream` to compare it to our lossy stringification of the token stream. If the tokens that make up the cache and the stringified token stream are the same, then we return the cached version (which has correct span information). If they differ, however, then we return the stringified version, as the cache has been invalidated and we just haven't figured that out.

Closes rust-lang#48644
Closes rust-lang#49846
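For readers skimming the discussion, the check described above boils down to something like the sketch below. This is a minimal, self-contained illustration, not the actual rustc code: `Tok`, `tokens_for_macro`, and the whitespace-splitting "reparse" are invented stand-ins for rustc's internal `TokenStream` machinery.

```rust
/// Stand-in for a lexed token (the real thing also carries a `Span`).
#[derive(Clone, PartialEq, Debug)]
struct Tok(String);

/// Toy "stringify then reparse" path: always correct for the current item,
/// but in the real compiler this path loses all span information.
fn stringify_and_reparse(item_source: &str) -> Vec<Tok> {
    item_source
        .split_whitespace()
        .map(|t| Tok(t.to_string()))
        .collect()
}

/// Decide which token stream to hand to a procedural macro.
fn tokens_for_macro(cached: Option<Vec<Tok>>, item_source: &str) -> Vec<Tok> {
    let reparsed = stringify_and_reparse(item_source);
    match cached {
        // Trust the cache only when it agrees with the lossy path;
        // otherwise it is stale (e.g. after `#[cfg]` stripping or macro
        // expansion) and we fall back to the reparsed tokens.
        Some(cached) if cached == reparsed => cached, // keeps real spans
        _ => reparsed,                                // span-less but correct
    }
}

fn main() {
    // Cache still matches the item: the cached (span-carrying) tokens win.
    let fresh = Some(stringify_and_reparse("fn foo ( ) { }"));
    assert_eq!(tokens_for_macro(fresh, "fn foo ( ) { }").len(), 6);

    // Cache is stale (imagine a `#[cfg]`-stripped field): fall back.
    let stale = Some(stringify_and_reparse("struct S { a : u32 , b : u32 }"));
    assert_eq!(tokens_for_macro(stale, "struct S { a : u32 }").len(), 7);
}
```

The key point of the design, as described in the PR, is that staleness is detected purely by comparing the two token sequences: a stale cache silently degrades to the old span-less behaviour rather than handing the macro outdated tokens.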
r? @eddyb (rust_highfive has picked a reviewer for you, use r? to override)
Note that I intend to leave the real underlying issue here, #43081, open, as it'll only get more pressing over time.
I don't mind landing this, but it feels like a hack on top of a hack :-( We should definitely be thinking about a long-term solution. Have you confirmed that the simple equality check works in practice? I could imagine it failing if there has been macro expansion resulting in interpolated tokens, or with tokens which are otherwise ignored (comments, whitespace, etc.). Assuming that it works most of the time and is thus an improvement, even if imperfect, then r=me.
I did some testing locally and it looks like this at least papers over the issue for now. I definitely agree this is piling hacks on hacks :( The only real way forward that I know of, though, is to take a strategy like
@bors: r=nrc
📌 Commit 6d7cfd4 has been approved by nrc
…r=nrc proc_macro: Avoid cached TokenStream more often