-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST: Give spans to all identifiers #49154
Conversation
r? @pnkfelix (rust_highfive has picked a reviewer for you, use r? to override) |
cc @jseyfried @nrc @rust-lang/compiler |
👍 |
This looks great! cc @michaelwoerister on potential interactions with incremental hashing. |
Epic PR |
Somewhat odd failure on Travis:
|
/subscribe: follow up after landing by removing as many |
Lifetimes are also represented with |
src/librustc/ich/impls_hir.rs
Outdated
@@ -653,7 +653,7 @@ impl<'a> HashStable<StableHashingContext<'a>> for ast::Ident { | |||
hasher: &mut StableHasher<W>) { | |||
let ast::Ident { | |||
ref name, | |||
ctxt: _ // Ignore this | |||
span: _ // Ignore this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaelwoerister
Now I'm pretty sure this is a mistake (and ignoring ctxt
was a mistake as well?).
Spans are hashed in other positions, including the ctxt
part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that should be updated! Good catch.
impl Hash for Ident { | ||
fn hash<H: Hasher>(&self, state: &mut H) { | ||
self.name.hash(state); | ||
self.span.ctxt().hash(state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these Hash
and PartialEq
I've kept the old behavior - compare and hash idents as (Name, SyntaxContext)
.
PartialEq
for Ident
should either
- not exist (I'd prefer this, but larger structures containing idents want to derive
PartialEq
andHash
for whatever reasons), or - behave like
PartialEq
for(Name, SyntaxContext)
.PartialEq
not behaving like this is a notable footgun and source of bugs, we had it couple years ago. Well, andHash
should has to be implemented consistently withPartialEq
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
want to derive
PartialEq
andHash
for whatever reasons
What happens if you just remove those? Is PartialEq
it used only for tests? Could those tests serialize to JSON, use pattern matching, or do something else instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try.
IIRC, at least token::Token
contains Ident
and uses ==
all the time for "simple" variants not actually containing Ident
s or other data (the case that would be perfectly supported by is
, btw).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's a bit of a bummer - and yeah, I agree about is
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, should Token
contain Ident
, or just Symbol
(and keep the Span
separate)? Or is that too much hassle / not readily supported by existing infrastructure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If lots of spans get moved from the regular HIR nodes to Ident
and then we don't hash them, that would not be good. But the HashStable
implementation for Ident
doesn't ignore the span, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, StableHash
doesn't ignore the span (well, after fixing #49154 (comment)).
Also,
regular HIR nodes
this PR doesn't touch HIR yet, only AST.
@petrochenkov very cool |
☔ The latest upstream changes (presumably #49308) made this pull request unmergeable. Please resolve the merge conflicts. |
Updated (there are couple of new commits). |
☔ The latest upstream changes (presumably #49101) made this pull request unmergeable. Please resolve the merge conflicts. |
Ping from triage, @eddyb ! Will you have time to review this soon? FYI, @petrochenkov — you've got some merge conflicts. |
|
} = *self; | ||
|
||
name.hash_stable(hcx, hasher); | ||
span.hash_stable(hcx, hasher); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @michaelwoerister (making sure you see this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mw previously confirmed the span should be hashed here - #49154 (comment)
src/libsyntax/feature_gate.rs
Outdated
@@ -1774,11 +1774,14 @@ impl<'a> Visitor<'a> for PostExpansionVisitor<'a> { | |||
|
|||
fn visit_path(&mut self, path: &'a ast::Path, _id: NodeId) { | |||
for segment in &path.segments { | |||
// Context of ident spans cannot be overriden to ignore unstable features, | |||
// so replace it with path span context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what's happening here. Do you have an example where the behavior changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a fix for the regression caught by Travis and mentioned in #49154 (comment).
The problem is that here we want two different SyntaxContext
s for identifiers generated by #[test]
- one context for name resolution (unhygienic) and another one for stability checking (hygienic, stability is not checked), but Ident
has only one SyntaxContext
.
So we can keep the stability checking context in something that is not used for name resolution, for example whole Path
(or anything else that is not an identifier/lifetime).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Can the source comment be expanded to say more of what is happening from its own perspective? e.g. starting with "Here we check the span from the whole Path instead of that of individual Idents specifically because only the former can be ...".
Updated. |
AST: Give spans to all identifiers Change representation of `ast::Ident` from `{ name: Symbol, ctxt: SyntaxContext }` to `{ name: Symbol, span: Span }`. Syntax contexts still can be extracted from spans (`span.ctxt()`). Why this should not require more memory: - `Span` is `u32` just like `SyntaxContext`. - Despite keeping more spans in AST we don't actually *create* more spans, so the number of "outlined" spans kept in span interner shouldn't become larger. Why this may be slightly slower: - When we need to extract ctxt from an identifier instead of just field read we need to do bit field extraction possibly followed by and access by index into span interner's vector. Both operations should be fast (unless the span interner is under some synchronization) and we already do ctxt extraction from spans all the time during macro expansion, so the difference should be lost in noise. cc #48842 (comment)
☀️ Test successful - status-appveyor, status-travis |
Is it possible that this PR regressed expansion info on derives? The code generated by #[derive(Debug)]
pub enum Error {
Type(
&'static str,
),
} contains a |
@oli-obk In this specific example though, it looks like const __self_0: u8 = 0;
#[derive(Debug)]
pub enum Error {
Type(
&'static str,
),
}
fn main() {}
---- On stable
error[E0530]: match bindings cannot shadow constants
--> src/main.rs:6:9
|
1 | const __self_0: u8 = 0;
| ----------------------- a constant `__self_0` is defined here
...
6 | &'static str,
| ^^^^^^^^^^^^^ cannot be named the same as a constant Or you are talking about EDIT: I see, it's about of whole |
@petrochenkov the current reproduction is a clippy lint reporting things that happened inside the expansion, but reporting it at the span of the type. The relevant lint is https://github.com/rust-lang-nursery/rust-clippy/blob/master/clippy_lints/src/needless_borrow.rs#L86 We do check for macro expansions here, which uses the |
We took the span from a |
Discovered in rust-lang#50061 we're falling off the "happy path" of using a stringified token stream more often than we should. This was due to the fact that a user-written token like `0xf` is equality-different from the stringified token of `15` (despite being semantically equivalent). This patch updates the call to `eq_unspanned` with an even more awful solution, `probably_equal_for_proc_macro`, which ignores the value of each token and basically only compares the structure of the token stream, assuming that the AST doesn't change just one token at a time. While this is a step towards fixing rust-lang#50061 there is still one regression from rust-lang#49154 which needs to be fixed.
proc_macro: Stay on the "use the cache" path more Discovered in #50061 we're falling off the "happy path" of using a stringified token stream more often than we should. This was due to the fact that a user-written token like `0xf` is equality-different from the stringified token of `15` (despite being semantically equivalent). This patch updates the call to `eq_unspanned` with an even more awful solution, `probably_equal_for_proc_macro`, which ignores the value of each token and basically only compares the structure of the token stream, assuming that the AST doesn't change just one token at a time. While this is a step towards fixing #50061 there is still one regression from #49154 which needs to be fixed.
Change representation of
ast::Ident
from{ name: Symbol, ctxt: SyntaxContext }
to{ name: Symbol, span: Span }
.Syntax contexts still can be extracted from spans (
span.ctxt()
).Why this should not require more memory:
Span
isu32
just likeSyntaxContext
.Why this may be slightly slower:
cc #48842 (comment)