Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move def id collection and extern crate handling to before AST->HIR lowering #33089

Merged
merged 9 commits into from
Apr 22, 2016

Conversation

nrc
Copy link
Member

@nrc nrc commented Apr 19, 2016

@eddyb
Copy link
Member

eddyb commented Apr 19, 2016

So, uhm, what about the elephant in the room? The AST changes during expansion, do you still plan on using IDs even with resolve-while-expanding?
If you have the plan, I'm quite curious, but atm I'm not seeing how this isn't a step backwards.

@nrc
Copy link
Member Author

nrc commented Apr 19, 2016

@eddyb - this stuff happens after expansion, but before lowering, not before expansion. This would allow us to use def ids in the HIR and name resolution as part of lowering. It isn't intended to directly allow def ids for use in name resolution in expansion. If we do want to use def ids for that, then I imagine we continue to collect def ids during expansion, in the same way that we do for lowering here. I think that works because def ids are monotonic - once we are able to assign a def id to a node, it will never change due to further expansion.

@eddyb
Copy link
Member

eddyb commented Apr 19, 2016

@nrc I think this is relying way too much on there being IDs embedded in nodes to begin with.

However, using name resolution during lowering (or rather, turning resolve into the lowering pass) is great, so I'm happy in that regard, just worried about tight coupling between DefId and NodeId.

One step at a time, I guess.

@jseyfried
Copy link
Contributor

@eddyb

relying way too much on there being IDs embedded in nodes

What is the motivation for removing the IDs in AST nodes?

We will need to compute scopes and resolve import and macro paths on the AST during expansion. Without embedded IDs, I believe we would have to come up with a node identification system that would not be invalidated by expansion. This is certainly doable, but it would add considerable complexity.

@eddyb
Copy link
Member

eddyb commented Apr 19, 2016

@jseyfried The only reason the whole "assign IDs and build a map" thing works is because the AST (now HIR) is frozen at that point.
How do you plan to access nodes based on IDs in a constantly folding AST without fundamentally changing the representation?

@bors
Copy link
Collaborator

bors commented Apr 19, 2016

☔ The latest upstream changes (presumably #33002) made this pull request unmergeable. Please resolve the merge conflicts.

@jseyfried
Copy link
Contributor

jseyfried commented Apr 19, 2016

@eddyb

How do you plan to access nodes based on IDs in a constantly folding AST

By relying on @nrc's monotonicity property. Since macro definitions and invocations are not monotonic, I plan on using a separate "id namespace" for them or assigning them normally and then reassigning ids after expansion to avoid unused ids (this would still be simpler than a custom identification system, imo).

More specifically, I plan on keeping track of the next normal id and the next macro invocation id during expansion. After each expansion, we would assign all unassigned nodes appropriately and collect new def ids.

@eddyb
Copy link
Member

eddyb commented Apr 19, 2016

@jseyfried So the "catch" is that you can waste IDs. What do you think about storing the entire AST in monotonic vectors instead of using pointers?
As long as the AST isn't inspected and modified routinely after parsing/manual creation (with the exception of expansion), the AST could be created with an identity, since we seem to want that.
Having smaller IDs instead of pointers should help with the memory usage (which is a tad bit ridiculous, compared to the actual information it's expressing).

An alternative which still allows random access would be a flattened stream of "commands" which form the AST, with positions in that stream as the pre-HIR/Ty identification.
Think ADT+sexprs with no pointers, just a byte buffer. It's a long shot, don't get me wrong, just want to know if that kind of experimentation will even be possible.

@jseyfried
Copy link
Contributor

jseyfried commented Apr 19, 2016

So the "catch" is that you can waste IDs.

I don't think we would waste any ids if we use a separate "id namespace" for macro invocations and if we always remove unconfigured items before assigning new ids.

@jseyfried
Copy link
Contributor

jseyfried commented Apr 19, 2016

What do you think about storing the entire AST in monotonic vectors instead of using pointers?

I definitely like this idea, although it would be a massive plugin-[breaking-change], of course.

a flattened stream of "commands" which form the AST, with positions in that stream as the pre-HIR/Ty identification.

I think this would be possible, but I'm not sure it would be worth the complexity. Specifically, I think it would be a challenge to remain backwards compatible with the current macro system, in which the expansion order is significant.

@eddyb
Copy link
Member

eddyb commented Apr 19, 2016

@jseyfried I thought the plan was to ditch expansion order? But I agree, none of what I said is trivial.
Thanks for taking the time to explain it, I see how this can unfold now.
Definitely not as bad as some of the non-O(1) identity solutions.

The changes in this PR LGTM, although I can't wait to get rid of the AST/HIR embedded in crate metadata.

@jseyfried
Copy link
Contributor

jseyfried commented Apr 19, 2016

@eddyb

I thought the plan was to ditch expansion order?

It is, but we still need to support, for example:

macro_rules! foo { () => {} }
foo! {} // resolves to ::foo
#[macro_use] mod bar {
    macro_rules! foo { () => {} }
}
foo! {} // resolves to ::bar::foo

Thinking about this some more, I don't think it would be too hard to support this with a "stream of commands" based AST since we should be able to perform all "old-style" resolutions during the first round of expansion if we expand "depth first" as we currently do (i.e. immediately re-expand macro-generated AST instead of waiting for the next expansion round).

@nikomatsakis
Copy link
Contributor

@jseyfried you kind of lost me a bit -- why do we need to have two namespaces to cope with macro expansion?

@jseyfried
Copy link
Contributor

jseyfried commented Apr 20, 2016

@nikomatsakis
We need two namespaces not for correctness but to avoid wasting node ids on macro invocations, which never end up in the HIR and so only need ids during expansion.

Immediately after expanding a macro, I'm planning to remove unconfigured items and assign node ids. With the exception of macro invocations, all the assigned nodes will eventually become part of the HIR, so their ids won't be wasted. If we assigned ids for macro invocations independently of other nodes (i.e. we use a separate NodeIdAssigner for macro invocations), then we could identify an AST node with a type like:

enum AstNodeId {
    MacroInvocation(NodeId),
    OtherNode(NodeId),
}

We could also start the macro invocation id assigner at 0x80000000 and then use raw node ids to identify AST nodes (provided that there will not be more than 2^31 nodes, of course).

false,
result_ident,
match_expr,
None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this doesn't need to be changed (unless you prefer the new formatting -- I usually prefer to save vertical space)

@jseyfried
Copy link
Contributor

jseyfried commented Apr 21, 2016

Reviewed, LGTM modulo optional comments.

I wasn't very familiar with of def id collection before reviewing this, so I might not be the best qualified to approve -- r? @eddyb

@eddyb
Copy link
Member

eddyb commented Apr 21, 2016

@bors r+

@bors
Copy link
Collaborator

bors commented Apr 21, 2016

📌 Commit 0be3c8c has been approved by eddyb

@bors
Copy link
Collaborator

bors commented Apr 21, 2016

⌛ Testing commit 0be3c8c with merge 34886d6...

@bors
Copy link
Collaborator

bors commented Apr 21, 2016

💔 Test failed - auto-win-msvc-64-opt-mir

@alexcrichton
Copy link
Member

@bors: retry

On Thu, Apr 21, 2016 at 10:52 AM, bors notifications@github.com wrote:

[image: 💔] Test failed - auto-win-msvc-64-opt-mir
http://buildbot.rust-lang.org/builders/auto-win-msvc-64-opt-mir/builds/386


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#33089 (comment)

@bors
Copy link
Collaborator

bors commented Apr 22, 2016

⌛ Testing commit 0be3c8c with merge a264f5b...

bors added a commit that referenced this pull request Apr 22, 2016
Move def id collection and extern crate handling to before AST->HIR lowering

r? @jseyfried, @eddyb, or @nikomatsakis
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants