Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a chapter on all the identifiers used through rustc #872

Merged
merged 4 commits into from
Sep 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@
- [THIR and MIR construction](./mir/construction.md)
- [MIR visitor and traversal](./mir/visitor.md)
- [MIR passes: getting the MIR for a function](./mir/passes.md)
- [Identifiers in the Compiler](./identifiers.md)
- [Closure expansion](./closure.md)

# Analysis
Expand Down
46 changes: 10 additions & 36 deletions src/hir.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,45 +68,18 @@ the compiler a chance to observe that you accessed the data for

### Identifiers in the HIR

Most of the code that has to deal with things in HIR tends not to
carry around references into the HIR, but rather to carry around
*identifier numbers* (or just "ids"). Right now, you will find four
sorts of identifiers in active use:

- [`DefId`], which primarily names "definitions" or top-level items.
- You can think of a [`DefId`] as being shorthand for a very explicit
and complete path, like `std::collections::HashMap`. However,
these paths are able to name things that are not nameable in
normal Rust (e.g. impls), and they also include extra information
about the crate (such as its version number, as two versions of
the same crate can co-exist).
- A [`DefId`] really consists of two parts, a `CrateNum` (which
identifies the crate) and a `DefIndex` (which indexes into a list
of items that is maintained per crate).
- [`HirId`], which combines the index of a particular item with an
offset within that item.
- the key point of a [`HirId`] is that it is *relative* to some item
(which is named via a [`DefId`]).
- [`BodyId`], this is an identifier that refers to a specific
body (definition of a function or constant) in the crate. It is currently
effectively a "newtype'd" [`HirId`].
- [`NodeId`], which is an absolute id that identifies a single node in the HIR
tree.
- While these are still in common use, **they are being slowly phased out**.
- Since they are absolute within the crate, adding a new node anywhere in the
tree causes the [`NodeId`]s of all subsequent code in the crate to change.
This is terrible for incremental compilation, as you can perhaps imagine.
There are a bunch of different identifiers to refer to other nodes or definitions
in the HIR. In short:
- A [`DefId`] refers to a *definition* in any crate.
- A [`LocalDefId`] refers to a *definition* in the currently compiled crate.
- A [`HirId`] refers to *any node* in the HIR.

For more detailed information, check out the [chapter on identifiers][ids].

[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
[`LocalDefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.LocalDefId.html
[`HirId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir_id/struct.HirId.html
[`BodyId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.BodyId.html
[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html

We also have an internal map to go from `DefId` to what’s called "Def path". "Def path" is like a
module path but a bit more rich. For example, it may be `crate::foo::MyStruct` that identifies
this definition uniquely. It’s a bit different than a module path because it might include a type
parameter `T`, which you can't write in normal rust, like `crate::foo::MyStruct::T`. These are used
in incremental compilation.
[ids]: ./identifiers.md#in-the-hir

### The HIR Map

Expand All @@ -129,6 +102,7 @@ something outside of the current crate (since then it has no HIR
node), but otherwise returns `Some(n)` where `n` is the node-id of the
definition.

[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
[as_local_node_id]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.as_local_node_id

Similarly, you can use [`tcx.hir.find(n)`][find] to lookup the node for a
Expand Down
63 changes: 63 additions & 0 deletions src/identifiers.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# Identifiers in the Compiler

If you have read the few previous chapters, you now know that `rustc` uses
many different intermediate representations to perform different kinds of analyses.
However, like in every data structure, you need a way to traverse the structure
and refer to other elements. In this chapter, you will find information on the
different identifiers `rustc` uses for each intermediate representation.

## In the AST

A [`NodeId`] is an identifier number that uniquely identifies an AST node within
a crate. Every node in the AST has its own [`NodeId`], including top-level items
such as structs, but also individual statements and expressions.

However, because they are absolute within in a crate, adding or removing a single
node in the AST causes all the subsequent [`NodeId`]s to change. This renders
[`NodeId`]s pretty much useless for incremental compilation, where you want as
few things as possible to change.

[`NodeId`]s are used in all the `rustc` bits that operate directly on the AST,
like macro expansion and name resolution.

[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html

## In the HIR

The HIR uses a bunch of different identifiers that coexist and serve different purposes.

- A [`DefId`], as the name suggests, identifies a particular definition, or top-level
item, in a given crate. It is composed of two parts: a [`CrateNum`] which identifies
the crate the definition comes from, and a [`DefIndex`] which identifies the definition
within the crate. Unlike [`NodeId`]s, there isn't a [`DefId`] for every expression, which
makes them more stable across compilations.
- A [`LocalDefId`] is basically a [`DefId`] that is known to come from the current crate.
This allows us to drop the [`CrateNum`] part, and use the type system to ensure that
only local definitions are passed to functions that expect a local definition.
- A [`HirId`] uniquely identifies a node in the HIR of the current crate. It is composed
of two parts: an `owner` and a `local_id` that is unique within the `owner`. This
combination makes for more stable values which are helpful for incremental compilation.
Unlike [`DefId`]s, a [`HirId`] can refer to [fine-grained entities][Node] like expressions,
but stays local to the current crate.
- A [`BodyId`] identifies a HIR [`Body`] in the current crate. It is currenty only
a wrapper around a [`HirId`]. For more info about HIR bodies, please refer to the
[HIR chapter][hir-bodies].

These identifiers can be converted into one another through the [HIR map][map].
See the [HIR chapter][hir-map] for more detailed information.

[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
[`LocalDefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.LocalDefId.html
[`HirId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir_id/struct.HirId.html
[`BodyId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.BodyId.html
[`CrateNum`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/enum.CrateNum.html
[`DefIndex`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefIndex.html
[`Body`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.Body.html
[Node]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/enum.Node.html
[hir-map]: ./hir.md#the-hir-map
[hir-bodies]: ./hir.md#hir-bodies
[map]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html

## In the MIR

**TODO**