
Commit e8c337b

committed Jan 6, 2016

Auto merge of #30532 - nikomatsakis:cross-item-dependencies, r=mw
This is roughly the same as my previous PR that created a dependency graph, except that:

1. The dependency graph is only optionally constructed, though this doesn't seem to make much of a difference in terms of overhead (see measurements below).
2. The dependency graph is simpler (I combined a lot of nodes).
3. The dependency graph debugging facilities are much better: you can now use `RUST_DEP_GRAPH_FILTER` to filter the dep graph to just the nodes you are interested in, which is super helpful.
4. The tests are somewhat more elaborate, including a few known bugs I need to fix in a second pass.

This is potentially a `[breaking-change]` for plugin authors. If you are poking about in tcx state or something like that, you probably want to add `let _ignore = tcx.dep_graph.in_ignore();`, which will cause your reads/writes to be ignored and not affect the dep-graph.

After this, or perhaps as an add-on to this PR in some cases, what I would like to do is the following:

- [x] Write up a little guide to how to use this system, the debugging options available, and what the possible failure modes are.
- [ ] Introduce read-only and perhaps the `Meta` node.
- [x] Replace "memoization tasks" with a node from the map itself.
- [ ] Fix the shortcomings, obviously! Notably, the HIR map needs to register reads, and there is some state that is not yet tracked. (Maybe as a separate PR.)
- [x] Refactor the dep-graph code so that the actual maintenance of the dep-graph occurs in a parallel thread, and the main thread simply throws things into a shared channel (probably a fixed-size channel). There is no reason for dep-graph construction to be on the main thread. (Maybe as a separate PR.)

Regarding performance: adding this tracking does add some overhead, approximately 2% in my measurements (I was comparing the build times for rustdoc). Interestingly, enabling or disabling tracking doesn't seem to do very much. I want to poke at this some more and gather a bit more data -- in some tests I've seen that 2% go away, but on others it comes back. It's not entirely clear to me if that 2% is truly due to constructing the dep-graph at all.

The next big step after this is to write some code to dump the dep-graph to disk and reload it.

r? @michaelwoerister
2 parents 21b025f + 93996b1 commit e8c337b


60 files changed: +2641 / -424 lines
 

src/librustc/dep_graph/README.md

New file: +390 lines
# Dependency graph for incremental compilation

This module contains the infrastructure for managing the incremental
compilation dependency graph. This README aims to explain how it ought
to be used. In this document, we'll first explain the overall
strategy, and then share some tips for handling specific scenarios.

The high-level idea is that we want to instrument the compiler to
track which parts of the AST and other IR are read/written by what.
This way, when we come back later, we can look at this graph and
determine what work needs to be redone.

### The dependency graph

The nodes of the graph are defined by the enum `DepNode`. They represent
one of three things:

1. HIR nodes (like `Hir(DefId)`) represent the HIR input itself.
2. Data nodes (like `ItemSignature(DefId)`) represent some computed
   information about a particular item.
3. Procedure nodes (like `CoherenceCheckImpl(DefId)`) represent some
   procedure that is executing. Usually this procedure is
   performing some kind of check for errors. You can think of them as
   computed values where the value being computed is `()` (and the
   value may fail to be computed, if an error results).

An edge `N1 -> N2` is added between two nodes if either:

- the value of `N1` is used to compute `N2`;
- `N1` is read by the procedure `N2`;
- the procedure `N1` writes the value `N2`.

The latter two conditions are equivalent to the first one if you think
of procedures as values.
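
To make the edge relation concrete, here is a minimal, self-contained sketch
of a node enum and an edge list. The types and variants are illustrative
stand-ins only (with `u32` in place of `DefId`), not the compiler's actual
`DepNode` or `DepGraph` definitions:

```rust
// Illustrative stand-in for `DepNode`, using `u32` in place of `DefId`.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum ToyDepNode {
    Hir(u32),                // a HIR node (the input itself)
    ItemSignature(u32),      // a data node (computed information)
    CoherenceCheckImpl(u32), // a procedure node (its "value" is `()`)
}

// An edge `N1 -> N2` records that the value of `N1` was used to compute `N2`.
struct ToyDepGraph {
    edges: Vec<(ToyDepNode, ToyDepNode)>,
}

impl ToyDepGraph {
    // The current task read `source`: add `source -> current_task`.
    fn read(&mut self, current_task: ToyDepNode, source: ToyDepNode) {
        self.edges.push((source, current_task));
    }

    // The current task wrote `target`: add `current_task -> target`.
    fn write(&mut self, current_task: ToyDepNode, target: ToyDepNode) {
        self.edges.push((current_task, target));
    }
}
```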

### Basic tracking

There is a very general strategy to ensure that you have a correct, if
sometimes overconservative, dependency graph. The two main things you have
to do are (a) identify shared state and (b) identify the current task.

### Identifying shared state

Identify "shared state" that will be written by one pass and read by
another. In particular, we need to identify shared state that will be
read "across items" -- that is, anything where changes in one item
could invalidate work done for other items. So, for example:

1. The signature for a function is "shared state".
2. The computed type of some expression in the body of a function is
   not shared state, because if it changes it does not itself
   invalidate other functions (though it may be that it causes new
   monomorphizations to occur, but that's handled independently).

Put another way: if the HIR for an item changes, we are going to
recompile that item for sure. But we need the dep tracking map to tell
us what *else* we have to recompile. Shared state is anything that is
used to communicate results from one item to another.
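
As a hypothetical illustration of the distinction (this is not code from the
compiler, just ordinary Rust showing which pieces would count as shared state):

```rust
// If the *signature* of `area` changes (say, `f32` becomes `f64`), a
// caller such as `total` may need to be re-checked; the signature is
// therefore shared state.
fn area(width: f32, height: f32) -> f32 {
    // The inferred type of this local expression is *not* shared
    // state: changing the body alone does not invalidate other items.
    let product = width * height;
    product
}

fn total(shapes: &[(f32, f32)]) -> f32 {
    shapes.iter().map(|&(w, h)| area(w, h)).sum()
}
```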

### Identifying the current task

The dep graph always tracks a current task: this is basically the
`DepNode` that the compiler is computing right now. Typically it would
be a procedure node, but it can also be a data node (as noted above,
the two are kind of equivalent).

You set the current task by calling `dep_graph.in_task(node)`. For example:

```rust
let _task = tcx.dep_graph.in_task(DepNode::Privacy);
```

Now all the code until `_task` goes out of scope will be considered
part of the "privacy task".

The tasks are maintained in a stack, so it is perfectly fine to nest
one task within another. Because pushing a task is considered to be
computing a value, when you nest a task `N2` inside of a task `N1`, we
automatically add an edge `N2 -> N1` (since `N1` presumably needed the
result of `N2` to complete):

```rust
let _n1 = tcx.dep_graph.in_task(DepNode::N1);
let _n2 = tcx.dep_graph.in_task(DepNode::N2);
// this will result in an edge N2 -> N1
```

### Ignore tasks

Although it is rarely needed, you can also push a special "ignore"
task:

```rust
let _ignore = tcx.dep_graph.in_ignore();
```

This will cause all read/write edges to be ignored until it goes out
of scope or until something else is pushed. For example, we could
suppress the edge between nested tasks like so:

```rust
let _n1 = tcx.dep_graph.in_task(DepNode::N1);
let _ignore = tcx.dep_graph.in_ignore();
let _n2 = tcx.dep_graph.in_task(DepNode::N2);
// now no edge is added
```

### Tracking reads and writes

We need to identify what shared state is read/written by the current
task as it executes. The most fundamental way of doing that is to invoke
the `read` and `write` methods on `DepGraph`:

```rust
// Adds an edge from DepNode::Hir(some_def_id) to the current task
tcx.dep_graph.read(DepNode::Hir(some_def_id))

// Adds an edge from the current task to DepNode::ItemSignature(some_def_id)
tcx.dep_graph.write(DepNode::ItemSignature(some_def_id))
```

However, you should rarely need to invoke those methods directly.
Instead, the idea is to *encapsulate* shared state into some API that
will invoke `read` and `write` automatically. The most common way to
do this is to use a `DepTrackingMap`, described in the next section,
but any sort of abstraction barrier will do. In general, the strategy
is that getting access to information implicitly adds an appropriate
`read`. So, for example, when you use the
`dep_graph::visit_all_items_in_krate` helper method, it will visit
each item `X`, start a task `Foo(X)` for that item, and automatically
add an edge `Hir(X) -> Foo(X)`. This edge is added because the code is
being given access to the HIR node for `X`, and hence it is expected
to read from it. Similarly, reading from the `tcache` map for item `X`
(which is a `DepTrackingMap`, described below) automatically invokes
`dep_graph.read(ItemSignature(X))`.

To make this strategy work, a certain amount of indirection is
required. For example, modules in the HIR do not have direct pointers
to the items that they contain. Rather, they contain node-ids -- one
can then ask the HIR map for the item with a given node-id. This gives
us an opportunity to add an appropriate read edge.
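
As a sketch of what such an abstraction barrier might look like, here is a
simplified, hypothetical lookup wrapper. The types and method names are
invented for illustration and are not the real HIR map or `DepGraph` API:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration only.
struct Item;
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct NodeId(u32);

struct DepGraphSketch;
impl DepGraphSketch {
    fn read_hir(&self, _id: NodeId) {
        // In the real compiler this would be something like
        // `dep_graph.read(DepNode::Hir(def_id))`.
    }
}

struct HirMapSketch {
    dep_graph: DepGraphSketch,
    items: HashMap<NodeId, Item>,
}

impl HirMapSketch {
    // The lookup is the abstraction barrier: handing out the item means
    // the caller is going to read it, so the read edge is registered
    // here rather than at every call site.
    fn expect_item(&self, id: NodeId) -> &Item {
        self.dep_graph.read_hir(id);
        &self.items[&id]
    }
}
```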

#### Explicit calls to read and write when starting a new subtask

One time when you *may* need to call `read` and `write` directly is
when you push a new task onto the stack, either by calling `in_task`
as shown above or indirectly, such as with the `memoize` pattern
described below. In that case, any data that the task has access to
from the surrounding environment must be explicitly "read". For
example, in `librustc_typeck`, the collection code visits all items
and, among other things, starts a subtask producing its signature
(what follows is simplified pseudocode, of course):

```rust
fn visit_item(item: &hir::Item) {
    // Here, current subtask is "Collect(X)", and an edge Hir(X) -> Collect(X)
    // has automatically been added by `visit_all_items_in_krate`.
    let sig = signature_of_item(item);
}

fn signature_of_item(item: &hir::Item) {
    let def_id = tcx.map.local_def_id(item.id);
    let task = tcx.dep_graph.in_task(DepNode::ItemSignature(def_id));
    tcx.dep_graph.read(DepNode::Hir(def_id)); // <-- the interesting line
    ...
}
```

Here you can see that, in `signature_of_item`, we started a subtask
corresponding to producing the `ItemSignature`. This subtask will read from
`item` -- but it gained access to `item` implicitly. This means that if it just
reads from `item`, there would be missing edges in the graph:

    Hir(X) --+ // added by the explicit call to `read`
      |      |
      |      +---> ItemSignature(X) -> Collect(X)
      |                                 ^
      |                                 |
      +---------------------------------+ // added by `visit_all_items_in_krate`

In particular, the edge from `Hir(X)` to `ItemSignature(X)` is only
present because we called `read` ourselves when entering the `ItemSignature(X)`
task.

So, the rule of thumb: when entering a new task yourself, register
reads on any shared state that you inherit. (This actually comes up
fairly infrequently though: the main place you need caution is around
memoization.)

#### Dependency tracking map

`DepTrackingMap` is a particularly convenient way to correctly store
shared state. A `DepTrackingMap` is a special hashmap that will add
edges automatically when `get` and `insert` are called. The idea is
that, when you get/insert a value for the key `K`, we will add an edge
from/to the node `DepNode::Variant(K)` (for some variant specific to
the map).

Each `DepTrackingMap` is parameterized by a special type `M` that
implements `DepTrackingMapConfig`; this trait defines the key and value
types of the map, and also defines a fn for converting from the key to
a `DepNode` label. You don't usually have to muck about with this by
hand; there is a macro for creating it. You can see the complete set
of `DepTrackingMap` definitions in `librustc/middle/ty/maps.rs`.

As an example, let's look at the `adt_defs` map. The `adt_defs` map
maps from the def-id of a struct/enum to its `AdtDef`. It is defined
using this macro:

```rust
dep_map_ty! { AdtDefs: ItemSignature(DefId) -> ty::AdtDefMaster<'tcx> }
//            ~~~~~~~  ~~~~~~~~~~~~~ ~~~~~     ~~~~~~~~~~~~~~~~~~~~~~
//            |        |             Key type  Value type
//            |        DepNode variant
//            Name of map id type
```

This indicates that a map id type `AdtDefs` will be created. The key
of the map will be a `DefId` and the value will be
`ty::AdtDefMaster<'tcx>`. The `DepNode` will be created by
`DepNode::ItemSignature(K)` for a given key.
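
Conceptually, the macro defines a marker type and a `DepTrackingMapConfig`
impl along the following lines. This is a sketch with toy stand-in types
(and lifetimes omitted), not the literal macro expansion from `librustc`:

```rust
// Toy stand-ins; the real `DefId`, `AdtDefMaster`, and `DepNode` differ.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct DefId(u32);
struct AdtDefMaster;

enum DepNode {
    ItemSignature(DefId),
    // ... other variants ...
}

trait DepTrackingMapConfig {
    type Key;
    type Value;
    fn to_dep_node(key: &Self::Key) -> DepNode;
}

// The "map id type" named by `dep_map_ty! { AdtDefs: ... }`:
struct AdtDefs;

impl DepTrackingMapConfig for AdtDefs {
    type Key = DefId;          // key type from the macro
    type Value = AdtDefMaster; // value type from the macro
    fn to_dep_node(key: &DefId) -> DepNode {
        DepNode::ItemSignature(*key) // DepNode variant from the macro
    }
}
```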

Once that is done, you can just use the `DepTrackingMap` like any
other map:

```rust
let mut map: DepTrackingMap<M> = DepTrackingMap::new(dep_graph);
map.insert(key, value); // registers dep_graph.write
map.get(key); // registers dep_graph.read
```

#### Memoization

One particularly interesting case is memoization. If you have some
shared state that you compute in a memoized fashion, the correct thing
to do is to define a `RefCell<DepTrackingMap>` for it and use the
`memoize` helper:

```rust
map.memoize(key, || /* compute value */)
```

This will create a graph that looks like

    ... -> MapVariant(key) -> CurrentTask

where `MapVariant` is the `DepNode` variant that the map is associated with,
and `...` are whatever edges the `/* compute value */` closure creates.

In particular, using the memoize helper is much better than writing
the obvious code yourself:

```
if let Some(result) = map.get(key) {
    return result;
}
let value = /* compute value */;
map.insert(key, value);
```

If you write that code manually, the dependency graph you get will
include artificial edges that are not necessary. For example, imagine that
two tasks, A and B, both invoke the manual memoization code, but A happens
to go first. The resulting graph will be:

    ... -> A -> MapVariant(key) -> B
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~          // caused by A writing to MapVariant(key)
                ~~~~~~~~~~~~~~~~~~~~     // caused by B reading from MapVariant(key)

This graph is not *wrong*, but it encodes a path from A to B that
should not exist. In contrast, using the memoized helper, you get:

    ... -> MapVariant(key) -> A
                  |
                  +----------> B

which is much cleaner.

**Be aware though that the closure is executed with `MapVariant(key)`
pushed onto the stack as the current task!** That means that you must
add explicit `read` calls for any shared state that it accesses
implicitly from its environment. See the section on "explicit calls to
read and write when starting a new subtask" above for more details.
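
For instance, continuing the simplified pseudocode style of the collection
example above, a memoized computation that consults the HIR would need an
explicit read inside the closure (`compute_signature_from_hir` is a made-up
helper, not a real compiler function):

```rust
map.memoize(def_id, || {
    // `ItemSignature(def_id)` (the map's DepNode variant) is now the
    // current task, so reads of inherited state must be explicit:
    tcx.dep_graph.read(DepNode::Hir(def_id));
    compute_signature_from_hir(tcx, def_id)
})
```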

### How to decide where to introduce a new task

Certainly, you need at least one task on the stack: any attempt to
`read` or `write` shared state will panic if there is no current
task. But where does it make sense to introduce subtasks? The basic
rule is that a subtask makes sense for any discrete unit of work you
may want to skip in the future. Adding a subtask separates out the
reads/writes from *that particular subtask* versus the larger
context. An example: you might have a 'meta' task for all of borrow
checking, and then subtasks for borrow checking individual fns. (Seen
in this light, memoized computations are just a special case where we
may want to avoid redoing the work even within the context of one
compilation.)
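
In pseudocode, that might look like the following. The `DepNode` variant
names and helper fns here are purely illustrative, not the compiler's
actual borrow-check nodes:

```rust
// A 'meta' task covering all of borrow checking ...
let _task = tcx.dep_graph.in_task(DepNode::BorrowCheck);
for def_id in fns_to_check {
    // ... with one subtask per fn, so that each fn's reads/writes are
    // separated from the enclosing context and its work can be skipped
    // later if nothing it depends on has changed.
    let _subtask = tcx.dep_graph.in_task(DepNode::BorrowCheckBody(def_id));
    borrowck_fn(tcx, def_id);
}
```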

The other case where you might want a subtask is to help with refining
the reads/writes for some later bit of work that needs to be memoized.
For example, we create a subtask for type-checking the body of each
fn. However, in the initial version of incr. comp. at least, we do
not expect to actually *SKIP* type-checking -- we only expect to skip
trans. Even so, it's still useful to create subtasks for type-checking
individual items, because, otherwise, if a fn sig changes, we won't
know which callers are affected -- in fact, because the graph would be
so coarse, we'd just have to retrans everything, since we can't
distinguish which fns used which fn sigs.

### Testing the dependency graph

There are various ways to write tests against the dependency graph.
The simplest mechanisms are the
`#[rustc_if_this_changed]` and `#[rustc_then_this_would_need]`
annotations. These are used in compile-fail tests to test whether the
expected set of paths exists in the dependency graph. As an example,
see `src/test/compile-fail/dep-graph-caller-callee.rs`.

The idea is that you can annotate a test like:

```rust
#[rustc_if_this_changed]
fn foo() { }

#[rustc_then_this_would_need(TypeckItemBody)] //~ ERROR OK
fn bar() { foo(); }

#[rustc_then_this_would_need(TypeckItemBody)] //~ ERROR no path
fn baz() { }
```

This will check whether there is a path in the dependency graph from
`Hir(foo)` to `TypeckItemBody(bar)`. An error is reported for each
`#[rustc_then_this_would_need]` annotation that indicates whether a
path exists. `//~ ERROR` annotations can then be used to test if a
path is found (as demonstrated above).

### Debugging the dependency graph

The compiler is also capable of dumping the dependency graph for your
debugging pleasure. To do so, pass the `-Z dump-dep-graph` flag. The
graph will be dumped to `dep_graph.{txt,dot}` in the current
directory. You can override the filename with the `RUST_DEP_GRAPH`
environment variable.
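
For example, an invocation along these lines (the file names are hypothetical,
and depending on your build you may need an additional flag to enable
dep-graph construction at all) should presumably write `my-graph.{txt,dot}`
instead of the default name:

```
RUST_DEP_GRAPH=my-graph rustc -Z dump-dep-graph foo.rs
```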

Frequently, though, the full dep graph is quite overwhelming and not
particularly helpful. Therefore, the compiler also allows you to filter
the graph. You can filter in three ways:

1. All edges originating in a particular set of nodes (usually a single node).
2. All edges reaching a particular set of nodes.
3. All edges that lie between given start and end nodes.

To filter, use the `RUST_DEP_GRAPH_FILTER` environment variable, which should
look like one of the following:

```
source_filter     // nodes originating from source_filter
-> target_filter  // nodes that can reach target_filter
source_filter -> target_filter // nodes in between source_filter and target_filter
```

`source_filter` and `target_filter` are `&`-separated lists of strings.
A node is considered to match a filter if all of those strings appear in its
label. So, for example:

```
RUST_DEP_GRAPH_FILTER='-> TypeckItemBody'
```

would select the predecessors of all `TypeckItemBody` nodes. Usually, though,
you want the `TypeckItemBody` node for some particular fn, so you might write:

```
RUST_DEP_GRAPH_FILTER='-> TypeckItemBody & bar'
```

This will select only the `TypeckItemBody` nodes for fns with `bar` in their name.

Perhaps you are finding that when you change `foo` you need to re-type-check `bar`,
but you don't think you should have to. In that case, you might do:

```
RUST_DEP_GRAPH_FILTER='Hir&foo -> TypeckItemBody & bar'
```

This will dump out all the nodes that lead from `Hir(foo)` to
`TypeckItemBody(bar)`, from which you can (hopefully) see the source
of the erroneous edge.
