Optimize dep-graph serialization and improve meta-data hashing #35406

nikomatsakis · 2016-08-06T00:36:36Z

This makes a number of related changes:

- instead of serializing the whole graph, we only serialize HIR -> WorkProduct edges
- we gather predecessors more efficiently and cache results, cutting serialization time by more than 6x in my measurements
- compute the metadata hashes more deterministically
- improve `-Z incremental-info`, which tells you how much re-use you are getting and why you are not getting re-use

The end result is that, at least in my tests, we are now able to get 100% re-use when doing `cargo clean; cargo build` twice in a row (with an incremental directory, of course). Here are some rough performance numbers from the syntex-syntax crate:

| build | normal time (debug) | incremental time (debug) |
| --- | --- | --- |
| full build | 128s | 78s |
| just syntex_syntax | 96s | 51s |

Note that these numbers only reflect the first phase of incremental, where we are just skipping the trans/LLVM phases. They will get better as we incrementalize more of the compiler. Also, these numbers represent the simplest case, where no actual changes were made. There is still more work to do to correctly handle the case where a small number of changes were made (in particular, verifying that the dep-graph is complete, etc.).

r? @michaelwoerister

Fixes #35167.

The way we do HIR inlining introduces reads of the "Hir" into the graph, but this Hir in fact belongs to other crates, so when we try to load it later, we ICE because the Hir nodes in question don't belong to the crate (and we haven't done inlining yet). This pass rewrites those HIR nodes to the metadata from which the inlined HIR was loaded.

The biggest problem, actually, is krate numbers being removed entirely, which can lead to array-index-out-of-bounds errors. cc rust-lang#35123 -- not a complete fix, since really we ought to "map" the old crate numbers to the new ones, not just detect changes.


michaelwoerister · 2016-08-06T14:25:39Z

~~Are these numbers for a release or for a debug build?~~
Nevermind :)

michaelwoerister · 2016-08-08T13:08:29Z

src/librustc_incremental/persist/hash.rs

@@ -39,6 +39,14 @@ impl<'a, 'tcx> HashContext<'a, 'tcx> {
        }
    }

+    pub fn is_hashable(dep_node: &DepNode<DefId>) -> bool {


I'd prefer if this were called `is_input_node` or `is_graph_root` to make it clearer what we are after.

michaelwoerister · 2016-08-08T16:00:52Z

Ok, I've reviewed everything after the `rustfmt save.rs` commit. The question in `encode_dep_graph` seems important.

Per the discussion on rust-lang#34765, we make one `DepNode::Mir` variant and use it to represent both the MIR tracking map and the passes that operate on MIR. We also track loads of cached MIR (which naturally comes from metadata). Note that the "HAIR" pass adds a read of `TypeckItemBody` because it uses a myriad of tables that are not individually tracked.


nikomatsakis · 2016-08-09T12:29:07Z

@michaelwoerister OK, I addressed your nits. I will rebase atop #35166, I think, and maybe try to land them together once you've reviewed the additional commits in there.

michaelwoerister · 2016-08-09T14:12:13Z

src/librustc_incremental/persist/load.rs

+    // encoding, rather than having been retraced to a `DefId`. The
+    // reason for this is that this way we can include nodes that have
+    // been removed (which no longer have a `DefId` in the current
+    // compilation).


michaelwoerister · 2016-08-09T14:16:08Z

LGTM.

nikomatsakis added 11 commits August 1, 2016 19:57

hash foreign items too

9294f8e

remove register_reads

2797b2a

The reads will occur naturally as the HIR/MIR is fetched from the tracked tables, and this winds up adding reads to the hir of foreign def-ids somehow.

make metadata hashes deterministic

2e7df80

When we hash the inputs to a MetaData node, we have to hash them in a consistent order. We achieve this by sorting the stringified `DefPath` entries. Also, micro-optimize by caching more results across the saving process.
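To illustrate why sorting matters here, the following is a minimal sketch and not the code from this PR. The function name `hash_inputs_sorted` is hypothetical, plain `String`s stand in for the stringified `DefPath` entries, and the standard library's `DefaultHasher` stands in for whatever hasher the compiler actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash a set of input paths so that the result does not depend on the
/// iteration order of the set. `paths` is a stand-in (assumption) for the
/// stringified `DefPath` entries that feed a metadata node.
fn hash_inputs_sorted(paths: &HashSet<String>) -> u64 {
    // A HashSet iterates in an unspecified order, so impose one by sorting.
    let mut sorted: Vec<&String> = paths.iter().collect();
    sorted.sort();

    // Feed the entries to the hasher in that fixed order.
    let mut hasher = DefaultHasher::new();
    for path in sorted {
        path.hash(&mut hasher);
    }
    hasher.finish()
}
```

Two sets holding the same paths produce the same value regardless of how they were built, which is the property needed for metadata hashes to be comparable between compilations.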

dump statistics about re-use w/ -Z time-passes

903142a

It's nice to get a rough idea of how much work we're saving.

replace graph rewriting with detecting inlined ids

94acff1

We now detect inlined ids earlier (in the HIR map) and rewrite a read of them to be a read of the metadata for the associated item.
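The idea can be sketched roughly as follows. This is a simplified illustration with hypothetical types, not the compiler's actual definitions: it assumes an id can be recognized as inlined simply because its crate number is not the local crate, whereas the real HIR map tracks inlined items differently.

```rust
/// Simplified stand-in for the compiler's `DefId` (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DefId {
    krate: u32,
    index: u32,
}

/// Assumed crate number of the crate being compiled.
const LOCAL_CRATE: u32 = 0;

/// Simplified stand-in for the two dep-node kinds discussed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DepNode {
    Hir(DefId),
    MetaData(DefId),
}

/// Decide which node a read of `def_id` should be recorded against.
/// HIR that was inlined from another crate is attributed to that crate's
/// metadata, so the graph never contains a `Hir` node for a foreign item.
fn read_node_for(def_id: DefId) -> DepNode {
    if def_id.krate == LOCAL_CRATE {
        DepNode::Hir(def_id)
    } else {
        DepNode::MetaData(def_id)
    }
}
```

Making the decision at the point where the read is recorded means no later pass has to rewrite the graph.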

improve log when something no longer exists

b13d504

use memoized pattern for SizedConstraint

54595ec

I cannot figure out how to write a test for this, but I observed incorrect edges as a result of not using memoized pattern here (e.g., LateLintCheck -> SizedConstraint).

skip assert-dep-graph unless unit testing

bfbfe63

this can actually be expensive!

rustfmt save.rs

a6a97a9

rust-highfive assigned michaelwoerister Aug 6, 2016

michaelwoerister reviewed Aug 8, 2016

nikomatsakis added 11 commits August 8, 2016 18:41

rename KrateInfo to CrateInfo

88b2e9a

fixup tests for new def'n of InlinedItem

82b6dc2

it now carries a def-id; supply a dummy

isolate predecessor computation

0e97240

The new `Predecessors` type computes a set of interesting targets and their HIR predecessors, and discards everything in between.
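As a rough sketch of the idea, and not the actual `Predecessors` implementation, the code below walks backwards from each work product, memoizes the inputs reachable from every intermediate node, and returns only the target-to-HIR relation. The `Node` enum, the function names, and the edge representation are all hypothetical, and the graph is assumed to be acyclic.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical node kinds: inputs, everything in between, and targets.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum Node {
    Hir(u32),
    Intermediate(u32),
    WorkProduct(u32),
}

/// `edges` holds (source, target) pairs, meaning `target` reads `source`.
/// Returns, for each work product, the set of HIR nodes it depends on.
fn hir_predecessors(edges: &[(Node, Node)]) -> HashMap<Node, BTreeSet<Node>> {
    // Index the graph by target so we can walk it backwards.
    let mut preds: HashMap<Node, Vec<Node>> = HashMap::new();
    for &(src, dst) in edges {
        preds.entry(dst).or_default().push(src);
    }

    let mut cache: HashMap<Node, BTreeSet<Node>> = HashMap::new();
    let mut result = HashMap::new();
    for &(_, dst) in edges {
        if let Node::WorkProduct(_) = dst {
            let inputs = inputs_of(dst, &preds, &mut cache);
            result.insert(dst, inputs);
        }
    }
    result
}

/// Collect the HIR inputs reachable backwards from `node`, reusing cached
/// results for nodes that were already visited. Assumes an acyclic graph.
fn inputs_of(
    node: Node,
    preds: &HashMap<Node, Vec<Node>>,
    cache: &mut HashMap<Node, BTreeSet<Node>>,
) -> BTreeSet<Node> {
    if let Node::Hir(_) = node {
        return std::iter::once(node).collect();
    }
    if let Some(hit) = cache.get(&node) {
        return hit.clone();
    }
    let mut out = BTreeSet::new();
    if let Some(sources) = preds.get(&node) {
        for &source in sources {
            out.extend(inputs_of(source, preds, cache));
        }
    }
    cache.insert(node, out.clone());
    out
}
```

Only the returned map would need to be serialized; the intermediate nodes are used during the walk and then dropped.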

make DepNode PartialOrd

a92b1a7

replace Name with InternedString in DefPathData

571010b

Fixes rust-lang#35292.

add a -Z incremental-info flag

d4bd054

add a deterministic_hash method to DefPath

8150494

Produces a deterministic hash, at least for a single platform / compiler-version.

generalize BitMatrix to be NxM and not just NxN

9978cbc
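For readers unfamiliar with the data structure, here is a minimal sketch of a rectangular bit matrix. It is an illustration only and does not reproduce the compiler's `BitMatrix`; the field layout and method names are assumptions. The point is that rows and columns are sized independently, so a relation between two differently sized index spaces can be stored.

```rust
/// A rows x columns matrix of bits, packed into 64-bit words.
struct BitMatrix {
    columns: usize,
    words: Vec<u64>,
}

impl BitMatrix {
    /// Create an all-zero matrix with independent row and column counts.
    fn new(rows: usize, columns: usize) -> BitMatrix {
        let words_per_row = (columns + 63) / 64;
        BitMatrix {
            columns,
            words: vec![0; rows * words_per_row],
        }
    }

    fn words_per_row(&self) -> usize {
        (self.columns + 63) / 64
    }

    /// Set the bit at (row, col). Returns true if the bit was newly set.
    fn add(&mut self, row: usize, col: usize) -> bool {
        assert!(col < self.columns);
        let idx = row * self.words_per_row() + col / 64;
        let mask = 1u64 << (col % 64);
        let old = self.words[idx];
        self.words[idx] = old | mask;
        old & mask == 0
    }

    /// Test the bit at (row, col).
    fn contains(&self, row: usize, col: usize) -> bool {
        assert!(col < self.columns);
        let idx = row * self.words_per_row() + col / 64;
        let mask = 1u64 << (col % 64);
        self.words[idx] & mask != 0
    }
}
```

An out-of-range row is caught by the bounds check on `words`, and an out-of-range column by the explicit assertion.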

use preds to serialize just what we need

02a4703

This massively speeds up serialization. It also seems to produce deterministic metadata hashes (before I was seeing inconsistent results). Fixes rust-lang#35232.

address comments from mw

ecbcf1b

nikomatsakis force-pushed the incr-comp-35232 branch from 698619a to ecbcf1b Compare August 9, 2016 13:12

michaelwoerister reviewed Aug 9, 2016

nikomatsakis added 2 commits August 9, 2016 10:25

pacify the merciless tidy

76eecc7

fix license

e0b82d5

bors merged commit e0b82d5 into rust-lang:master Aug 9, 2016
