Invalidate outputs based on better than timestamp #371

natebosch · 2017-08-17T21:55:51Z

Today a changed file can cascade to any output, which may end up rebuilding way more than is optimal. If we track some type of content hash, we can get away with pruning the build graph more eagerly.

jakemac53 · 2017-10-26T19:57:18Z

This will also help for correctness, so I am going to go ahead and tackle it.

See #536 for an example.

jakemac53 · 2017-10-26T19:58:08Z

Actually, I am only going to tackle invalidating inputs, not also invalidating outputs based on whether files actually changed (at least right now).

…stamps (#547)

All `AssetNode`s now have a `Digest` field (from package:crypto). These are computed eagerly, and must be available for any node which has a corresponding asset that exists. It is also serialized as a part of the asset graph (base64 encoded for json compatibility for now).

There is now an additional interface `DigestAssetReader` which adds the `Future<Digest> digest(id)` method to a normal asset reader, and the `RunnerAssetReader` interface now requires that interface to be implemented.

`AssetGraph.build` is now a static method which returns a `Future<AssetGraph>`, and requires a `DigestAssetReader`. It eagerly reads in digests for all source assets.

MD5 is the current hash function used, because collisions are not a concern in this case (we don't use the hashes for lookups, only to compare new/old values for a single file). Bazel also uses md5 today, as well as the analyzer for summaries, so this is also consistent with those (they may change, but it is still kind of nice). It is also significantly faster than sha1 in practice.

This does complicate some of the tests a bit unfortunately (search this pr for `computeHash` in tests), but it's a necessary evil imo. I don't see a way to simplify it significantly.

I didn't yet completely remove the `lastModified` method; it is still used when checking if a build script itself has been updated. I wanted to keep the scope of this as small as I could, so I will follow up with a different pull request to update that as well, and remove `lastModified` entirely.

Partially fixes #536
Unblocks #371 and #412
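The digest scheme described in the commit message above can be sketched roughly as follows. This is a Python illustration, not the Dart code from the pull request: `compute_digest`, `build_digests`, and `changed_assets` are hypothetical names, and the asset ids and contents are made up. It only shows the idea of eagerly hashing every existing source asset with MD5, storing the digest base64 encoded so it survives JSON serialization, and comparing old and new values per file.

```python
import base64
import hashlib
import json
from typing import Dict, Set


def compute_digest(content: bytes) -> str:
    """Return the MD5 digest of an asset's bytes, base64 encoded for JSON."""
    return base64.b64encode(hashlib.md5(content).digest()).decode("ascii")


def build_digests(sources: Dict[str, bytes]) -> Dict[str, str]:
    """Eagerly compute a digest for every source asset that exists."""
    return {asset_id: compute_digest(data) for asset_id, data in sources.items()}


def changed_assets(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Return ids whose digest differs, including added or removed assets."""
    return {aid for aid in old.keys() | new.keys() if old.get(aid) != new.get(aid)}


# Hypothetical asset ids and contents, for illustration only.
before = build_digests({"a|lib/a.dart": b"class A {}", "a|lib/b.dart": b"class B {}"})
serialized = json.dumps(before)  # stands in for the serialized asset graph

after = build_digests({"a|lib/a.dart": b"class A {}", "a|lib/b.dart": b"class B2 {}"})
print(changed_assets(json.loads(serialized), after))
```

Because the comparison is always between the old and new digest of a single file, a collision would only matter if one file's edit happened to hash to its own previous value, which is why a fast hash such as MD5 is acceptable here.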

Highlights are as follows:

- Fixed some behavior around replacing `SyntheticAssetNode`s with real ones. Previously we would remove all the outputs from the graph and clean up their inputs sets which wasn't correct (we want to retain all those edges and nodes, and just swap out the node in question).
- When invalidating nodes, we no longer pre-emptively delete them. In fact, I never explicitly delete them in this cl I just rely on the asset writer to overwrite them (unless their primary input is deleted).
- The inputs set for `GeneratedAssetNode`s is now ordered to ensure the combined md5 hash will always be the same for the same inputs.

Fixes #371
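The point about ordering the inputs can be illustrated with a small sketch. This is Python rather than the project's Dart, and `combined_inputs_digest` and `needs_rebuild` are hypothetical names, not functions from build_runner. It assumes the combined hash is formed by feeding each input's digest into one MD5 in sorted-id order, which is one simple way to make the result independent of the order in which inputs were recorded.

```python
import hashlib
from typing import Dict, Optional


def combined_inputs_digest(input_digests: Dict[str, bytes]) -> str:
    """Hash the digests of all inputs in a stable order (sorted by asset id)."""
    combined = hashlib.md5()
    for asset_id in sorted(input_digests):
        combined.update(input_digests[asset_id])
    return combined.hexdigest()


def needs_rebuild(previous: Optional[str], input_digests: Dict[str, bytes]) -> bool:
    """A build step reruns only when the combined inputs digest has changed."""
    return previous != combined_inputs_digest(input_digests)


# Hypothetical inputs, recorded in two different orders.
first = {
    "a|lib/a.dart": hashlib.md5(b"class A {}").digest(),
    "a|lib/b.dart": hashlib.md5(b"class B {}").digest(),
}
second = {
    "a|lib/b.dart": hashlib.md5(b"class B {}").digest(),
    "a|lib/a.dart": hashlib.md5(b"class A {}").digest(),
}

stored = combined_inputs_digest(first)
print(needs_rebuild(stored, second))  # same inputs, different order

second["a|lib/b.dart"] = hashlib.md5(b"class B2 {}").digest()
print(needs_rebuild(stored, second))  # one input's content changed
```

Without the sort, two runs that read the same inputs in a different order could produce different combined hashes and trigger a rebuild even though nothing changed.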

natebosch added this to the build_runner + DDC milestone Aug 17, 2017

natebosch added the package:build_runner label Aug 30, 2017

matanlurey mentioned this issue Sep 26, 2017

Figure out a way to test that builds don't happen #424

Open

jakemac53 mentioned this issue Oct 12, 2017

If a generated asset didn't change, don't invalidate things that depend on it #91

Closed

jakemac53 self-assigned this Oct 26, 2017

jakemac53 removed their assignment Oct 26, 2017

jakemac53 mentioned this issue Oct 27, 2017

Use content hashes to invalidate assets instead of last modified timestamps #547

Merged

jakemac53 mentioned this issue Nov 7, 2017

Track inputs for GeneratedAssetNodes #584

Merged

jakemac53 self-assigned this Nov 10, 2017

jakemac53 mentioned this issue Nov 14, 2017

Use a combined inputs hash to check if we need to run a build #606

Merged

jakemac53 closed this as completed in #606 Nov 14, 2017
