Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a combined inputs hash to check if we need to run a build #606

Merged
merged 15 commits into from
Nov 14, 2017

Conversation

jakemac53
Copy link
Contributor

This was more nuanced than I expected, and I ended up needing to clean up a few other things in the process. Highlights are as follows:

  • Changed summaries to semantic summaries (we were actually relying on the full ones before, whoops!)
  • Fixed some behavior around replacing SyntheticAssetNodes with real ones. Previously we would remove all the outputs from the graph and clean up their inputs sets which wasn't correct (we want to retain all those edges and nodes, and just swap out the node in question).
  • When invalidating nodes, we no longer pre-emptively delete them. In fact, I never explicitly delete them in this cl I just rely on the asset writer to overwrite them (unless their primary input is deleted).
  • The inputs set for GeneratedAssetNodes is now ordered to ensure the combined md5 hash will always be the same for the same inputs.

Fixes #371

@jakemac53 jakemac53 added the type-enhancement A request for a change that isn't a bug label Nov 14, 2017
@jakemac53 jakemac53 added this to the M0: Replace `pub serve` milestone Nov 14, 2017
@jakemac53 jakemac53 requested a review from natebosch November 14, 2017 17:22
@googlebot googlebot added the cla: yes Google is happy with the PR contributors label Nov 14, 2017
@@ -76,7 +76,7 @@ Future createUnlinkedSummary(Module module, BuildStep buildStep,
request.arguments.addAll([
'--build-summary-only',
'--build-summary-only-unlinked',
'--build-summary-output=${summaryOutputFile.path}',
'--build-summary-output-semantic=${summaryOutputFile.path}',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least a few of these changes feel like they could be justified in isolation and should be relatively easy to move to a new PR.

Can we submit this one separately?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we get a speedup w/ this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will break this out into its own cl and test in isolation, its mostly about invalidating later build steps but it should also improve the time to read and write summaries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#609, no noticeable build speedup from this alone unfortunately.

@@ -66,7 +66,9 @@ class AssetGraph {
var existing = get(node.id);
if (existing != null) {
if (existing is SyntheticAssetNode) {
_remove(existing.id);
// Don't call _remove, that transitively removes primary outputs. We
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename to _removeSubtree or something like that to make this clear?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to _removeRecursive.


/// A digest combining all digests of all previous inputs.
///
/// This is used to determine if a node really needs to be output or not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[optional] I might rephrase this, the Builder still decides whether to output an asset, we decide whether to run the build.

"Used to determine whether all the inputs to a build step are identical to the previous run indicating that the previous output is still valid."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (!await _buildShouldRun(builderOutputs, wrappedReader)) {
return <AssetId>[];
}
// We may have read some inputs in the call to `_buildShouldRun`, we want
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we not be going through the wrapped reader if we don't want to track this?

Or is that the only way for the lazy assets to work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, we do this because of the lazy assets and also the canRead implementation which takes into account phase numbers.

@@ -443,6 +515,10 @@ class BuildImpl {
assert(inputNode != null, 'Asset Graph is missing $input');
inputNode.outputs.add(output);
}

// And finally compute the combined digest for all inputs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of these comments are "what" with no "why" - should we be refactoring to make the code easier to understand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broke out some parts of this into appropriately named functions so its easier to follow

Copy link
Member

@natebosch natebosch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline - the fact that this change requires touching so many places, especially in subtle ways, points to it not having a cohesive implementation.

I think in the future there will be room to shift more of the logic explicitly into the AssetGraph or a parallel class rather that sit across both the AssetGraph and build_impl.

I think we can get it in as is and refactor later. Looks like test coverage is good enough to have confidence in it.

@jakemac53 jakemac53 merged commit 94d034b into master Nov 14, 2017
@jakemac53 jakemac53 deleted the inputs-hashes branch November 14, 2017 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Google is happy with the PR contributors type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants