Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPLAT-5152: Optimize builder's use of asset reads to avoid unnecessary rebuilds #280

Merged
merged 7 commits into from
Apr 4, 2019

Conversation

evanweible-wf
Copy link
Contributor

@evanweible-wf evanweible-wf commented Apr 1, 2019

CPLAT-5152

Issue Status

Motivation

We've seen unexpectedly large rebuild times in codebases that consume this package and its builder. After some digging, it has been determined that the main reason for this is due to how the over_react builder is currently implemented:

The PartBuilder util from the source_gen package always retrieves the LibraryElement for each file that is a library (i.e. not a part file). This requires that the analysis session resolves the entire element tree for that file, which is expensive but more importantly results in reads on all transitively imported files. These reads are then stored in the asset graph as inputs for that build step, which consequently means that the over_react builder will be re-run on many files unnecessarily because the inputs have changed.

Changes

The over_react builder only needs the AST for each file, so the solution here is to replace our usage of PartBuilder with a custom Builder that manually reads the library source for each BuildStep and calls the analyzer API parseCompilationUnit() to get the info needed to generate the .over_react.g.dart part files.

Perf Improvement

Some numbers to demonstrate the performance improvement delivered by this change:

Average clean build time:

Averaged over 5 runs.
pub run build_runner clean && pub run build_runner build

Project Before After
web_skin_dart 122s 107s
wdesk_sdk 112s 45s

Average re-build time:

Averaged over 5 runs. Rebuilt after changing one entry point file in lib/
that has a large number of imports/exports.

Change file -> pub run build_runner build

Project File Before After
web_skin_dart lib/ui_core.dart 42.9s 21.22s
wdesk_sdk lib/app_infrastructure.dart 63s 2.2s

Release Notes

Optimize the over_react builder to avoid unnecessary asset reads. Informal testing in a large codebase that consumes over_react has shown in the worst case a speed up on hot-rebuilds from ~1 minute to ~2 seconds.

Review

See CONTRIBUTING.md for more details on review types (+1 / QA +1 / +10) and code review process.

Please review:

QA Checklist

  • Tests were updated and provide good coverage of the changeset and other affected code
  • Manual testing was performed if needed

Merge Checklist

While we perform many automated checks before auto-merging, some manual checks are needed:

  • A Client Platform member has reviewed these changes
  • There are no unaddressed comments - this check can be automated if reviewers use the "Request Changes" feature
  • For release PRs - Version metadata in Rosie comment is correct

The PartBuilder util from the source_gen package always retrieves the
LibraryElement for each file that is a library (i.e. not a part file).
This requires that the analysis session resolves the entire element tree
for that file, which is expensive but more importantly results in reads
on all transitively imported files. These reads are then stored in the
asset graph as inputs for that build step, which consequently means that
the over_react builder will be re-run on many files unnecessarily because
the inputs have changed.

The over_react builder only needs the AST for each file, so the solution
here is to replace our usage of PartBuilder with a custom Builder that
manually reads the library source for each BuildStep and calls the
analyzer API `parseCompilationUnit()` to get the info needed to generate
the `.over_react.g.dart` part files.

In preliminary testing, we've seen this approach reduce the rebuild time
from ~1 minute to ~7 seconds when changing a single file in a large
library that has many transitive imports.
The build_web_compilers|entrypoint builder does not need to
look at any files in lib/. For the test/ directory it
only needs to consider the .browser_test.dart extension as
entry points - doing so prevents DDC/dart2js calls on the
source _test.dart files as well as the non-browser test
bootstrap files (.node_test.dart and .vm_test.dart).
@aviary3-wk
Copy link

Security Insights

No security relevant content was detected by automated scans.

Action Items

  • Review PR for security impact; comment "security review required" if needed or unsure
  • Verify aviary.yaml coverage of security relevant code

Questions or Comments? Reach out on Slack: #support-infosec.

@rmconsole5-wk rmconsole5-wk changed the title Optimize builder's use of asset reads to avoid unnecessary rebuilds CPLAT-5152 Optimize builder's use of asset reads to avoid unnecessary rebuilds Apr 1, 2019
@charliekump-wf charliekump-wf changed the title CPLAT-5152 Optimize builder's use of asset reads to avoid unnecessary rebuilds CPLAT-5152: Optimize builder's use of asset reads to avoid unnecessary rebuilds Apr 1, 2019
Copy link
Contributor

@corwinsheahan-wf corwinsheahan-wf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very cool. I had a couple of comments/questions.

final parts = libraryUnit.directives
.whereType<PartDirective>()
// Ignore `.over_react.g.dart` parts - that's what we're generating.
.where((part) => !part.uri.stringValue.endsWith(outputExtension));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may also need to ignore parts from other generated files. That was an issue with the old approach, did you verify that it's not an issue here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see any issues, but this reminds me that I meant to add an another check for each part directive so that we only parse the compilation units if the part file contains over react declarations. Currently this is only doing that check on the parent library file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be safer to ignore anything ending in .g.dart and explicitly not support over_react annotations in generated files. We could run into a situation where we're parsing a file that doesn't exist yet.

partSource,
url: idToPackageUri(partId));
CompilationUnit partUnit;
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is very similar to the parsing and generating for the main library code above. What's the reason not to DRY this up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the main difference is that I have to parse the compilation unit for the library earlier so that I can check the directives for a PartOfDirective (i.e. to see if the file is actually a part file) so that we can bail early. I can probably dry it up though

..writeln('part of $partOf;')
..writeln()
..writeln(_headerLine)
..writeln('// OverReactGenerator')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#nit Personally, I'd advocate for not using the same header as the generator. It could lead to confusion if a consumer is examining our outputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I was going for minimal changes to generated code but I guess that doesn't really matter since we're building to cache.

Copy link
Contributor

@greglittlefield-wf greglittlefield-wf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Have we let the Dart team know about this performance issue?

@evanweible-wf
Copy link
Contributor Author

evanweible-wf commented Apr 2, 2019

@greglittlefield-wf @corwinsheahan-wf Just pushed a commit to tidy up the builder implementation a bit.

Have we let the Dart team know about this performance issue?

Not yet, I want to do a few more tests with a simpler builder and simpler codebase that still uses buildStep.inputLibrary and then I plan on asking in their gitter chat about that cost in general. Will probably end up opening an issue in the build project around documenting those potential pitfalls as well as an issue in source_gen since there is currently no way to opt out of that behavior with their utils.

- DRY up logic around parsing the compilation unit, the over_react declarations,
  and generating the over_react implementation code.
- Ignore all .g.dart part files.
- Only parse/generate for files that have over_react annotations.
@evanweible-wf
Copy link
Contributor Author

evanweible-wf commented Apr 4, 2019

Some numbers to demonstrate the performance improvement delivered by this change:

Average clean build time:

Averaged over 5 runs.
pub run build_runner clean && pub run build_runner build

Project Before After
web_skin_dart 122s 107s
wdesk_sdk 112s 45s

Average re-build time:

Averaged over 5 runs. Rebuilt after changing one entry point file in lib/
that has a large number of imports/exports.

Change file -> pub run build_runner build

Project File Before After
web_skin_dart lib/ui_core.dart 42.9s 21.22s
wdesk_sdk lib/app_infrastructure.dart 63s 2.2s

@@ -21,10 +21,6 @@ class OverReactBuilder extends Builder {
@override
FutureOr<void> build(BuildStep buildStep) async {
final source = await buildStep.readAsString(buildStep.inputId);
if (!_mightContainDeclarations(source)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed? I would imagine performing this regex match is faster than parsing the compilation unit. Or, is that not the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is outdated, but still applies to the new version of the file

Copy link
Contributor Author

@evanweible-wf evanweible-wf Apr 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, we are no longer providing a warning in the scenario where a consumer has over_react-annotated code but did not include the .over_react.g.dart part file (which is something that the PartBuilder from source_gen did for us).

I didn't post these numbers but I did run a comparison between this commit and the previous commit and there was no noticeable difference in clean build or rebuild times on WSD or wdesk_sdk. I was a bit surprised by that, but my best guess is that the file reads and the AST parsing are cached.. or maybe they're just both fast? I'm not totally sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit surprising to me, too. I know file reads are pretty fast, but I expected AST parsing to not be super cheap. 🤷‍♂️

I'm fine with it as-is, then!

@evanweible-wf
Copy link
Contributor Author

QA +1

  • CI passes
  • Consumer tested with WSD and wdesk_sdk

@evanweible-wf
Copy link
Contributor Author

@Workiva/release-management-p

@rmconsole6-wk rmconsole6-wk merged commit 02ea76b into master Apr 4, 2019
@rmconsole6-wk rmconsole6-wk deleted the optimize_builder branch April 4, 2019 20:45
aaronlademann-wf added a commit that referenced this pull request May 1, 2019
+ Without the use of fully-resolved AST since we no longer have access to it for build perf reasons (see: #280)
aaronlademann-wf added a commit that referenced this pull request May 2, 2019
+ Without the use of fully-resolved AST since we no longer have access to it for build perf reasons (see: #280)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants