Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce use of generators in finding files to document #3961

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

srawlins
Copy link
Member

@srawlins srawlins commented Jan 9, 2025

The short story of this change is that we're refactoring the "file-finding" code so that _findFilesToDocumentInPackage is no longer async*, and _listDir is no longer sync*. Lazy generators like these are nice when the full sequence of elements may not be generated. However, in this code, we definitely iterate over every element returned by these two functions, and sync* is known to be slow, and a package set may be thousands of files.

The longer story has more details:

  • findFilesToDocumentInPackage was async because of the code that finds a package config, and that code is only needed when the includeDependencies parameter is true. The function only has two callers, one of which always passed includeDependencies: false.
    • So we can extract out the code that looks for a package config, and iterates over the listed packages, into a new function, _findPackagesToDocument, only called by _getFilesToDocument. Now _findFilesToDocumentInPackage doesn't need to be async, or a generator.
  • Further efficiencies are found in getFilesToDocument: This function can actually be split into two cases: are we documenting the SDK, or not?
    • If documenting the SDK, we don't need to consider the auto-include-dependencies flag, as it is never used when documenting the SDK. We also don't need to consider the deprecated include-externals flag, as it is also never used when documenting the SDK. So we return a very simple calculation in this case. When documenting the SDK, the code is now simpler and does less work.
    • We can then isolate all of the complexity in the "not documenting the SDK" case, taking into account these two flags, and the embedder SDK files.
  • Now that _discoverLibraries is only called once, we don't need to pass in the addLibrary callback. Instead, _discoverLibraries can directly call uninitializedPackageGraph.addLibraryToGraph.

I benchmarked this both documenting the SDK and googleapis package. There was no visible difference in time-to-document or memory usage when documenting the SDK. For the googleapis package, there was a -1% time-to-document, and -0.5% memory usage, which might well be within the error margins. 🤷


  • I’ve reviewed the contributor guide and applied the relevant portions to this PR.
Contribution guidelines:

Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.

@srawlins srawlins force-pushed the simplify-find-files branch from 3e3086f to 70a811a Compare January 10, 2025 03:55
@srawlins
Copy link
Member Author

CC @dart-lang/analyzer-team

@bwilkerson I was looking at this code while reminding myself how dartdoc looks for files to document. I didn't get far before I decided to simplify this code. :/

@srawlins
Copy link
Member Author

This is still ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant