Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gatsby): don't block event loop during inference #37780

Merged
merged 1 commit into from
Mar 27, 2023

Conversation

TylerBarnes
Copy link
Contributor

@TylerBarnes TylerBarnes commented Mar 24, 2023

These changes drop gatsbyjs.com schema building time from 16s to 7s.
For a Contentful site with 4.9M nodes it drops schema building time from 520s to 380s. It also allows the 4.9M node site to build with significantly less memory. Previously it needed 64Gi of memory, now it can build with 24Gi with these changes (+ some other changes I'll be PRing soon).

Vlad previously added code in https://github.com/gatsbyjs/gatsby/pull/19781/files#diff-d380fd3fbf5adf3933e07c737228eb75e520cdc7a5050d4d6b710acd5256d40cR48 that lets the event loop breathe between inferring each node type which was good.
The problem with that though is it means for each node type, every node of that type will be loaded into memory before node can have an opportunity to garbage collect if it needs to. Enough memory is needed to load all of the type of nodes which has the greatest count into memory.
For large sites this means huge amounts of memory is required. For small/medium sites it still means more memory is needed than should be needed but also inferring is slower than it could be because the event loop is blocked for larger chunks of time.
With this code node can throw away the last inferred nodes if it needs to before moving on to the next chunk of 1000.

Apparently v8 can gc whenever it wants to, but from what I saw here that was not the case. Memory linearly grew until each type was finished inferring. With these changes memory usage stays relatively flat during inference. Possibly it's because not blocking the event loop allows other parallel code to complete, allowing node to GC elsewhere.

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Mar 24, 2023
@TylerBarnes TylerBarnes removed the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Mar 24, 2023
@TylerBarnes TylerBarnes marked this pull request as ready for review March 25, 2023 00:38
@TylerBarnes TylerBarnes added topic: GraphQL Related to Gatsby's GraphQL layer topic: data Relates to source-nodes, internal-data-bridge, and node creation labels Mar 25, 2023
@TylerBarnes TylerBarnes merged commit c08048d into master Mar 27, 2023
@TylerBarnes TylerBarnes deleted the fix/inference-event-loop branch March 27, 2023 16:55
pieh pushed a commit that referenced this pull request Mar 29, 2023
Don't block the event loop during inference

(cherry picked from commit c08048d)
pieh pushed a commit that referenced this pull request Mar 29, 2023
Don't block the event loop during inference

(cherry picked from commit c08048d)
pieh pushed a commit that referenced this pull request Mar 29, 2023
Don't block the event loop during inference

(cherry picked from commit c08048d)

Co-authored-by: Tyler Barnes <tylerdbarnes@gmail.com>
pieh added a commit that referenced this pull request Mar 29, 2023
Don't block the event loop during inference

(cherry picked from commit c08048d)

Co-authored-by: Tyler Barnes <tylerdbarnes@gmail.com>
Co-authored-by: Michal Piechowiak <misiek.piechowiak@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: data Relates to source-nodes, internal-data-bridge, and node creation topic: GraphQL Related to Gatsby's GraphQL layer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants