-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(gatsby): limit node manifest creation limit #35359
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough context but I fear that this might break in subtle ways or I might not understand the feature well enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! 🙌 Just a couple nits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! This is out of the scope of this PR but maybe we need a way to clear out old manifests as well? As it is now we only clear them on cache clears. But maybe there's more we could be doing to clear them so they don't keep accumulating?
@DanielSLew it's a good idea but I think out of scope for this PR. We have the manifest ID as part of each page so perhaps when a page is deleted we could delete the manifest 🤔 |
Just updated the tests! I realized that they would fail if running all of the integration tests for this (I was only running the one new test while adding new tests and didn't check to make sure that everything would still pass if the whole test file got ran) |
6aa1358
to
2550939
Compare
I published a canary with these changes:
|
Description
Context
Node Manifest files are used to tie a unique revision state of a piece of content in a data source to a gatsby page generated from that exact data. In some cases, this is causing upwards of 150K plus node manifest files to get written to disk which can cause performance degradation in certain cases.
How is it done?
This helps PR introduces a limit to the number of node manifest files written to disk during a gatsby build which can be controlled with the env var
NODE_MANIFEST_FILE_LIMIT
.If the number of node manifests that are queued to get written to disk is greater than
NODE_MANIFEST_FILE_LIMIT
, then the node manifest "candidates" are sorted by theirupdatedAtUTC
field timestamp. That sorted list of node manifests is used to queue up file writes file writes are then stopped as soon as theNODE_MANIFEST_FILE_LIMIT
has been reachedNode Manifest Sorting
I spent some time testing some different sorting approaches and it turns out that the native array sort is almost certainly adequate.
Sorting 1,000,000 node manifests took ~11156 ms while sorting 100,000 took ~914 ms. These numbers seem well within reason in context of a Gatsby build.