fix(gatsby): granular redux cache heuristics for finding large nodes #23643
When persisting the redux store, a heuristic tries to roughly determine how big nodes are, because there is an intrinsic limit (~2GB) to the amount of data we can store in one pass. This heuristic currently takes 11 random nodes from the entire pool and uses the biggest one to determine the chunk size.
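To make that concrete, here is a minimal sketch of that kind of heuristic. This is not Gatsby's actual implementation; the function name, the 1.5GB budget, and the use of `JSON.stringify` to approximate node size are all assumptions for illustration.

```js
const MAX_CHUNK_BYTES = 1.5 * 1024 * 1024 * 1024 // stay safely under the ~2GB limit
const SAMPLE_SIZE = 11

// Sample a few random nodes from the whole pool and size chunks after the
// biggest one found. `nodes` is assumed to be an array of plain node objects.
function guessSafeChunkSize(nodes) {
  let worstBytes = 1
  for (let i = 0; i < SAMPLE_SIZE && nodes.length > 0; i++) {
    const node = nodes[Math.floor(Math.random() * nodes.length)]
    worstBytes = Math.max(worstBytes, Buffer.byteLength(JSON.stringify(node), `utf8`))
  }
  // How many worst-case nodes fit in a single chunk.
  return Math.max(1, Math.floor(MAX_CHUNK_BYTES / worstBytes))
}
```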
This PR improves the granularity of this check by doing it per node type, because certain node types are intrinsically smaller than others, so sampling per type should yield more relevant information. I suspect that, ultimately, we could skip all of them and just check the Page type, but we can determine that later.
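A rough sketch of the per-type variant, again purely illustrative rather than the PR's literal code: `nodesByType` stands in for Gatsby's by-type node map, everything else is hypothetical, and the constants come from the sketch above.

```js
// `nodesByType` is assumed to be a Map<typeName, Map<nodeId, node>>.
function guessSafeChunkSizePerType(nodesByType) {
  const chunkSizeByType = new Map()
  for (const [type, nodesOfType] of nodesByType) {
    const nodes = [...nodesOfType.values()]
    let worstBytes = 1 // avoids division by zero for empty types
    // Probe up to 11 random nodes of this specific type only.
    for (let i = 0; i < SAMPLE_SIZE && nodes.length > 0; i++) {
      const node = nodes[Math.floor(Math.random() * nodes.length)]
      worstBytes = Math.max(
        worstBytes,
        Buffer.byteLength(JSON.stringify(node), `utf8`)
      )
    }
    // Each type gets its own chunk size, so small types are no longer
    // sized after one unrelated huge node.
    chunkSizeByType.set(type, Math.max(1, Math.floor(MAX_CHUNK_BYTES / worstBytes)))
  }
  return chunkSizeByType
}
```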
One downside is that sites with many node types, like Contentful sites, will now probe 11 × (number of types) nodes. I don't expect this to be a huge problem, but it is worth noting regardless.
Fixes #23627
This fix will not work with Loki, since that doesn't create the by-type map. Since I'm planning to remove Loki in the next two weeks, I'll probably hold off on merging this PR until that happens.