This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
App bundle size inflated by matchPath data on 25k page site #21701
Comments
Hiya! This issue has gone quiet. Spooky quiet. 👻 We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here. Thanks for being a part of the Gatsby community! 💪💜 |
Hey again! It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it. Thanks again for being part of the Gatsby community! 💪💜 |
Hey sorry this never got responded to. Could you create a small reproduction of the problem? That'd be the best next step towards investigating the bug. |
Sorry for the late reply on this, @KyleAMathews. I have created a codesandbox with my approach here. The app.js file is here. As you can see there are unnecessary |
Hi @wardpeet, sorry for mentioning you directly but I think the bit of code which is causing me an issue here is from one of your PRs - #17412. If I comment out the following code, the
I understand that code is there for a reason as it was part of the fix for #16097, but I just want to get some understanding of whether the solution should be on my end or yours. Our app has a requirement to have static pages which can have client-only sub-routes, which is why we have pages created through Correct me if I'm wrong - I think that code checks all pages to see if their URL matches any of the Any help would be greatly appreciated. |
@KyleAMathews @wardpeet - as far as I can tell, we should be able to trust that Gatsby has generated all the required assets for statically generated pages (their index.html, page-data.json, etc) so the fix (#17412) for #16097 shouldn't need to include static pages in |
Let me sketch the problem here a bit. When Gatsby loads a page, the next navigation will be a client-side navigation, so there is no page refresh. We try to load a page-data.json file for each URL. If it doesn't exist, the request returns a 404. When people use matchPaths, this fails and we need to know what the rootPath is. That's why matchPaths exist. If we kept doing a page-data fetch, that could result in many 404s, which wouldn't be ideal. We had this logic for a while, but a lot of complaints came in, so that's why we reverted to this pattern. We're probably able to improve the algorithm a bit. |
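The lookup described above can be sketched roughly as follows. This is a simplified illustration, not Gatsby's actual runtime code: `matchesPattern` and `findPagePath` are hypothetical helpers, the sample paths are made up, and the matcher only understands exact paths and a trailing `/*` wildcard.

```javascript
// Hypothetical, simplified matcher: supports exact paths and a trailing "/*".
function matchesPattern(pattern, pathname) {
  if (pattern.endsWith("/*")) {
    const prefix = pattern.slice(0, -2); // "/app/*" -> "/app"
    return pathname === prefix || pathname.startsWith(prefix + "/");
  }
  return pattern === pathname;
}

// Resolve a browser URL to the page whose page-data.json should be fetched.
// Entries are checked in order, so specific entries must come before globs.
function findPagePath(matchPaths, pathname) {
  const entry = matchPaths.find((m) => matchesPattern(m.matchPath, pathname));
  return entry ? entry.path : pathname;
}

const matchPaths = [{ path: "/app/", matchPath: "/app/*" }];
console.log(findPagePath(matchPaths, "/app/settings/profile")); // "/app/"
console.log(findPagePath(matchPaths, "/about/")); // "/about/"
```

With a list like this available on the client, a URL under `/app/` resolves to the `/app/` page up front instead of triggering a page-data.json request that would return a 404.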
Hi @wardpeet, also suffering from this issue. I would like to put a few select client only routes on the index (e.g. /pay), but also have plenty of static pages (e.g. /shiny-new-toy, /buy-buy-buy). Gatsby understands those static pages exist and can be routed to, but the number of the static pages has brought back the "page manifest problem":
[
  {
    "path": "/buy-buy-buy",
    "matchPath": "/buy-buy-buy"
  }, //x 10's of thousands of these static page mappings
  {
    "path": "/",
    "matchPath": "/*"
  }
]
With a finite amount of client only pages, could the problem not be avoided by permitting a simple mapping alongside the globs that are currently supported, i.e.
//gatsby-node.js
exports.onCreatePage = async ({ page, actions }) => {
  const { createPage } = actions
  if (page.path === "/") {
    page.matchPaths = ["/pay", "myotherclientsideroute"]
    createPage(page)
  }
}
Which then produces
[
  {
    "path": "/",
    "matchPaths": ["/pay", "myotherclientsideroute"]
  }
]
And then changing |
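One way to read the proposal above is sketched below. It assumes the hypothetical `matchPaths` array field from this comment, which is not an existing Gatsby option, and `resolvePage` and the sample routes are illustrative names only. Static pages never appear in the manifest; only the explicitly listed client-only routes are mapped back to their owning page.

```javascript
// Hypothetical manifest shape from the proposal: one entry per client-only page,
// listing the exact routes it owns. Static pages are not listed at all.
const manifest = [
  { path: "/", matchPaths: ["/pay", "/my-other-client-side-route"] },
];

// Build a route -> owning page lookup once, up front.
const routeToPage = new Map();
for (const entry of manifest) {
  for (const route of entry.matchPaths) {
    routeToPage.set(route, entry.path);
  }
}

// Listed client-only routes resolve to their owning page;
// everything else is assumed to be a statically generated page.
function resolvePage(pathname) {
  return routeToPage.has(pathname) ? routeToPage.get(pathname) : pathname;
}

console.log(resolvePage("/pay")); // "/"
console.log(resolvePage("/buy-buy-buy")); // "/buy-buy-buy"
```

Under this assumption the manifest grows with the number of client-only routes rather than with the number of static pages.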
I also bumped into this problem with 25k+ pages. As @ConorLindsay94 suggested, commenting out the mentioned block of code solved it for me too. Update [2020-10-06]: No, this move fixed my issues only partially. In the end, I rolled back to 2.15.22 (Sep. 2019 version) to make localizations work without having a huge match-paths.json. |
@KyleAMathews @wardpeet is this something that you would be happy having opensource contributions for? Or is it something you're already working on? If we were to contribute, are there any specific areas to look out for, or has our (@ConorLindsay94) initial work highlighted the main area? |
@pieh @wardpeet we have noticed that as a result of changes in #25057 that the |
Hmmm, this PR shouldn't have such impact - what this PR should have done is just change from using |
Using default it seems like I wonder - this might be a result of changes to the webpack chunking setup that we made some time ago? If that would be the case then Other option is that there is a regression, but it's conditional - there might be some specific setup that causes this - removing |
@pieh apologies, it was a false positive (true negative for this issue, I suppose). I have run various tests and it does seem that webpack is now putting I've run tests using different
When the array gets to a certain size, webpack is creating an independent chunk for the It seems the change in webpack behavior is between 2.23.20 and 2.23.21, which is where In fact, now having multiple large |
Ok, so this seems like combination of things here. Introduction of virtual modules (at least the way they were implemented) cause |
#25720 is PR that should fix the chunk splitting change (regression?) |
@pieh what's the latest on this? This still feels very problematic for me (and others); dynamic/templated page creation results in a ballooning of matchPaths. I've suggested in another thread possibly relinquishing control of matchPath to a function that we (consumers of Gatsby) could provide. That would allow far more succinct code to be generated. |
We've hit this issue with 65k pages, where the bundle size increased to 8MB. The other thing about this issue is that it basically breaks incremental builds. A change in page path, or creating/deleting new pages changes the list in app-hash.js, which will always result in a full rebuild of all pages (the list changes, so the hash in app-hash.js filename will change, which is then included in all html pages...). Is there some workaround for this other than completely getting rid of matchPath usage in a project? |
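The invalidation chain described above can be illustrated with a minimal sketch. `bundleHash` is a hypothetical stand-in for whatever content hash ends up in the app-[hash].js filename; md5 and the 8-character truncation are arbitrary choices made for this example, not Gatsby's or webpack's actual scheme.

```javascript
const crypto = require("crypto");

// Stand-in for the content hash embedded in the app bundle's filename.
function bundleHash(matchPaths) {
  return crypto
    .createHash("md5")
    .update(JSON.stringify(matchPaths))
    .digest("hex")
    .slice(0, 8);
}

const before = [
  { path: "/a/", matchPath: "/a/" },
  { path: "/", matchPath: "/*" },
];

// Adding one page adds one entry to the list that is bundled into app.js.
const after = [
  { path: "/a/", matchPath: "/a/" },
  { path: "/b/", matchPath: "/b/" },
  { path: "/", matchPath: "/*" },
];

console.log(bundleHash(before) !== bundleHash(after)); // true
```

Because every HTML page references the app bundle by its hashed filename, a changed hash means every HTML file has to be rewritten, even if only one page was added.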
@kszot-ref curious what you're doing that results in so many matchPaths? Perhaps there's a cheaper way to accomplish the same goal. |
@KyleAMathews I basically create 65k pages through "createPage" method inside gatsby-node.js file. If I pass a matchPath to at least one of those pages during their creation, my app.js balloons up to 8MB from 300kb because it generates the matchPath array that contains every single page and places it inside the app-hash.js. In general I have only 8 matchPaths set - for some specific pages that use reach/router for dynamic views. |
Wait... the array includes every page not just the few matchPath records? What does the data look like? |
@KyleAMathews Yes, every page is in there. The array looks like this:
and it goes on to include every single page. So, basically each "path" gets assigned its own "matchPath" with the same value as "path". The few matchPaths that were set manually are in there too of course. |
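To get a feel for the size involved, the sketch below builds a list shaped like the one described (each "path" paired with an identical "matchPath") and compares its serialized size with a list holding only an explicitly set matchPath. The page paths and the `buildStaticEntries` helper are made up for illustration; real paths will differ in length.

```javascript
// Build entries shaped like the described array: matchPath equals path.
function buildStaticEntries(pageCount) {
  const entries = [];
  for (let i = 0; i < pageCount; i++) {
    const path = `/products/item-${i}/`; // hypothetical page path
    entries.push({ path, matchPath: path });
  }
  return entries;
}

// Only the manually configured matchPath (a single example entry here).
const explicitOnly = [{ path: "/app/", matchPath: "/app/*" }];

// What ends up in the bundle once any matchPath is set: every page plus the explicit entry.
const withAllPages = buildStaticEntries(65000).concat(explicitOnly);

const explicitBytes = JSON.stringify(explicitOnly).length;
const allPagesBytes = JSON.stringify(withAllPages).length;

console.log(`explicit only: ${explicitBytes} bytes`);
console.log(`with all pages: ${(allPagesBytes / 1e6).toFixed(1)} MB`);
```

The exact figures depend on path lengths, but the list scales linearly with page count, which is consistent with a multi-megabyte app bundle at 65k pages.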
I see this too. For added context, I see this when using
languages.forEach(function (language) {
  var localePage = generatePage(true, language);
  var regexp = new RegExp("/404/?$");
  if (regexp.test(localePage.path)) {
    localePage.matchPath = "/" + language + "/*";
  }
  createPage(localePage);
});
This is adding matchPath to 404 pages, but results in gatsby adding every single route to This creates a single page for each language: |
looks like it's intended:
Is this really intended or a bug? |
Any updates on this? Our bundle contains 2M of slugs because we have a 404 page in multiple locales, and this is quite bad for web vitals. If we leave out the matchPath, the 404s still work, except it will reset the URL to e.g. /da-dk/404. The source of this problem is matchPath when creating the 404s: Basically: Is there not a less expensive way to not reset the URLs on 404?
|
here is a small reproduction of the use case for localised 404 pages: 5 locales, 1 When I build 10000 pages per locale the resulting match-paths.json takes 90% of the bundle, so the more pages and locales we have the bigger the bundle will be code: https://github.com/kdichev/gatsby-match-paths-issue PS. If I build the project to output 100k pages the cold builds are slower with no matchPath: 82s |
I can confirm that adding a single page with matchPath (for example a 404 page with match of |
If anyone would like to brainstorm some possible fixes for this and work towards a solution, grab some time to meet: https://calendly.com/kyle-gatsby/30min?back=1&month=2021-05 |
I'm having the same problem. I'm creating an ecommerce website, where we pre-render some products and leave others out. Our folder structure looks something like:
The problem I'm having is that the list of all products are being added to I personally don't mind receiving a 404 for |
For those who are still having this problem, I created a plugin that addresses it along with other performance improvements. The idea is to use the server for routing. It uses createRedirect for the page-data.json files so we don't need to ship all of these paths to the frontend. You can check it out here: https://github.com/vtex/faststore/tree/master/packages/gatsby-plugin-performance. |
This plugin has a conflict with 'gatsby-plugin-meta-redirect'. "gatsby-plugin-meta-redirect" threw an error while running the onPostBuild lifecycle: ENOTDIR: not a directory, open '/public/page-data/en/nz/page-data.json/index.html' |
Summary
We have recently noticed our app.js bundle has shot up in size, and after closer inspection it looks like the majority of the code is data relating to matchPaths.
Relevant information
matchPath
s.createPage
.matchPath
array.
I think I may have tracked down the code that does this to getMatchPaths in gatsby/src/bootstrap/requires-writer.js. When I debug, matchPathPages has a length of 25237, and then I think this matchPath data is written to match-paths.json. Would this have an effect on the app.js bundle size?
Environment (if relevant)