Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide alternative matchPath routing strategies for different usages. #19228

Closed
mi-na-bot opened this issue Nov 3, 2019 · 8 comments
Closed
Labels
stale? Issue that may be closed soon due to the original author not responding any more. topic: performance Related to runtime & build performance

Comments

@mi-na-bot
Copy link
Contributor

mi-na-bot commented Nov 3, 2019

Summary

There are (at least) two broad categories of use cases for client side routing:

  • React apps embedded within Gatsby
  • Fallback/catch-all route for a set of static pages, like hierarchical 404s, or a dynamically rendered backup page for items sparsely built from an API.

Presently static pages within a matchPath wildcard (ie. *) route are added as individual items to match-paths, which is great for app developers because it prevents a 404 request for page-data.json for every url matched by the wildcard. Correctly handling this is essential for Gatsby to be a viable alternative to things like create-react-app.

The fallback wildcard route suffers with this strategy, because the mostly static content does not use the new page-data architecture. Instead a very large match-paths file is created. For this use case, querying for a static page first and finding a 404 for page-data.json is an exceptional event that feels semantically correct.

One strategy to help fallback wildcard users would be to provide whatever optimizations are possible, such as loading static paths in match-paths to a more a more O(log n) flavored data structure on the client side.

A much better strategy would be to recognize that the use cases are fundamentally different. Providing some annotation to indicate that a wildcard route is a backup would allow the optimal strategy to be used in all cases. It might even be possible to heuristically pick a default when a wildcard route contains a significant number of static pages.

Implementing this would be relatively simple by adding another catalog of fallback wildcard routes, such as match-paths-after containing all fallback annotated paths. Paths would then be searched: match-paths -> query page-data folder -> match-paths-after. Logic for handling 404s could be updated/replaced with match-paths-after by always placing a route (/* -> 404.js) at the end of match-paths-after.

Basic example

Pages with matchPath would be annotated in gatsby-node.

exports.onCreatePage = async ({ page, actions }) => { 
  const { createPage } = actions; 
  if (page.path.match(/^\/article-api-backup/)) { 
    page.matchPath = '/articles/:articleId';
    page.matchPathIsBackup = true;
    createPage(page); 
  } 
}; 
exports.onCreatePage = async ({ page, actions }) => { 
  const { createPage } = actions; 
  if (page.path.match(/^\/404-ru/)) { 
    page.matchPath = '/ru/*';
    page.matchPathIsBackup = true;
    createPage(page); 
  } 
}; 

Motivation

Picking just one strategy to handle wildcards will result in an expensive compromise between major groups of users. A nuanced implementation would finally eliminate all friction caused by the switch to the page-data folder.

@mi-na-bot
Copy link
Contributor Author

There was a brief discord conversation on this topic. .

Different createPage function for 404s

One suggestion was to offer an alternative action to createPage for 404-style-paths. https://discordapp.com/channels/484383807575687178/564846567463321600/640524378370473984

create404({path: "/thing/*", component: "my404.tsx", context: { whatever: "beans" }})

Allow page.matchPath to be a string or object with parameters.

@pieh very respectfully offered that the page.matchPathIsBackup syntax would be difficult to document. This would have all of the docs under createPage.matchPath.

exports.onCreatePage = async ({ page, actions }) => { 
  const { createPage } = actions; 
  if (page.path.match(/^\/404-ru/)) { 
    page.matchPath = {path:'/ru/*', backupPath:true};
    createPage(page); 
  } 
}; 

@gatsbot
Copy link

gatsbot bot commented Nov 24, 2019

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.

If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

@gatsbot gatsbot bot added the stale? Issue that may be closed soon due to the original author not responding any more. label Nov 24, 2019
@pvdz
Copy link
Contributor

pvdz commented Nov 26, 2019

You're right that we need to address the large match-path.json for sites with many pages. The current situation is an organc side effect of moving to individual page jsons.

I like the idea of a manually driven api for 404-kinds of fallbacks. Especially if that means we can cut down mega jsons to a single regex ;) I don't have a thorough understanding of the entire system (in this context) yet, but from what I understand we should be able to have a set of manually crafted regexes that could go before the current match-path system, which would prevent a forest of '/path/x': 'path/x' json entries.

The docs can be solved by examples and tutorials. This is not an issue for the average user and I think that if you really want to achieve this, and spend a little time on going through the docs and examples, that you can get that to work.

@pvdz pvdz added not stale and removed stale? Issue that may be closed soon due to the original author not responding any more. labels Nov 26, 2019
@pvdz pvdz self-assigned this Nov 26, 2019
@LekoArts LekoArts added topic: performance Related to runtime & build performance and removed not stale labels May 25, 2020
@wardpeet
Copy link
Contributor

It's not as simple as you think it is. We first had a 404 fetch in place but we got a lot of pushback because of analytics projects not handling 404's correctly. More info #16097

Not saying that we shouldn't fix this but I feel the only way we can is do it on the server with plugins like gatsby-plugin-netlify, ...

@pvdz pvdz removed their assignment Sep 22, 2020
@ProteanCode
Copy link

ProteanCode commented Oct 1, 2020

It's not as simple as you think it is. We first had a 404 fetch in place but we got a lot of pushback because of analytics projects not handling 404's correctly. More info #16097

Not saying that we shouldn't fix this but I feel the only way we can is do it on the server with plugins like gatsby-plugin-netlify, ...

Can we at least have something to work with on a server side? Some header that could tell us whenever that 404 is real or not? Something like X-GATSBY-ROUTED?

Without that my prometheus / grafana dashboard would blow up with these 404s.

Edit: Actually i ran gatsby build and put it behind nginx and there is no more 404

My simplest config is:

server {
    listen 80;
    listen [::]:80;

    server_name localhost;
    root /usr/share/nginx/html;
    index index.html;

    location / {
         try_files $uri $uri/ /$is_args$args;
    }

    location ~ /\.ht {
        deny all;
    }
}

This also created an inverted problem - a real 404 are also not logged by the nginx server

@mi-na-bot
Copy link
Contributor Author

mi-na-bot commented Oct 2, 2020

It might be helpful if Gatsby could provide a standardized/documented manifest of what routing it expects in the build artifacts.

My plan, before we switched to a full SSR solution, was to determine what client-side routes existed (to send 404s) using Lambda@Edge, where we were already handling the other routing using a hacked together rewrite function.

@github-actions
Copy link

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 60 days of inactivity. It’s been at least 20 days since the last update here.
If we missed this issue or if you want to keep it open, please reply here.
As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

@github-actions github-actions bot added the stale? Issue that may be closed soon due to the original author not responding any more. label Oct 23, 2020
@github-actions
Copy link

github-actions bot commented Dec 2, 2020

Hey again!

It’s been 60 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.
Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to comment on this issue or create a new one if you need anything else.
As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks again for being part of the Gatsby community! 💪💜

@github-actions github-actions bot closed this as completed Dec 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale? Issue that may be closed soon due to the original author not responding any more. topic: performance Related to runtime & build performance
Projects
None yet
Development

No branches or pull requests

7 participants