Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Route encoding scheme needs a rethink #7577

Closed
benmccann opened this issue Nov 9, 2022 · 1 comment · Fixed by #7644
Closed

Route encoding scheme needs a rethink #7577

benmccann opened this issue Nov 9, 2022 · 1 comment · Fixed by #7644
Labels
bug Something isn't working
Milestone

Comments

@benmccann
Copy link
Member

benmccann commented Nov 9, 2022

Describe the bug

URLs come from the browser in a semi-encoded fashion. Our router has generally used the decoded versions of things on the file system to make for nice looking files. We then have to decode the URLs in order to match what's on the file system (i.e. decodeURI). However, we have historically run decodeURIComponent on the parameters. But it turns out the way we were doing that didn't work in all cases and in fact there's no way to actually handle the param parts differently from the non-param parts because we don't know what the param parts are until after the matching has been run which requires decoding first

I looked to see how Next handles this and apparently they don't as it's not documented and is one of the top bugs in their repo: vercel/next.js#10084

There are some fundamental limitations in mapping between a URL and file system since they have different limitations. Here are the various possible combinations:

  • Case 1: legal on FS. not encoded. (e.g. A)
  • Case 2: illegal on FS. not encoded. (e.g. *)
  • Case 3: legal on FS. sent encoded (e.g. )
  • Case 4: illegal on FS. sent encoded (e.g. <)
  • Case 5: kit special character (e.g. [)
  • Case 6: URL encoding character (i.e. %)

With the characters that are illegal to put on the file system or are kit special characters, we have two choices: make them encoded on the FS or only allow them in parameters. Let's choose the latter as it will constrain our design less, make foreign characters much nicer, and be less breaking than forcing URLs to be encoded.

Now the big question is how can we decodeURIComponent URL parameters?

For reference, the list of characters replaced by decodeURIComponent but not decodeURI is:

  • / - illegal on FS. must be sent encoded
  • ? - illegal on FS. must be sent encoded
  • : - can be illegal on FS
  • # - must be sent encoded. this needs to be made illegal in routes eventhough the FS accepts it
  • ;
  • ,
  • @
  • &
  • =
  • +
  • $

We can't just make these characters illegal, because here we're talking about URL parameters and we need to be able to represent anything. We can't simply do a decodeURIComponent upfront and then match via regex because of stuff like / (imagine the route /a/[b]/c with the URL /a/b%2fc). I'll also note here that special casing stuff like / or %2f is the road to hell and is how we're in our current predicament (#7521 tried to special case %25), so we very much don't want to do that either.

Suggested fix

We can decodeURI up-front once and match with regex. But then when we extract the parameters we need to do it from the original URL instead of the decoded URL. We were very close to doing the right thing before #7521 except for that last part - we were trying to extract the parameters from the already decoded URL and that won't work. The other change we should implement is to make the following characters illegal outside of route parameters: #<>:"/\|?*. While technically some of these would work on Linux and Mac, it's probably not a good idea to allow routes that would break if ever run on Windows. I also think we should say that if you want to use [ it should be inside a parameter (essentially done by #7550, but we may need doc upstead)

Reproduction

Logs

No response

System Info

`master`

Severity

ugghhhh

Additional Information

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants