Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using Whatwg url to parse relative urls received via http may fail #35458

Closed
Flarna opened this issue Oct 2, 2020 · 9 comments
Closed

using Whatwg url to parse relative urls received via http may fail #35458

Flarna opened this issue Oct 2, 2020 · 9 comments
Labels
whatwg-url Issues and PRs related to the WHATWG URL implementation.

Comments

@Flarna
Copy link
Member

Flarna commented Oct 2, 2020

  • Version: 14.13.0
  • Platform: all
  • Subsystem: url

What steps will reproduce the bug?

I used new URL() to parse the relative url received by my HTTP server as indicated in the docs. If the relative url is // (e.g. I use http://localhost:3000// in Firefox) this fails with ERR_INVALID_URL.
If the deprecated url.parse() is used it works.

Standalone reproducer:

const http = require("http")
const port = 3000
const server = http.createServer((req, res) => {
  console.log("pathname:", new URL(req.url, `http://${req.headers.host}`).pathname)
  res.end()
}).listen(port, () => http.get({ port, path: "//" }))

How often does it reproduce? Is there a required condition?

always

What is the expected behavior?

parsing works

What do you see instead?

parsing fails

Additional information

I'm not sure if this is a problem in URL parser. In case the URL parser is working as intended the doc should give some hints that using URL for parsing relative urls has some pitfalls.

Refs: #34978
Refs: #30830

@Flarna Flarna added the whatwg-url Issues and PRs related to the WHATWG URL implementation. label Oct 2, 2020
@jasnell
Copy link
Member

jasnell commented Oct 2, 2020

What are the input values for req.url and http://${req.headers.host}, respectively?

@Flarna
Copy link
Member Author

Flarna commented Oct 2, 2020

req.url is //
http://${req.headers.host} is http://localhost:3000

it seems like // is interpreted as an absolute url and therefore the second arg is ignored
Edit: I think it's interpreted as scheme-relative-special-url-string

@devsnek
Copy link
Member

devsnek commented Oct 2, 2020

We really need to just add a separate class for http request url formats. whatwg urls are not made for it.

@watilde
Copy link
Contributor

watilde commented Oct 10, 2020

Thank you for finding this. I just quickly checked with the latest reference implementation of URL and could reproduce the same error. However, it works with new URL('http://localhost:3000//') so that I just opened a new issue as feedback to ask for inputs on the upstream: whatwg/url#553.

That being said we can pass req.url as a part of input for URL to make it work:

const http = require("http")
const port = 3000
const server = http.createServer((req, res) => {
  console.log("pathname:", new URL(`http://${req.headers.host}${req.url}`).pathname)
  res.end()
}).listen(port, () => http.get({ port, path: "//" }))

/*
=> pathname: //
*/

@Flarna
Copy link
Member Author

Flarna commented Oct 10, 2020

Before building an absolute url it's needed to verify that the recevied url is relative. Absolute urls are also possible in HTTP (see https://tools.ietf.org/html/rfc7230#section-5.3)

@Flarna
Copy link
Member Author

Flarna commented Oct 19, 2020

Should we consider to adapt to to use url.parse() again even if it is deprecated? It seems to be that using URL is not a good advice in this use case.

Which usecase will not work if url.parse() is used on incoming HTTP?

@phawxby
Copy link

phawxby commented Dec 1, 2020

Generally speaking we're continuing to use url.parse because new URL API is lacking in quite a few key areas. It's not ready for prime-time yet. As far as I'm concerned the legacy API shouldn't be marked as deprecated until the performance and compatibility issues are ironed out.

@styfle
Copy link
Member

styfle commented Jan 19, 2023

Unfortunately, url.parse() is marked deprecated again

@anonrig
Copy link
Member

anonrig commented Feb 3, 2023

Closing this issue, since this is the intended behavior according to WHATWG URL specification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
whatwg-url Issues and PRs related to the WHATWG URL implementation.
Projects
None yet
Development

No branches or pull requests

7 participants