URL.canParse() #713

hemanth · 2022-10-17T22:01:25Z

It would be good to have a URL.isValid method?

Today we have to do something like:

const isValidUrl = (s) => {
    try {
      new URL(s);
      return true;
    } catch (err) {
      return false;
    }
};

The proposal is to have something like:

URL.isValid(<URL>, [protocol1, protocol2]);

URL.isValid('https://h3manth.com'); // true
URL.isValid('https://h3manth.com', ['file']); // false

The text was updated successfully, but these errors were encountered:

jasnell · 2022-10-17T22:17:50Z

"Valid" here will need to be defined as the spec defines a clear notion of a "validation error". Specifically, a URL can parse successfully but still be considered "invalid" per the spec. Does "isValid(...) === true" mean that the URL parses successfully or does it mean it parsed without any validation errors?

jasnell · 2022-10-17T22:24:47Z

As far as bikeshedding the API is concerned, I would likely prefer something more like:

URL.test(input[, base]);

With the semantics specifically only limited to: Will this parse successfully.

domenic · 2022-10-18T01:12:20Z

Please see https://whatwg.org/faq#adding-new-features, especially step 1 and 2.

annevk · 2022-10-18T06:40:19Z

With the semantics specifically only limited to: Will this parse successfully.

How would that be different from what new URL() does?

jasnell · 2022-10-18T12:43:38Z

How would that be different from what new URL() does?

The key difference would be no allocations, in particular no Error object created if it does not parse, and therefore no expensive stack capture.

hemanth · 2022-11-04T20:30:11Z

@domenic Should we provide more use-cases here? All that we are trying to solve here is to have a convenient way to identify if a given URL is valid or not.

@jasnell test reminds me of RegExp.prototype.test like testing against something, isValid sounds more intuitive?

Either way if we agree this is a problem space, happy to discuss more.

annevk · 2022-11-05T06:46:10Z

@hemanth well it would help if you would first clarify what semantics you are after as validity and "can parse" are very different things. (But yeah, depending on the answer, use cases and demonstrating there is a need would help.)

jasnell · 2022-11-05T12:42:41Z

The primary use case I've encountered for this is really functions like the isValid() in the OP at the top. Specifically, code that wishes to verify that a given input can be handled as a URL (can parse) without incurring the additional cost of creating either a URL object or an Error object with an expensive to generate stack. I've encountered this pattern a number of times.

Most often, I'd say, the check is performed using a regex, which is problematic in that it doesn't match the parsing semantics and can be difficult to make correct. There's an old module on npm called is-url that has over 2.5 million downloads per week to address this specific problem using an incomplete regex approach.

For me, I would 100% support an API that checks if an input can be parsed as a URL. I have no use case for further validity checks.

Alternatively, I would be good with a new URL constructor variation that did not throw on unparseable input so as to avoid the expensive operation to acquire the stack.

hemanth · 2022-11-05T21:45:51Z

@annevk My thoughts resonated completely with @jasnell. There is also valid-url another node module which is dependent on regex and got around 2,788,820 downloads last week.

This is a very common pattern across multiple code base and would be beneficial to multiple developers to have a connivence method with in the standards.

If there is a template repo that I can fork and add more structure to the challenge and propose potential solutions, I am happy to collaborate.

annevk · 2022-11-07T09:03:24Z

See also #372 although that would not return a boolean presumably.

Given that there's popular libraries that do something like this this is worth doing I think, but this is not validation. We should either call this something like canParse and return a boolean or parse and return null or a URL object.

domenic · 2022-11-07T09:10:58Z

It's not clear to me solving this at the level of a new API is the best idea. Wouldn't it be better to make try / catch faster? In particular, if you use catch { } instead of catch (e) { }, that should be a reliable signal to the engine that there's no need to construct a stack, which is apparently expensive (benchmarks pending).

ljharb · 2022-11-07T23:47:37Z

That wouldn’t likely help you if an error is caught by a calling function, or an async function, or any of the Promise methods.

domenic · 2022-11-08T00:30:47Z

Right, you would put the catch there immediately surrounding the new URL call, just like you would put the URL.isParseable call there immediately surrounding the string input.

jasnell · 2022-11-08T00:37:00Z

I'd argue that it's definitely better to make capturing that stack more efficient just in general but I don't think that eliminates to usefulness of the proposed new API. For instance, suppose I am filtering a list of strings to acquire only the ones that are parseable URLs e.g. array.filter(URL.isParseable) is a lot more ergonomic than having to wrap the check in another function that includes a try/catch for every iteration.

ljharb · 2022-11-08T00:46:29Z

@domenic that would certainly be an interesting unobservable optimization, for a lot of patterns - but i don't think the real problem here is performance, it's ergonomics. try/catching something is a wildly icky way to get a boolean.

karwa · 2022-11-08T02:18:14Z

For the ergonomics of try/catch, perhaps that's something that should be solved at the language level? For instance, some languages have a try? feature which returns nil on error. Would this be useful for other, non-URL APIs?

ljharb · 2022-11-08T02:40:44Z

There's not that many predicates that need to be implemented in the form of a try/catch, so i'm not sure if that'd be worth additional syntax.

annevk · 2023-03-06T02:22:37Z

@domenic do your comments above mean you disagree with #372 (comment) now?

I find @jasnell's comment above about being able to use a non-throwing method in a number of scenarios compelling.

Also, https://www.npmjs.com/package/valid-url, https://www.npmjs.com/package/is-url-superb, and https://www.npmjs.com/package/is-url suggest an authoritative method would be useful for the ecosystem.

@valenting @ricea @achristensen07 thoughts?

ricea · 2023-03-06T06:34:47Z

I'm not convinced that canParse() is the check that the people using those libraries really want. For example, we can parse a::<!>:: as a URL, but you probably wouldn't want to accept that in an input field.

In the absence of benchmarks I'm also not convinced by the performance argument.

Edit: Actually, even if there were benchmarks, why do we think this is performance critical? Who needs to verify whether a million URLs parse anyway?

Fixes #713.

annevk · 2023-03-16T17:43:26Z

I chatted with @ricea about this and convinced him those libraries do in fact want this functionality.

PR: #763.

Tests: web-platform-tests/wpt#39069. Fixes #713.

yisibl · 2024-04-17T12:00:42Z

If you have the following code, the url actually needs to be parsed twice.

const url = 'http://foo.com'
let parsed_url = null
if (URL.canParse(url)) {
  parsed_url = new URL(url)
}

Looking back, now that we've added URL.parse() to return null if it's invalid(#825), why do we need URL.canParse()?

annevk · 2024-04-17T14:11:33Z

I'm not a big fan of folks resurrecting closed issues. Please consider filing a new issue in the future if there's something you wish to discuss.

As for your question, I don't think that's the only case where URL.canParse() is useful. E.g., if you have to decide whether to store a given string based on whether or not it parses as a URL. URL.canParse() is nice there as it doesn't require allocating any URL object.

annevk added needs implementer interest Moving the issue forward requires implementers to express interest addition/proposal New features or enhancements topic: validation Pertaining to the rules for URL writing and validity (as opposed to parsing) labels Oct 18, 2022

annevk removed the topic: validation Pertaining to the rules for URL writing and validity (as opposed to parsing) label Nov 7, 2022

azu mentioned this issue Dec 16, 2022

「Check URL is valid URL」 is invalid azu/url-cheatsheet#9

Closed

annevk added the topic: api label Mar 6, 2023

annevk changed the title ~~Proposal: URL.isValid~~ URL.canParse() Mar 6, 2023

annevk added a commit that referenced this issue Mar 16, 2023

Add URL.canParse()

e3e101a

Fixes #713.

annevk mentioned this issue Mar 16, 2023

Add URL.canParse() #763

Merged

4 tasks

annevk removed the needs implementer interest Moving the issue forward requires implementers to express interest label Mar 20, 2023

annevk closed this as completed in #763 Mar 20, 2023

annevk added a commit that referenced this issue Mar 20, 2023

Add URL.canParse()

ae3c28b

Tests: web-platform-tests/wpt#39069. Fixes #713.

shgysk8zer0 mentioned this issue Jun 2, 2023

Add URL.canParse() shgysk8zer0/polyfills#15

Closed

Jamesernator mentioned this issue Mar 31, 2024

url: implement parse method for safer URL parsing nodejs/node#52280

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

URL.canParse() #713

URL.canParse() #713

hemanth commented Oct 17, 2022

jasnell commented Oct 17, 2022

jasnell commented Oct 17, 2022

domenic commented Oct 18, 2022

annevk commented Oct 18, 2022

jasnell commented Oct 18, 2022

hemanth commented Nov 4, 2022

annevk commented Nov 5, 2022 •

edited

Loading

jasnell commented Nov 5, 2022

hemanth commented Nov 5, 2022

annevk commented Nov 7, 2022

domenic commented Nov 7, 2022

ljharb commented Nov 7, 2022

domenic commented Nov 8, 2022

jasnell commented Nov 8, 2022

ljharb commented Nov 8, 2022

karwa commented Nov 8, 2022 •

edited

Loading

ljharb commented Nov 8, 2022

annevk commented Mar 6, 2023

ricea commented Mar 6, 2023 •

edited

Loading

annevk commented Mar 16, 2023

yisibl commented Apr 17, 2024 •

edited

Loading

annevk commented Apr 17, 2024

URL.canParse() #713

URL.canParse() #713

Comments

hemanth commented Oct 17, 2022

jasnell commented Oct 17, 2022

jasnell commented Oct 17, 2022

domenic commented Oct 18, 2022

annevk commented Oct 18, 2022

jasnell commented Oct 18, 2022

hemanth commented Nov 4, 2022

annevk commented Nov 5, 2022 • edited Loading

jasnell commented Nov 5, 2022

hemanth commented Nov 5, 2022

annevk commented Nov 7, 2022

domenic commented Nov 7, 2022

ljharb commented Nov 7, 2022

domenic commented Nov 8, 2022

jasnell commented Nov 8, 2022

ljharb commented Nov 8, 2022

karwa commented Nov 8, 2022 • edited Loading

ljharb commented Nov 8, 2022

annevk commented Mar 6, 2023

ricea commented Mar 6, 2023 • edited Loading

annevk commented Mar 16, 2023

yisibl commented Apr 17, 2024 • edited Loading

annevk commented Apr 17, 2024

annevk commented Nov 5, 2022 •

edited

Loading

karwa commented Nov 8, 2022 •

edited

Loading

ricea commented Mar 6, 2023 •

edited

Loading

yisibl commented Apr 17, 2024 •

edited

Loading