Editorial: add some URL parsing examples #177

Merged: 3 commits, Dec 28, 2016

@annevk annevk commented Dec 19, 2016

Fixes #119.

annevk commented Dec 19, 2016

@domenic feel free to add more examples in new commits.

<td>
<td>Failure
<tr>
<td><code>https:example.invalid</code>
@domenic domenic Dec 19, 2016

Reminder to self: we should add this example, but against no base URL rather than about:blank, since I am pretty sure (will check) that it results in https://example.invalid/.
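
A quick way to check this is sketched below. It assumes a runtime whose global `URL` constructor follows the URL Standard (for example Node.js or a current browser). The variable names are illustrative only, and the base URL in the second call is borrowed from elsewhere in this thread as an assumption, not taken from the diff.

```typescript
// Assumes a global `URL` constructor that implements the WHATWG URL Standard.

// No base URL: "https:" is followed directly by the host. That is a
// validation error, but not a parse failure.
const withoutBase = new URL("https:example.invalid");
console.log(withoutBase.href); // https://example.invalid/

// Same input, but with a base URL that shares the "https" scheme: the rest
// of the input is then resolved as a path relative to the base.
const withBase = new URL("https:example.invalid", "https://example.com/demo");
console.log(withBase.href); // https://example.com/example.invalid
```

The two results differ only because of the base URL, which is why the row needs to state which base (if any) it was parsed against.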

annevk commented Dec 19, 2016

I wonder if we should move this elsewhere and add an additional column to this table with respect to validity. Or perhaps <mark> the invalid portions of input.

Then we could also address #118.

Maybe after "shorten a url’s path"?

domenic commented Dec 19, 2016

OK, added a number of my favorite examples. There were certainly more I could have added (e.g. demonstrating some of the host normalization, like IDNA and IPv4/v6 parsing, would have been cool; demonstrating file URLs would probably help), but I think this hammers home the main points: a large variety of weird URLs can parse, but some things can still cause failure.
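
As a rough illustration of that point, the sketch below runs a handful of inputs through the parser. It assumes a global `URL` constructor that follows the URL Standard; `parseOrFailure` is a hypothetical helper, and the inputs are illustrative choices rather than a copy of the rows added in this PR.

```typescript
// Assumes a global `URL` constructor that implements the WHATWG URL Standard.
// `parseOrFailure` is a hypothetical helper, not part of any API.
function parseOrFailure(input: string, base?: string): string {
  try {
    return new URL(input, base).href;
  } catch {
    // The constructor throws a TypeError when parsing returns failure.
    return "Failure";
  }
}

// Odd-looking inputs that still parse.
const weirdButParseable: string[] = [
  "https://////example.com///",  // extra slashes before the host are skipped
  "https://example.com/././foo", // single-dot path segments are dropped
  "http://0x7f.1/",              // IPv4 host written in hex with two parts
];

// Inputs for which parsing fails.
const failing: string[] = [
  "https://ex ample.org/",     // space in the host
  "https://example.com:demo",  // non-numeric port
  "http://[www.example.com]/", // brackets require an IPv6 address
];

const parsedResults: string[] = weirdButParseable.map((s) => parseOrFailure(s));
const failedResults: string[] = failing.map((s) => parseOrFailure(s));

for (const [i, input] of [...weirdButParseable, ...failing].entries()) {
  console.log(input, "->", [...parsedResults, ...failedResults][i]);
}
```

The first group should serialize to `https://example.com///`, `https://example.com/foo`, and `http://127.0.0.1/`; every input in the second group should report `Failure`.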

Regarding validity. I think the current location is pretty good. Maybe what we need is a paragraph spelling out the fact that URL syntax is all about validity, and what validity means in practice. That paragraph could link to this example, perhaps.

Adding a column for validity seems OK but we might need some more valid examples in that case, as I emphasized the invalid ones. Also I'm not sure it applies that well since we talk about validity on an absolute URL, not on an input to the URL parser, right? So e.g. the row that lists example + https://example.com/demo would not work out great, would it?

<tr>
<td><code>example</code>
<td><code>https://example.com/demo</code>
<td><code>https://example.com/example</code>
annevk (Member Author)

This one was upfront because it demonstrates a rather simple thing that everyone takes for granted.

@domenic domenic Dec 19, 2016

Yeah, I just reorged into no base, then base, then failures; it seems kind of nice that way. I don't feel strongly.

annevk (Member Author)

Hmm okay.

annevk commented Dec 19, 2016

Validity would be about the input column. Though it could be a column on its own describing the issue.

domenic commented Dec 19, 2016

I think it would be kind of confusing to say that example is invalid input in the example + https://example.com/demo demo. (That is, example by itself does not correspond to the URL syntax definitions.) So maybe a separate table of valid and invalid URLs would be good.

annevk commented Dec 19, 2016

example by itself matches https://url.spec.whatwg.org/#syntax-url-path-relative which certainly is a URL string.

domenic commented Dec 19, 2016

Oh, I didn't realize; very interesting. I think that's not usually what people mean by "valid URL" but it does have a rigorous definition, so good enough for me.

annevk commented Dec 19, 2016

There is some handwaving around the base URL that should be resolved at some point, since for an input to be a valid relative-URL string we do need a base URL.
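
The parser-side counterpart of this dependence on a base URL can be sketched as follows. Note that this shows parsing success or failure, not validity in the URL syntax sense discussed above. It assumes a global `URL` constructor that follows the URL Standard, and `tryParse` is a hypothetical helper.

```typescript
// Assumes a global `URL` constructor that implements the WHATWG URL Standard.
// `tryParse` is a hypothetical helper that returns null on parse failure.
function tryParse(input: string, base?: string): URL | null {
  try {
    return new URL(input, base);
  } catch {
    return null;
  }
}

// "example" on its own has no scheme and nothing to resolve against.
const alone = tryParse("example");
console.log(alone === null ? "Failure" : alone.href); // Failure

// With a base URL, the same input replaces the last path segment of the base.
const resolved = tryParse("example", "https://example.com/demo");
console.log(resolved === null ? "Failure" : resolved.href); // https://example.com/example
```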

@annevk annevk merged commit 0f54bdf into master Dec 28, 2016
annevk commented Dec 28, 2016

I decided to go ahead and merge this since it's an improvement over the status quo and the validity question deserves a bit more thought and study.

@annevk annevk deleted the annevk/example-url-parsing branch December 28, 2016 15:11