when http URL is missing a slash in the report #767

karlcow · 2015-10-14T23:31:24Z

This is minor. It happened once in
webcompat/web-bugs#1807

Our logic for parsing the URL and displaying the domain in the title of an issue fails when the URL has a malformed scheme separator (for example, a missing slash after https:). I'm merely filing the issue to have a record of the problem. It could be a good first patch.

To make it clearer.

The person typed/reported:

https:/amazon.com

And then our code parsed it as

https:

creating the title "https: - layout is messed up" instead of "amazon.com - layout is messed up".

hallvors · 2015-10-16T14:25:10Z

I suggest a simple JS replace in the client-side form validation code - in the URL field change handler.

daliacoss · 2016-03-03T22:37:58Z

i'll try working on this patch...

miketaylr · 2016-03-03T22:38:10Z

The code that parses a URL into a domain lives here: https://github.com/webcompat/webcompat.com/blob/master/webcompat/form.py#L127

miketaylr · 2016-03-03T22:40:16Z

(I would start with that, I think that's the right place -- @karlcow can confirm ^_^)

karlcow · 2016-03-03T22:48:53Z

And this one will fail too.
https://github.com/webcompat/webcompat.com/blob/master/webcompat/form.py#L118

I suggest creating tests for these first, then patching. :)

karlcow · 2016-03-03T22:52:29Z

input -> normalize -> domain
example.com -> http://example.com -> example.com
http:/example.com -> http://example.com -> example.com
https:/example.com -> https://example.com -> example.com
http:example.com -> http://example.com -> example.com
https:example.com -> https://example.com -> example.com
//example.com -> http://example.com -> example.com
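
The table above could be turned into a test-first sketch along these lines. This is only an illustration: the `normalize_url` body, the `SCHEME_RE` pattern and the `CASES` list are assumptions written for this example, not the code in webcompat/form.py, and only http and https are handled.

```python
import re

# Hypothetical pattern: "http" or "https", a colon, then zero or more slashes.
SCHEME_RE = re.compile(r'^(https?):/*', re.IGNORECASE)


def normalize_url(url):
    """Illustrative only; not the project's actual normalize_url."""
    url = url.strip()
    if url.startswith('//'):
        # Protocol-relative URL: default to http, as in the table.
        return 'http:' + url
    match = SCHEME_RE.match(url)
    if match:
        # Rebuild the separator so "http:/x" and "http:x" both get "://".
        return match.group(1).lower() + '://' + url[match.end():]
    # No recognised scheme at all: assume http.
    return 'http://' + url


# (input, expected normalized URL), copied from the table above.
CASES = [
    ('example.com', 'http://example.com'),
    ('http:/example.com', 'http://example.com'),
    ('https:/example.com', 'https://example.com'),
    ('http:example.com', 'http://example.com'),
    ('https:example.com', 'https://example.com'),
    ('//example.com', 'http://example.com'),
]

for raw, expected in CASES:
    assert normalize_url(raw) == expected, (raw, normalize_url(raw))
```

Writing the cases as data first means the table and the tests cannot drift apart: a new malformed input is one more tuple in `CASES`.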

daliacoss · 2016-03-04T04:26:57Z

two questions:

1. is it fair to update the "SCHEMES" tuple so that it includes the misspelled protocols, or should i create a new tuple for it?
2. @karlcow, in your examples, the output of normalize_url always contains a trailing slash. should i add this functionality to normalize_url? (i'm assuming that you would only want to add trailing slashes if there is no path after the hostname.)

edit: also, why should "https:example.com" become "http://example.com/" but "https:/example.com" become "https://example.com/"?

miketaylr · 2016-03-04T16:43:37Z

(I'll let @karlcow answer these questions -- he's more knowledgeable about representing URIs and URLs and HTTP, etc. Just be a little patient @deckycoss, Karl is based in Tokyo so it's his weekend right now).

karlcow · 2016-03-07T00:50:28Z

About the SCHEMES tuple
It's better to keep things separate: what is valid on one side, what we are helping to fix on the other. So I would recommend keeping the SCHEMES tuple as is.
About trailing slashes
To the best of my knowledge, many servers will try to add a trailing slash as a first step with a redirection. It's not mandatory though. But from time to time I have run into servers that did not answer without a trailing slash. On the other hand, we might want to go along with what the user says. Let's not take care of the trailing slash for now; if we need to fix that, we will do it later. I fixed the tests in the comment.
About https/http in the edit
Just fixed the test for this. Thanks for catching that.
About the // prefix
Added another example. This is a valid way of making a link. It's useful when a server serves both https and http. And fortunately Python knows how to deal with it.

>>> url = '//example.com/foo/bar'
>>> import urlparse
>>> urlparse.urlparse(url).netloc
'example.com'
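
For comparison, here is the same kind of check in Python 3, where `urlparse` lives in `urllib.parse` (the session above is Python 2). This is a minimal sketch, and the helper name `domain_of` is made up for this example. It suggests why the URL should be normalized before the domain is extracted: when the slashes after the scheme are missing, the network location comes back empty.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Hypothetical helper: return the network location part of a URL."""
    return urlparse(url).netloc


print(domain_of('//example.com/foo/bar'))    # example.com
print(domain_of('https://amazon.com'))       # amazon.com
print(repr(domain_of('https:/amazon.com')))  # ''
print(repr(domain_of('https:amazon.com')))   # ''
```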

daliacoss · 2016-03-07T19:48:47Z

thanks @karlcow; this is quite helpful.

karlcow added type: bug prio: good first bug labels Oct 14, 2015

daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 7, 2016

Issue webcompat#767: fixes normalize_url bug when url is missing slashes

71ed381

daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 7, 2016

Issue webcompat#767: don't add trailing slash; account for urls beginning with //

0da59cc

daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 7, 2016

Issue webcompat#767: removed extra whitespace in test_form.py

6801fe8

daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 8, 2016

Issue webcompat#767: normalize_url doesn't break when scheme is properly spelled

54a7d0a

miketaylr closed this as completed in 7aa4476 Mar 9, 2016
