Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

when http URL is missing a slash in the report #767

Closed
karlcow opened this issue Oct 14, 2015 · 10 comments
Closed

when http URL is missing a slash in the report #767

karlcow opened this issue Oct 14, 2015 · 10 comments

Comments

@karlcow
Copy link
Member

karlcow commented Oct 14, 2015

This is minor. It happened once in
webcompat/web-bugs#1807

Or logic for parsing the URL and displaying the domain in the title of an issue fails when the URL is having an incorrect http. I'm merely filling the issue to have a record of the problem. It good be a good first patch.

To make it clearer.

The person typed/reported:

https:/amazon.com

And then our code parsed it as

https: 

creating a title which is https: - layout is messed up instead of amazon.com - layout is messed up

@hallvors
Copy link
Contributor

I suggest a simple JS replace in the client-side form validation code - in the URL field change handler.

@daliacoss
Copy link
Contributor

i'll try working on this patch...

@miketaylr
Copy link
Member

The code that parses a URL into a domain lives here: https://github.com/webcompat/webcompat.com/blob/master/webcompat/form.py#L127

@miketaylr
Copy link
Member

(I would start with that, I think that's the right place -- @karlcow can confirm ^_^)

@karlcow
Copy link
Member Author

karlcow commented Mar 3, 2016

And this one will fail too.
https://github.com/webcompat/webcompat.com/blob/master/webcompat/form.py#L118

I suggest to create tests first for these, then patch. :)

@karlcow
Copy link
Member Author

karlcow commented Mar 3, 2016

  • input -> normalize -> domain
  • example.com -> http://example.com -> example.com
  • http:/example.com -> http://example.com -> example.com
  • https:/example.com -> https://example.com -> example.com
  • http:example.com -> http://example.com -> example.com
  • https:example.com -> https://example.com -> example.com
  • //example.com -> http://example.com -> example.com

@daliacoss
Copy link
Contributor

two questions:

  1. is it fair to update the "SCHEMES" tuple so that it includes the misspelled protocols, or should i create a new tuple for it?
  2. @karlcow, in your examples, the output of normalize_url always contains a trailing slash. should i add this functionality to normalize_url? (i'm assuming that you would only want to add trailing slashes if there is no path after the hostname.)

edit: also, why should "https:example.com" become "http://example.com/" but "https:/example.com" become "https://example.com/"?

@miketaylr
Copy link
Member

(I'll let @karlcow answer these questions -- he's more knowledgeable about representing URIs and URLs and HTTP, etc. Just be a little patient @deckycoss, Karl is based in Tokyo so it's his weekend right now).

@karlcow
Copy link
Member Author

karlcow commented Mar 7, 2016

  1. About Schemes tuple
    It's better to keep things separated, aka what is valid from what we are helping to fix. So I would recommend to keep the SCHEMES tuple as is.
  2. About trailing slashes
    To the best of my knowledge, many servers will try to add a trailing slash as a first step with a redirection. It's not mandatory though. But I had time to time some servers who were not answering without a trailing slash. On the other hand we might want to go along with what the user says. Let's not take care of the trailing slash and if we need to fix that we will do it later. I fixed the tests in the comment.
  3. About https/http in the edit.
    Just fixed the test for this. Thanks for catching that.
  4. the //
    Added another example. This is a valid way of making a link. it's useful when a server serves both https and http. And fortunately python knows how to deal with it.
>>> url = '//example.com/foo/bar'
>>> import urlparse
>>> urlparse.urlparse(url).netloc
'example.com'

@daliacoss
Copy link
Contributor

thanks @karlcow; this is quite helpful.

daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 7, 2016
daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 7, 2016
daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 7, 2016
daliacoss added a commit to daliacoss/webcompat.com that referenced this issue Mar 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants