Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP::URI.parse doesn’t set default port #297

Closed
sferik opened this issue Jan 19, 2016 · 14 comments
Closed

HTTP::URI.parse doesn’t set default port #297

sferik opened this issue Jan 19, 2016 · 14 comments
Labels
Milestone

Comments

@sferik
Copy link
Contributor

sferik commented Jan 19, 2016

The http gem changed to how it creates URIs between version 0.7 and 0.8. Versions 0.7 and earlier used the standard URI library, while more recent versions use a custom HTTP::URI, which descends from Addressable::URI.

After upgrading http dependency in the twitter gem from 0.6 to 0.8 (and now to 1.0), some code that expected the default port to be set is now broken. You can read a detailed write up of the issue, with the relevant portion reproduced below:

http gem 0.7.x and earlier
[38] pry(main)> uri = URI('https://stream.twitter.com')
=> #<URI::HTTPS:0x007fa54f8d38e8 URL:https://stream.twitter.com>
[39] pry(main)> uri.port
=> 443
http gem 0.8.x and later
[40] pry(main)> uri = HTTP::URI.parse('https://stream.twitter.com').normalize
=> #<Addressable::URI:0x3fd2a57bd650 URI:https://stream.twitter.com/>
[41] pry(main)> uri.port
=> nil

@tarcieri @zanker @ixti How would you feel about adding back the default ports (80 for HTTP and 443 for HTTPS) if an explicit port is not specified?

Quoting from RFC 7230, section 2.7.1 on http URI Scheme:

If the port subcomponent is empty or not given, TCP port 80 (the reserved port for WWW services) is the default.

And the subsequent section on the https URI Scheme:

All of the requirements listed above for the "http" scheme are also requirements for the "https" scheme, except that TCP port 443 is the default if the port subcomponent is empty or not given…

If we all agree this is the correct behavior, I will work on a patch.

@tarcieri
Copy link
Member

Definitely sounds like the correct behavior to me, and a bug in Addressable...

@sferik
Copy link
Contributor Author

sferik commented Jan 19, 2016

@tarcieri I just reading the Addressable source code. They solve this by providing default_port and inferred_port instance methods on Addressable::URI. I guess it’s okay that we inherit that interface for HTTP::URI. It was just surprising to me that the port method would return nil when a port is not set explicitly, especially since that was not the behavior of earlier versions of this library. To me, it would make more sense to have an explicit_port method that would be nil if the port is not set explicitly.

@zanker
Copy link
Contributor

zanker commented Jan 19, 2016

I'd agree with @tarcieri. It's annoying that it broke between the two, but returning a nil makes sense to me. Having an explicit method is 👍

@tarcieri
Copy link
Member

@sferik hmm, I think it's a bit strange that HTTP::URI diverges from URI like that. There was also some recent discussion about removing Addressable due to performance reasons.

@ixti
Copy link
Member

ixti commented Jan 19, 2016

@sferik We can do that in our inherited URI class

@ixti
Copy link
Member

ixti commented Jan 19, 2016

@tarcieri re removing Addressable. Before doing that we should first do our own research.
Because Addressable does pretty awesome job in:

  • IDNA support Addressable::URI.parse "http://привет-мир.рф"
  • UTF support https://en.wikipedia.org/wiki/A_Coruña (see: Add support for non-ascii URIs #197)
  • nice utilities like #join

Yes, Addressable::URI.parse slower than URI.parse.
Yes Addressable takes pretty much RAM 1.6MB...

But I don't see how something that simply does not fit our need but lightweight might be better ;))

@tarcieri
Copy link
Member

@ixti sure, I don't think there's even an open issue about removing Addressable, just some discussion. I was just throwing it out there as something that has come up in the past.

@ixti
Copy link
Member

ixti commented Jan 20, 2016

@tarcieri Yeah. I guess I wrote that up mostly for myself ;)) as in that discussion I was on the side of removing it :D

@ixti
Copy link
Member

ixti commented Jan 20, 2016

For reference. Discussion @tarcieri and I were talking about is: #197 (comment)

@tarcieri
Copy link
Member

@sferik do you still consider this an open problem? Should we perhaps open a new one about getting rid of Addressable?

@ixti getting rid of Addressable is something we should consider before a 2.x release...

This was referenced Mar 19, 2016
@tarcieri
Copy link
Member

I made an issue for a 2.0.0 release: #321

@sferik
Copy link
Contributor Author

sferik commented Mar 20, 2016

@tarcieri Thanks. Other than this, I don’t have any 2.0 blockers.

@tarcieri tarcieri added this to the v2.0 milestone Mar 20, 2016
@tarcieri tarcieri added the Bug label Mar 20, 2016
@ixti
Copy link
Member

ixti commented Mar 20, 2016

I don't think we should get rid of Addressable completely. Maximum what I think we should do (if we want to get rid of hard addressable dependency) is to provide our own class for URI, that will use internally stdlib URI by default and will be able to handle URIs with addressable: HTTP::URI.use :addressable.

@tarcieri
Copy link
Member

Yeah, let's get rid of HTTP::URI < Addressable::URI at least. It should use object composition instead of a subclass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants