-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(WIP) Refactor HTTP::Client to use a transport #6001
(WIP) Refactor HTTP::Client to use a transport #6001
Conversation
It seems Transport is also a thing in Go. Since we are copying Go so much, I can't understand why so many of you are against the idea of making Crystal more typed and more similar to other typed languages. |
@asterite I'm not sure if your comment is sarcastic or just off-topic. But yes, Go has such a transport as well and so have other HTTP clients in other languages. At least this has nothing to do with the type system. I was just compiling a list of similar implementations for reference:
|
Cool, I thought Transport was Go-exclusive. |
|
Internally, It would be possible to provide a default argument for |
Huh? |
client = HTTP::Client.new(tls: true)
client.get("https://crystal-lang.org/")
client.get("http://crystal-lang.org/") # -> this wouldn't work with the previous meaning of `tls` |
Strange, I thought we only allowed URIs, not full URLs, when initializing a client. But maybe it makes sense, I don't know. In the old ways it used to connect to a server with that base URL, I don't know what's going on now, but it's probably okay... |
Not sure what you mean... URI is a superset of URL. But right, currently every client = HTTP::Client.new
client.get("https://crystal-lang.org/")
client.get("https://github.com/crystal-lang") Or keep the current behaviour to connect to one server: client = HTTP::Client.new("crystal-lang.org", 443)
client.get("/")
client.get("/docs") |
I love the first example more client = HTTP::Client.new
client.get("https://crystal-lang.org/")
client.get("https://github.com/crystal-lang") |
IMO that's too common use-case, why not use btw, tls with port |
@bararchy It depends on the use case. If you just need to query one API endpoint, it's much easier to use a client that's specifically connected to that endpoint. @Sija Because the constructor HTTP::Client.new(URI.parse("https://example.com")) # connects to default port 443
HTTP::Client.new(URI.parse("http://example.com:1234")) # connects to port 1234 |
So how does keep-alive work? |
@asterite I think that should be implemented in the transport. That's why it probably makes sense to read the response there (last paragraph in the OP). For now, the |
IMHO: this sounds awfully more complex than necessary.
No need to over-engineer Please, instead of pushing some idea with code to review, open an issue with detailed limitations and problems with |
@ysbaddaden This PR is a proof of concept and not intended to be reviewed. It's far to incomplete for that anyway. I've just been exploring this approach to improve and simplify the http client implementation. And I wanted to share that and use it to support a discussion. Maybe it would have been better to open an issue and posted the code parallel to that. In fact, I can still do that.
|
It's a bit strange because in Go a Transport (well, RoundTripper) takes a Request and returns a Response, not an IO. I think Transport will make it easier to test an HTTP::Client, too. Right now there are no (any?) tests for HTTP::Client. |
I really don't want to make things more complex here either. Why isn't If we want middleware - which is looks to me is what you really want - let's implement middleware not this. And lastly, @asterite, I work on crystal because it's not just another typed language. Making crystal into just another normal typed language, even with type inference, would completely destroy any advantage crystal has and be completely tone-deaf to the audience of crystal. |
Passing the client simply an IO can't be enough. How would you make simultaneous requests over the same client if it can't create a new IO itself? Or re-use connections to different servers? Middleware is related but not dependend on this nor could this be implemented using middleware unless you would use the middleware to completely hijack the requests-response roundtrip. |
@straight-shoota It seems there's a misunderstanding about how
This wasn't intended in the original design. You can have one request per time. To execute a next request you must consume the last response. That's also why "keep-alive" is important here: with one server, "keep-alive" is true by default (depending on the HTTP protocol of the server), and so you can keep that connection alive. It seems this PR is completely changing how That's also why doing this: client.get("http://...") makes no sense to me. You are targeting a single server, you don't need to repeat the scheme, host, etc. Just send the path. If you want to redesign |
That's why we need a pool of |
Indeed, it's far easier to think of a pool of http clients, instead of a http client which contains a pool. |
@asterite I took it that connecting to a single server was only a temporary limitation because a proper implementation is missing, but not as the final design for The class-level interface of The solution currently would to create instances of These are all features that a proper HTTP client should have in my opinion. Given that the use case "concurrently request HTTP resources from the same or different servers" does not sound pretty rare for typical Crystal applications, I am convinced that a HTTP client implementation in the stdlib should be able to support that. |
You are probably right, though I wouldn't know how to implement keep alive that way. And then an instance of an HTTP Client probably makes no sense if we go that way, because it's more useful to use the class methods. If someone wants to implement this, go ahead. |
Yes, the implementation of the connection pool is going to be very complex. I've looked through Go's I think instances of |
@straight-shoota what you want isn't a HTTP client, but a general Web Browser-like API (or What we mean with HTTP Client could be confused with a HTTP Connection, but there are still differences. For example in HTTP/2 the Connection is generic, then a Client and Server can be built on top of it (they behave differently), yet they still act over a single HTTP connection, not over many. It's important to not merge all these layers (client, protocol, connection, io), and more —the What you want to achieve is another layer, supporting different schemes such as |
@ysbaddaden I'm starting to agree with @straight-shoota . At least Go's http client can be used by multiple fibers and it works fine, so you can have multiple concurrent requests, something that right now doesn't work well... and if we don't do that, then any class that wraps an API won't be concurrent by default, and all of that logic must be implemented in each case, which is a burden. I think @waj was also interested in having HTTP::Client have a pool of connections, though I don't think he'll have time to do this, but maybe he has some time to comment on this. |
I'm starting to want to clean-sheet the whole of |
I like this idea. I have been swapping out or monkey patching the socket to do some fun things. but I would be interested in where this goes. I am still a little fuzzy on all the features that you intend to get out of this. Is it fair to say you want:
|
#codetriage This PR appears to have been a good discussion point months ago, and stalled. |
I think this can be closed. Let's continue the discussion in #6011 |
This PR is a first step to refactor HTTP::Client to make it more customizable by extracting the actual connection management to a
Transport
class.Every
HTTP::Client
uses a configurable transport to establish the actual IO connection. By default, this isHTTP::Client.default_transport
which is an instance ofDefaultTransport
and basically enables connections to any TCP endpoint (the implementation is currently very rudimentary).If
HTTP::Client
is initialized with ahost
andport
argument, it will use aTCPTransport
connecting to a fixed endpoint. This is basically the behaviour as it was until now: the Client instance is bound to a specific connection.The static methods like
HTTP::Client.get
will useHTTP::Client.default_client
by default. But they can also receive atransport
argument and create a custom client using that transport (thus it won't be reused).This proposal allows for more advanced features to be added to HTTP::Client, which should rather be a high-level interface while
Transport
does the heavy lifting.One such feature is to make requests to Unix sockets (#2735) which is already implemented in this PR.
Further enhancements would be connecting through a proxy transport as well as an advanced connection management for
DefaultTransport
which can pool and re-use connections.While still work in progress I'd like to get some feedback on the general design. Please don't comment on implementation details, I know there are many rough edges. The docs are not updated yet either.
The interface between
HTTP::Client
andTransport
probably needs to be more refined.Currently, I've basically extracted the part where a
TCPSocket
was created and replaced it withtransport.connect(uri, request)
. The transport establishes an appropriate connection and returns the IO. The decision where to connect can be based on the URI and request.I'm contemplating about moving the request and response serialization entirely to the transport. This would move more low level stuff out of the client. It would also allow the transport to easily return responses without reading them from an IO. Practical applications for this are for example a response cache or implementing a transport for the
file
scheme to serve local files as HTTP responses directly without serializing through an IO.