Upload streaming #37
Motivation:
Users may want to optimize memory usage when executing requests with big bodies, the same as with streaming download.
Modifications:
- Added ChunkProvider typealias in HTTPHandler.swift
- Added new Body enum case (stream) in HTTPHandler.swift
- Extracted body processing in HTTPHandler.swift to a separate method
- Added .stream enum processing in HTTPHandler.swift
- Added upload streaming test to SwiftNIOHTTPTests.swift
Result:
The HTTPClient library now provides methods to stream the Request body.
Thanks, that looks like a great start. I think we need to invest a bunch more in testing this functionality though.
@@ -322,4 +322,65 @@ class SwiftHTTPTests: XCTestCase {
        let res = try httpClient.get(url: "https://test/ok").wait()
        XCTAssertEqual(res.status, .ok)
    }
we need a bunch of other tests here:
- upload/download interrupted
- failed future returned from uploading & downloading
- a test that shows that the back-pressure is working
Added failure test and backpressure test. What do you mean by interruption? External cancellation?
I must be missing something... I thought `didReceivePart()` was for streaming incoming body data (e.g. GET response). Are you intending to also use this callback for streaming outgoing body data (e.g. POST)? If so, I think that is confusing to the user...
@ianpartridge no, you're not missing anything but it's important to propagate back-pressure. Let's assume we have three different nodes A, B, and C with two active HTTP connections:
where Now let's assume we're How can we (as
func didReceivePart(data: ...) {
    otherConnection.sendData(data) // blocks until `data` is fully sent
}
for asynchronous programming we can express the same concept as
func didReceivePart(data: ...) -> EventLoopFuture<Void> {
    return otherConnection.sendData(data) // returns a future that is fulfilled when the data has been sent
}
in other words: The whole thing totally works iff the API allows us to indicate when we're ready for the next chunk and that's done with the change to have
And yes, this isn't actually related to upload streaming but when I was looking at the back-pressure for the upload streaming I noticed that download streaming doesn't actually support back-pressure and therefore it's in this PR :).
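The future-returning callback described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `SlowConnection`, `Relay`, and `deliver` are hypothetical stand-ins invented for the example (they are not part of the HTTPClient library), and the 10 ms delay merely simulates a slow peer. The point is that the next part is handed over only once the future returned for the previous part has completed.

```swift
import NIO

// Hypothetical stand-in for the connection to the slower node; not a library type.
final class SlowConnection {
    private let eventLoop: EventLoop
    private(set) var bytesSent = 0

    init(eventLoop: EventLoop) {
        self.eventLoop = eventLoop
    }

    // Simulates a write that takes 10 ms; the future completes once the data is "sent".
    func sendData(_ data: ByteBuffer) -> EventLoopFuture<Void> {
        return self.eventLoop.scheduleTask(in: .milliseconds(10)) {
            self.bytesSent += data.readableBytes
        }.futureResult
    }
}

// Hypothetical relay: forwards each received part and returns the write future to the producer.
struct Relay {
    let otherConnection: SlowConnection

    func didReceivePart(_ part: ByteBuffer) -> EventLoopFuture<Void> {
        return self.otherConnection.sendData(part)
    }
}

// Stand-in for the client side: delivers a part only after the previous future has completed.
func deliver(_ parts: [ByteBuffer], to relay: Relay, on eventLoop: EventLoop) -> EventLoopFuture<Void> {
    return parts.reduce(eventLoop.makeSucceededFuture(())) { previous, part in
        previous.flatMap { relay.didReceivePart(part) }
    }
}

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let eventLoop = group.next()
let connection = SlowConnection(eventLoop: eventLoop)

var buffer = ByteBufferAllocator().buffer(capacity: 5)
buffer.writeString("hello")

// Three 5-byte parts are forwarded one after another, never concurrently.
try deliver([buffer, buffer, buffer], to: Relay(otherConnection: connection), on: eventLoop).wait()
try group.syncShutdownGracefully()
```

In a real client the role of `deliver` would be played by the channel handler, which would stop reading from the socket while a returned future is still pending.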
OK, thanks so much for the awesome explanation. So, if I'm understanding correctly, the new
Thoughts... Is the reason the future is optional because it avoids the cost of creating an already succeeded future in the case where the user doesn't care about propagating backpressure? Otherwise the future could be non-optional and we just ask the user to create a succeeded future. Would returning an enum with two cases (
I'd missed this, thanks for pointing this out @ianpartridge:
I don't think we should allow a
Hah, glad you pointed that out! I had no idea it was optional, I don't think it should be :). @artemredkin was that deliberate?
This would be slightly cheaper at run-time (than always allocating) in the case where we don't need back-pressure (rare!) and, as you point out, make the API less clear. This feels like premature optimisation to me.
It was deliberate, yes. I'll change it to non-optional.
Actually, the reason I made it optional is that there is no easy way to create a completed future without passing in an eventLoopGroup...
In general I think that's an acceptable limitation. I'm happy with the idea that any reasonable delegate implementation will need access to an event loop.
@Lukasa I wanted delegates to have simple inits without dependencies...
Neither: I think both will be misused. If we really think it's unacceptable to require the delegate have access to an event loop then I prefer non-optional promise to the enum unless the empty case is
Why?
I think having something like:
let client = HTTPClient(...)
let delegate = CountingDelegate()
client.execute(request: request, delegate: delegate)
is slightly better than:
let client = HTTPClient(...)
let delegate = CountingDelegate(eventLoopGroup: client.eventLoopGroup)
client.execute(request: request, delegate: delegate)
but I see your point.
Note that you only need
Task has an optional Channel unfortunately, but it's easy to add EventLoop there
Added EventLoop to Task
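A minimal sketch of how this outcome could look from the delegate's side, assuming the task exposes its event loop: a delegate that does not care about back-pressure keeps a dependency-free init and builds an already-succeeded future from the event loop handed to it. `DemoTask` and `ResponseDelegate` below are hypothetical, simplified stand-ins made up for this example; the library's own `HTTPClient.Task` and `HTTPResponseDelegate` have different, larger surfaces.

```swift
import NIO

// Hypothetical, simplified stand-in for a task that carries its event loop.
final class DemoTask {
    let eventLoop: EventLoop

    init(eventLoop: EventLoop) {
        self.eventLoop = eventLoop
    }
}

// Hypothetical, simplified delegate protocol with a non-optional future.
protocol ResponseDelegate {
    func didReceivePart(task: DemoTask, _ buffer: ByteBuffer) -> EventLoopFuture<Void>
}

final class CountingDelegate: ResponseDelegate {
    private(set) var count = 0

    // No event loop group is needed in init: the task supplies the event loop on each call.
    func didReceivePart(task: DemoTask, _ buffer: ByteBuffer) -> EventLoopFuture<Void> {
        self.count += buffer.readableBytes
        // No back-pressure needed here, so signal "ready for the next part" right away.
        return task.eventLoop.makeSucceededFuture(())
    }
}

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let task = DemoTask(eventLoop: group.next())
let delegate = CountingDelegate()

var buffer = ByteBufferAllocator().buffer(capacity: 4)
buffer.writeString("data")

try delegate.didReceivePart(task: task, buffer).wait()
try group.syncShutdownGracefully()
```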
Sorry, a pile more comments. I hope they are useful.
case .string(let string):
    return string.utf8.count
struct Body {
    typealias ChunkProvider = (@escaping (IOData) -> EventLoopFuture<Void>) -> EventLoopFuture<Void>
Maybe we should call this `PartProvider` - for symmetry with `didReceivePart()`.
I wonder if the `typealias` holds its weight actually...
Rewrote without typealias, seems to be fine
    return data.count
case .string(let string):
    return string.utf8.count
struct Body {
Nit: this is really a personal preference (although I think NIO style might mandate this) but I much prefer having explicit access modifiers, and just `extension` instead of `public extension`. I'm not clever enough to remember Swift's access control rules without having to think twice every time...
good idea, fixed, thanks!
yes please, I think there should never be any word in front of `extension`. `public extension` for example is really bad because it means that you can add public functions without the word `public` appearing in a diff and that's an issue for all SemVer stuff.
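The access-control point raised here can be illustrated with a small, self-contained sketch. The type and method names are hypothetical and exist only for this example; what it shows is that members of a `public extension` default to public, whereas in a plain `extension` each member has to state its own access level.

```swift
// Hypothetical type used only for illustration.
public struct Body {
    public init() {}
}

// Members declared here are public by default, although the word `public`
// never appears next to them (and so would not show up in a diff that adds one).
public extension Body {
    func implicitlyPublic() -> String {
        return "public via extension"
    }
}

// Here every member states its own access level.
extension Body {
    public func explicitlyPublic() -> String {
        return "public via modifier"
    }

    // Without a modifier this member stays internal.
    func internalByDefault() -> String {
        return "internal"
    }
}
```

With the second style, any addition to the public API necessarily contains the word `public` in its diff, which is the property the comment above asks for.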
@@ -207,13 +224,15 @@ internal extension URL {

public extension HTTPClient {
    final class Task<Response> {
        let eventLoop: EventLoop
I think this needs to be `public`. If I'm correct the tests are only compiling because you have `@testable` which users won't have...
fixed, thanks!
yeah, if we can it might be good to remove `@testable` from the test files... But I know it's not always that straightforward. Another option might be to have two test suites: one for the public API and one that does some extra internal stuff. But this is not super high priority ofc.
README.md
Outdated
@@ -116,26 +116,32 @@ class CountingDelegate: HTTPResponseDelegate {

    var count = 0

    func didTransmitRequestBody() {
        // this is executed when request is sent, called once
    func didTransmitRequestPart(task: HTTPClient.Task<Response>, _ part: IOData) {
@weissi @ianpartridge @tanner0101 do you think we need a callback when request head is sent as well? Also, what do you think about naming pattern (transmit/sent)?
- I'd probably err on the side of 'more callbacks', @Lukasa can you think of a good use-case for a callback when the head has been sent?
- regarding naming transmit vs. send: I don't mind, think both are fine
- not sure about the `part: IOData` label, `part` is maybe a bit unclear? I would almost make it `_ part: IOData`, don't think the label adds a lot of value there given the type `IOData`? But this is really not super important :)
but it's already `_ part: IOData` :)
whoops :), sorry
Yeah, more callbacks is always better, go for it.
I'm in general in favour of being able to exert back pressure at as many places as possible, yeah.
@weissi @Lukasa @ianpartridge @tanner0101 do you have any additional comments on this PR?
extension HTTPClient {
    public struct Body {
        public var length: Int?
        public var provider: (@escaping (IOData) -> EventLoopFuture<Void>) -> EventLoopFuture<Void>
Using a struct here would give us more flexibility to add / deprecate methods in the future. It would also allow users of the package to add conformances / extensions.
Something like `Body.StreamWriter` or `Body.Provider`:
extension HTTPClient.Body {
    var writer: (StreamWriter) -> EventLoopFuture<Void>
    public struct StreamWriter {
        let closure: (IOData) -> EventLoopFuture<Void>
        public func write(_ data: IOData) -> EventLoopFuture<Void> {
            return self.closure(data)
        }
    }
    public static func stream(length: Int? = nil, _ writer: @escaping (StreamWriter) -> EventLoopFuture<Void>) ->
    ...
    }
}
...
This also reduces the mental load to understand the code a bit.
Here's example usage:
let body: HTTPClient.Body = .stream(length: 50) { stream in
    var request = try! Request(url: "http://localhost:\(httpBin.port)/events/10/1")
    request.headers.add(name: "Accept", value: "text/event-stream")
    let delegate = CopyingDelegate { part in
        stream.write(.byteBuffer(part))
    }
    return httpClient.execute(request: request, delegate: delegate).future
}
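For readers who want to try the shape of this suggestion in isolation, here is a self-contained sketch. It is an assumption-laden simplification, not the code that was merged: `Body` is declared at the top level rather than inside `HTTPClient`, the stored `writer` property sits in the struct itself, and the "client" is replaced by a hypothetical writer that only counts bytes.

```swift
import NIO

// Hypothetical, simplified version of the proposed types.
struct Body {
    struct StreamWriter {
        let closure: (IOData) -> EventLoopFuture<Void>

        // Writes one chunk; the future completes when the chunk has been handled.
        func write(_ data: IOData) -> EventLoopFuture<Void> {
            return self.closure(data)
        }
    }

    var length: Int?
    var writer: (StreamWriter) -> EventLoopFuture<Void>

    static func stream(length: Int? = nil, _ writer: @escaping (StreamWriter) -> EventLoopFuture<Void>) -> Body {
        return Body(length: length, writer: writer)
    }
}

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let eventLoop = group.next()

var chunk = ByteBufferAllocator().buffer(capacity: 3)
chunk.writeString("abc")

// The body writes two chunks; the second write starts only after the first has completed.
let body = Body.stream(length: 6) { writer in
    return writer.write(.byteBuffer(chunk)).flatMap {
        writer.write(.byteBuffer(chunk))
    }
}

// Stand-in for the client: instead of sending the data, it counts the bytes it was given.
var written = 0
let countingWriter = Body.StreamWriter(closure: { data in
    written += data.readableBytes
    return eventLoop.makeSucceededFuture(())
})

try body.writer(countingWriter).wait()
try group.syncShutdownGracefully()
```

Because `StreamWriter` is a struct rather than a bare closure type, methods can later be added to it (or deprecated) without changing the signature of `stream(length:_:)`, which is the flexibility argued for above.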
good idea, refactored, thanks!
I think this looks good @artemredkin.
looks good
Addressing comments in the client proposal discussion, I would like to propose the following change:
- `HTTPResponseDelegate.didReceivePart` now returns an optional future that could be used to indicate to the client that reads should be stopped until this future is resolved (backpressure)
- `HTTPClient.Body` is now a struct with optional length and a callback for upload streaming. In addition, there are static methods to keep the API the same as before.