Reduce the cost of flushing #265
Conversation
Force-pushed 643e3e7 to 0bedded
@@ -198,9 +198,14 @@ class SimpleClientServerFramePayloadStreamTests: XCTestCase {
    /// Establish a basic HTTP/2 connection.
    func basicHTTP2Connection(clientSettings: HTTP2Settings = nioDefaultSettings,
                              serverSettings: HTTP2Settings = nioDefaultSettings,
                              maximumBufferedControlFrames: Int = 10000,
The changes in this file make these tests run faster: I was sick of waiting for them.
##
## SPDX-License-Identifier: Apache-2.0
##
##===----------------------------------------------------------------------===##
This is a helper script I've added to do broader-scale cachegrind comparison benchmarking.
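The helper script itself is not reproduced on this page, but a cachegrind comparison run generally follows this shape. This is a hypothetical sketch, not the actual script from this PR: the binary paths are placeholders, and it assumes valgrind (with its `cg_diff` tool) is installed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of cachegrind comparison benchmarking, not the
# helper script added in this PR. Runs a baseline and a candidate
# binary under valgrind's cachegrind tool, then diffs the per-function
# instruction counts.
set -euo pipefail

BASELINE_BIN="${1:?usage: $0 <baseline-binary> <candidate-binary>}"
CANDIDATE_BIN="${2:?usage: $0 <baseline-binary> <candidate-binary>}"

valgrind --tool=cachegrind --cachegrind-out-file=baseline.out "$BASELINE_BIN"
valgrind --tool=cachegrind --cachegrind-out-file=candidate.out "$CANDIDATE_BIN"

# cg_diff reports per-function deltas; a negative Ir count means the
# candidate executed fewer instructions than the baseline.
cg_diff baseline.out candidate.out
```

Because cachegrind counts instructions rather than measuring wall-clock time, runs like this are deterministic enough to compare small optimisations reliably.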
Force-pushed bc967c0 to d6d58fe
I think this looks good.
if oldWindowSize <= 0 && self.currentWindowSize > 0 {
    // Window opened. We can now write.
    return .changed(newValue: true)
Just to check I understand what's going on here: in the case that the new window size isn't large enough for all the data we have buffered, we still say we're writable, but this is in the sense of "we have data that can be written on the connection" rather than "we have enough space for the stream channel to produce more writes".
Yes, this is correct. The writability notion here is entirely regarding the HTTP/2 flow control window, and has nothing to do with the stream channel itself. Nothing in this object handles stream channel logic in any way.
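The semantics agreed on here (writability tracks only the flow control window, flipping when the window crosses zero) can be sketched as a minimal standalone type. The names below are illustrative, not swift-nio-http2's actual API:

```swift
// Hypothetical sketch of flow-control-based writability, not the actual
// swift-nio-http2 implementation. "Writable" here means "there is window
// space to write buffered data on the connection", not "the stream
// channel may produce more writes".
enum WritabilityState: Equatable {
    case notChanged
    case changed(newValue: Bool)
}

struct StreamFlowControlState {
    private(set) var currentWindowSize: Int

    init(initialWindowSize: Int) {
        self.currentWindowSize = initialWindowSize
    }

    // Apply a WINDOW_UPDATE (positive delta) or account for written
    // bytes (negative delta), reporting whether writability flipped.
    mutating func updateWindow(by delta: Int) -> WritabilityState {
        let oldWindowSize = self.currentWindowSize
        self.currentWindowSize += delta

        if oldWindowSize <= 0 && self.currentWindowSize > 0 {
            // Window opened. We can now write.
            return .changed(newValue: true)
        } else if oldWindowSize > 0 && self.currentWindowSize <= 0 {
            // Window closed. Stop writing until it reopens.
            return .changed(newValue: false)
        }
        return .notChanged
    }
}
```

Note that the amount of buffered data never enters the calculation, which matches the reviewer's reading: a one-byte window on a stream with a megabyte buffered is still "writable".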
self.dataBuffer.failAllWrites(error: error)
}

mutating func nextWrite(maxSize: Int) -> (DataBuffer.BufferElement, WritabilityState) {
IIUC this should only be called if we're marked, worth a comment/assertion to that end?
Yeah, seems like a good idea to me.
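The assertion suggested here can be sketched roughly as below. This is a hypothetical simplified buffer, not the actual `DataBuffer` from the PR; the marking scheme and element type are stand-ins:

```swift
// Hypothetical sketch of guarding nextWrite with a debug assertion that
// the buffer is marked (i.e. has flushed writes pending). Illustrative
// names, not the actual swift-nio-http2 DataBuffer.
struct DataBuffer {
    typealias BufferElement = [UInt8]

    private var pendingWrites: [BufferElement] = []
    private(set) var isMarked: Bool = false

    mutating func append(_ element: BufferElement) {
        self.pendingWrites.append(element)
    }

    mutating func mark() {
        self.isMarked = !self.pendingWrites.isEmpty
    }

    // Must only be called when the buffer is marked: callers are
    // expected to check `isMarked` first, and the assertion catches
    // misuse in debug builds without costing anything in release.
    mutating func nextWrite(maxSize: Int) -> BufferElement {
        assert(self.isMarked, "nextWrite(maxSize:) called on an unmarked buffer")
        var element = self.pendingWrites.removeFirst()
        if element.count > maxSize {
            // Split the write: return the prefix, keep the rest buffered.
            let remainder = Array(element[maxSize...])
            self.pendingWrites.insert(remainder, at: 0)
            element = Array(element[..<maxSize])
        }
        if self.pendingWrites.isEmpty {
            self.isMarked = false
        }
        return element
    }
}
```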
Force-pushed d6d58fe to 27e8d6d
Motivation:
The OutboundFlowControlBuffer is responsible for managing DATA and
HEADERS frames for all streams in order to ensure that we obey HTTP/2
flow control rules, even when users don't. This object is therefore on
the data path for the vast majority of frames, and is extremely
performance sensitive.
Unfortunately, the current algorithm has some nasty performance cliffs.
Right now it has the awkward property of making flushes take time linear
in the number of streams on the connection, not the number of streams
that have had frames sent on them. The result is that connections with
large numbers of concurrent streams see extremely poor performance when
flushing, a problem that is exacerbated if other parts of the HTTP/2
stack fail to coalesce those flushes.
We can clean this up in a few ways, but the easiest way is to just keep
track of which streams have done I/O since the last flush. While we're
there, we can make a few other optimisations to ensure we only flush
data on streams we are actually processing, to further reduce the risk
of wasting our time processing unnecessary work.
Modifications:

- Add a Set to keep track of where we've done writes since the last
  flush.
- Streams whose flow control window goes to zero stop being writable.
- Cheaper and more accurate calculation of zero-length writes, which
  avoids the possibility of awkward stalls with zero-length DATA frames
  and END_STREAM.
- Stop doing math about the number of flushed bytes: we don't use that
  information any more.
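The Set-based tracking described above can be sketched as a tiny standalone type. This is an illustrative reduction, not the actual `OutboundFlowControlBuffer`: the stream identifier type and method names are stand-ins.

```swift
// Hypothetical sketch of the flush-cost fix: record which streams had
// writes since the last flush in a Set, so flushing visits only those
// streams rather than every stream on the connection.
struct FlushTracker {
    typealias StreamID = Int

    private var streamsWithPendingWrites: Set<StreamID> = []

    mutating func streamWrote(_ streamID: StreamID) {
        self.streamsWithPendingWrites.insert(streamID)
    }

    // Flushing is now O(streams that wrote), not O(all streams);
    // keepingCapacity avoids reallocating the Set on every flush cycle.
    mutating func flush(_ emit: (StreamID) -> Void) {
        for streamID in self.streamsWithPendingWrites {
            emit(streamID)
        }
        self.streamsWithPendingWrites.removeAll(keepingCapacity: true)
    }
}
```

With thousands of idle streams and only a handful writing per flush, iterating this Set instead of the full stream map is where the benchmark win in the Results section comes from.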
Results:
An approximately 10% performance gain on the 100 concurrent streams
server-only benchmark. The relative performance gain shifts with the
number of active streams, but it should be positive in all cases: no regression
is observed on the 1 concurrent stream equivalent test.