-
Notifications
You must be signed in to change notification settings - Fork 851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use zero-byte reads in StreamCopier #1415
Conversation
|
||
private static async ValueTask<(StreamCopyResult, Exception?)> CopyAsync(Stream input, Stream output, StreamCopierTelemetry? telemetry, ActivityCancellationTokenSource activityToken, CancellationToken cancellation) | ||
{ | ||
var buffer = ArrayPool<byte>.Shared.Rent(DefaultBufferSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting trade-off. This optimizes for the common case where there is data flowing, but would not get the zero-byte-read benefits until after the first read completes.
How does the perf compare to the alternative of always doing a zero-byte-read first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The differences are within the margin of error.
It may be worth just always doing the zero-byte read first to improve the worst-case memory consumption.
On both HttpClient and AspNetCore's side, there is room to special-case zero-byte reads and save a few branches (e.g. skip doing a no-op zero-byte slice+copy+advance on underlying buffers), especially if we know that zero-byte reads are this common (1:1 with regular reads). But for YARP this is in the noise range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've still rented the buffer in advance. You'd need to remove this line to take advantage of the zero-byte-read.
edit oh, if the zbr does not complete sync then you release the buffer. That's probably an adequate pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may end up renting the buffer and releasing it right away if no data is available on the first read. The overhead of that is negligible (also shouldn't be the common case) and it simplifies the logic for all the other cases.
Fixes #1325
I moved all the telemetry-related logic into a separate
StreamCopierTelemetry
class to both simplify the code and reduce the size of theCopyAsync
state machine. Also removed the try-finally blocks to minimize the overhead.This change amounts to a 5.6% RPS improvement in our base (http-http 100-byte) benchmark.
About 4.6% of which comes the fact we are no longer clearing the entire rented ArrayPool buffer on both sides.
The extra 1% comes from the simplified
CopyAsync
implementation, even after you account for the extra work we are doing to issue the zero-byte read.Memory-wise, for a persistent idle connection (e.g. WebSockets), we avoid holding onto the 64k buffer on the request content side. With support from runtime, we also avoid holding the 64k buffer on the response content side + the 32k SslStream buffer.
More detailed numbers here: #1325 (comment).