Use zero-byte reads in StreamCopier #1415

Merged
merged 5 commits into from
Nov 30, 2021

Conversation

MihaZupan
Member

Fixes #1325

I moved all the telemetry-related logic into a separate StreamCopierTelemetry class to both simplify the code and reduce the size of the CopyAsync state machine. Also removed the try-finally blocks to minimize the overhead.

This change amounts to a 5.6% RPS improvement in our base (http-http 100-byte) benchmark.
About 4.6% of that comes from the fact that we are no longer clearing the entire rented ArrayPool buffer on both sides.
The remaining 1% comes from the simplified CopyAsync implementation, even after you account for the extra work we are doing to issue the zero-byte read.
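
To make the buffer-clearing cost concrete, here is a minimal sketch of the two ways a rented buffer can be handed back. The class and method names are hypothetical, and the description above does not say what (if anything) replaced the full clear, so the partial clear is only one possible alternative, not necessarily what this PR does.

```csharp
using System;
using System.Buffers;

public static class PooledBufferClearing
{
    // Clears the whole rented array on return, no matter how little of it was used.
    // For a 64k buffer and a 100-byte payload, almost all of that work is wasted.
    public static void ReturnAndClearAll(byte[] buffer)
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
    }

    // Assumption for illustration: wipe only the bytes that actually held data,
    // then return the array without asking the pool to clear it.
    public static void ReturnAndClearUsed(byte[] buffer, int bytesUsed)
    {
        buffer.AsSpan(0, bytesUsed).Clear();
        ArrayPool<byte>.Shared.Return(buffer);
    }
}
```

With small payloads the second form touches only a few bytes per request, which is consistent with most of the measured gain coming from no longer clearing the full buffer.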

Memory-wise, for a persistent idle connection (e.g. WebSockets), we avoid holding onto the 64k buffer on the request content side. With support from the runtime, we also avoid holding the 64k buffer on the response content side + the 32k SslStream buffer.
More detailed numbers here: #1325 (comment).
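
For readers new to the technique, the following is a minimal sketch of a copy loop built around zero-byte reads. It is not the StreamCopier code from this PR: the class name, method name, and `bufferSize` parameter are hypothetical, and it uses the simplest variant, which waits on a zero-byte read before renting a buffer at all.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class ZeroByteReadSketch
{
    // Hypothetical helper: copies input to output while holding a pooled
    // buffer only when there is data to move. Returns the bytes copied.
    public static async Task<long> CopyWithZeroByteReadsAsync(
        Stream input, Stream output, int bufferSize = 65536, CancellationToken cancellation = default)
    {
        long total = 0;
        while (true)
        {
            // Wait for data without holding a buffer. The result (0) is ignored
            // and must not be treated as end of stream. Streams that do not
            // special-case zero-byte reads complete immediately, which saves
            // no memory but is still correct.
            await input.ReadAsync(Memory<byte>.Empty, cancellation);

            var buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
            try
            {
                var read = await input.ReadAsync(buffer.AsMemory(), cancellation);
                if (read == 0)
                {
                    return total; // a real read returning 0 means end of stream
                }

                await output.WriteAsync(buffer.AsMemory(0, read), cancellation);
                total += read;
            }
            finally
            {
                ArrayPool<byte>.Shared.Return(buffer);
            }
        }
    }
}
```

On an idle connection this loop parks on the zero-byte read, so no pooled buffer is held while nothing is flowing. The cost is one extra read call per chunk of data.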

MihaZupan added this to the YARP 1.1.0 milestone on Nov 29, 2021
MihaZupan requested a review from Tratcher as a code owner on November 29, 2021 12:59
src/ReverseProxy/Forwarder/StreamCopier.cs (outdated)

private static async ValueTask<(StreamCopyResult, Exception?)> CopyAsync(Stream input, Stream output, StreamCopierTelemetry? telemetry, ActivityCancellationTokenSource activityToken, CancellationToken cancellation)
{
    var buffer = ArrayPool<byte>.Shared.Rent(DefaultBufferSize);

Member

Interesting trade-off. This optimizes for the common case where there is data flowing, but would not get the zero-byte-read benefits until after the first read completes.

How does the perf compare to the alternative of always doing a zero-byte-read first?

Member Author

MihaZupan, Nov 30, 2021

The differences are within the margin of error.

It may be worth just always doing the zero-byte read first to improve the worst-case memory consumption.

On both HttpClient and AspNetCore's side, there is room to special-case zero-byte reads and save a few branches (e.g. skip doing a no-op zero-byte slice+copy+advance on underlying buffers), especially if we know that zero-byte reads are this common (1:1 with regular reads). But for YARP this is in the noise range.

Member

Tratcher, Nov 30, 2021

You've still rented the buffer in advance. You'd need to remove this line to take advantage of the zero-byte-read.

Edit: oh, if the zero-byte read does not complete synchronously, then you release the buffer. That's probably an adequate pattern.

Member Author

We may end up renting the buffer and releasing it right away if no data is available on the first read. The overhead of that is negligible (also shouldn't be the common case) and it simplifies the logic for all the other cases.
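
The pattern discussed in this thread (rent up front, then release the buffer only if the zero-byte read does not complete synchronously) can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `RentFirstCopySketch` and `BufferSize` are hypothetical names, with `BufferSize` standing in for the `DefaultBufferSize` constant in the snippet above, and telemetry and error reporting are left out.

```csharp
#nullable enable
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class RentFirstCopySketch
{
    // Hypothetical stand-in for DefaultBufferSize (64k, per the PR description).
    private const int BufferSize = 65536;

    public static async Task CopyAsync(Stream input, Stream output, CancellationToken cancellation = default)
    {
        byte[]? buffer = ArrayPool<byte>.Shared.Rent(BufferSize);
        try
        {
            while (true)
            {
                var zeroByteRead = input.ReadAsync(Memory<byte>.Empty, cancellation);
                if (zeroByteRead.IsCompletedSuccessfully)
                {
                    // Data is (probably) already available: keep the buffer.
                    _ = zeroByteRead.Result;
                }
                else
                {
                    // No data yet: give the buffer back while we wait.
                    var toReturn = buffer;
                    buffer = null; // avoid a double return if the await throws
                    ArrayPool<byte>.Shared.Return(toReturn);

                    await zeroByteRead;

                    buffer = ArrayPool<byte>.Shared.Rent(BufferSize);
                }

                var read = await input.ReadAsync(buffer.AsMemory(), cancellation);
                if (read == 0)
                {
                    return; // end of stream
                }

                await output.WriteAsync(buffer.AsMemory(0, read), cancellation);
            }
        }
        finally
        {
            if (buffer is not null)
            {
                ArrayPool<byte>.Shared.Return(buffer);
            }
        }
    }
}
```

When data is flowing, the zero-byte read completes synchronously and the same buffer is reused across iterations. Only an idle stream pays for the extra return and rent.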

@sebastienros
Contributor

It also has an important effect on big payloads (100K in the following chart). The effect is much larger on .NET 7 than on .NET 6.

[Image: chart of the 100K-payload results on .NET 6 and .NET 7]


Successfully merging this pull request may close these issues.

Enable zero-byte reads in streaming & WebSocket scenarios
3 participants