Investigate performance of large writes on Kestrel #31110

Open
davidfowl opened this issue Mar 21, 2021 · 4 comments
Labels: area-networking, Perf

@davidfowl
Member

davidfowl commented Mar 21, 2021

When testing writing a 1 MB chunked response, we spend a considerable amount of time copying buffers (this profile was taken on Windows):

image

The above shows:

  • 20% of the time copying into kernel buffers
  • 10% of the time copying data from the user's buffer into Kestrel's buffer using 4K blocks
  • 2% of the time in GetSpan
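
To put the 4K-block figure in perspective, here is a minimal sketch of a block-wise copy of a 1 MB payload. It is not Kestrel's actual code: the `blocks` list and the freshly allocated arrays are assumptions for illustration (a real server would rent blocks from a memory pool).

```csharp
using System;
using System.Collections.Generic;

const int BlockSize = 4096;                  // 4K blocks, as in the profile above
byte[] userBuffer = new byte[1024 * 1024];   // 1 MB response body

var blocks = new List<byte[]>();             // hypothetical stand-in for the server's buffers
ReadOnlySpan<byte> remaining = userBuffer;

while (!remaining.IsEmpty)
{
    int n = Math.Min(BlockSize, remaining.Length);
    byte[] block = new byte[BlockSize];      // illustration only; a pool would be used in practice
    remaining[..n].CopyTo(block);            // one copy per block
    blocks.Add(block);
    remaining = remaining[n..];
}

Console.WriteLine(blocks.Count);             // 256
```

A single 1 MB write therefore turns into 256 separate block copies at this layer alone, before any TLS or kernel copies further down the stack.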

Here's what the flow looks like for HTTP/1.1 connections (each list runs from the socket at the bottom of the stack up to the application's write call):

HTTPS

  • Socket.SendAsync(...) -> copies buffers into kernel
  • PipeWriter.WriteAsync(...) -> copies buffer then flushes
  • SslStream.WriteAsync(...) -> copies and encrypts individual buffers
  • StreamPipeWriter.FlushAsync() -> copies buffer then flushes (copy removed in .NET 6)
  • ConcurrentPipeWriter.WriteAsync() (usually passthrough)
  • Http1OutputProducer.WriteAsync(buffer)
  • HttpResponse.Body.WriteAsync(buffer)

HTTP

  • Socket.SendAsync(...) -> copies buffers into kernel buffer
  • PipeWriter.WriteAsync(...) -> copies buffer then flushes
  • ConcurrentPipeWriter.WriteAsync() (usually passthrough)
  • Http1OutputProducer.WriteAsync(buffer)
  • HttpResponse.Body.WriteAsync(buffer)

Packet sizes on the wire look good for both TLS and non-TLS connections:

Non-TLS

image

TLS

image

Code sample:

using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

var host = new HostBuilder().ConfigureWebHostDefaults(host =>
{
    host.Configure(app =>
    {
        var s = Encoding.UTF8.GetBytes(new string('A', 1024 * 512)).AsMemory();

        async Task Hello(HttpContext context)
        {
            await context.Response.StartAsync();

            var source = s;

            // Copy the first part of the payload straight into memory owned by the response PipeWriter.
            var memory = context.Response.BodyWriter.GetMemory();

            source[..memory.Length].CopyTo(memory);

            context.Response.BodyWriter.Advance(memory.Length);

            source = source[memory.Length..];

            // Write the rest through the Stream API. With Content-Length left unset, the response is chunked.
            // context.Response.ContentLength = s.Length;
            await context.Response.Body.WriteAsync(source);
        }

        app.UseRouting();

        app.UseEndpoints(routes =>
        {
            routes.MapGet("/", Hello);
        });
    });
}).Build();

await host.RunAsync();
@ghost

ghost commented Mar 22, 2021

Thanks for contacting us.
We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@roji
Member

roji commented Mar 27, 2021

var s = Encoding.UTF8.GetBytes(new string('A', 1024 * 512)).AsMemory();

Is this a typical way to write string responses though? I mean, given a user string, one can encode and write at the same time using Encoding.GetBytes. In other words, one of the copies above is only really needed if writing binary data in the first place, no?
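
For illustration, a minimal sketch of encoding and writing in one step. It assumes an `ArrayBufferWriter<byte>` as a stand-in for the response's `PipeWriter` (both implement `IBufferWriter<byte>`) and uses the `Encoding.GetBytes(ReadOnlySpan<char>, IBufferWriter<byte>)` extension method from `System.Text.EncodingExtensions`:

```csharp
using System;
using System.Buffers;
using System.Text;

// Stand-in for context.Response.BodyWriter (assumption for this sketch).
var writer = new ArrayBufferWriter<byte>();
string text = new string('A', 1024 * 512);

// Encodes straight into spans obtained from the writer,
// so no intermediate byte[] is allocated by the application.
long written = Encoding.UTF8.GetBytes(text.AsSpan(), writer);

Console.WriteLine(written == writer.WrittenCount); // True
```

This only removes the application-side buffer; the copies further down the stack (TLS, kernel) are unaffected.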

@davidfowl
Member Author

Large buffered responses are common because serializers are mostly synchronous (with the exception of System.Text.Json and maybe some others). This results in application code producing a large response that gets buffered directly in application code, then written to the response. There are two ways to do this:

  1. Write using the server's memory (PipeWriter/IBufferWriter)
  2. Write the buffer directly to the response

In the second case, we end up copying at multiple layers in the stack, which is something I'd like to see us reduce.
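
The two options can be sketched side by side. This is a simplified illustration, not Kestrel code: `ArrayBufferWriter<byte>` stands in for the response `PipeWriter`, and `WriteDirect` / `WriteBuffered` are hypothetical helper names.

```csharp
using System;
using System.Buffers;

const int Total = 1024 * 512;

// ArrayBufferWriter<byte> stands in for the response PipeWriter (assumption for this sketch).
var direct = new ArrayBufferWriter<byte>();
var buffered = new ArrayBufferWriter<byte>();

WriteDirect(direct, Total);
WriteBuffered(buffered, Total);

Console.WriteLine(direct.WrittenCount == buffered.WrittenCount); // True

// Option 1: produce the payload directly into memory owned by the writer.
static void WriteDirect(IBufferWriter<byte> writer, int total)
{
    int remaining = total;
    while (remaining > 0)
    {
        Span<byte> span = writer.GetSpan();        // writer decides how much space to hand out
        int n = Math.Min(span.Length, remaining);
        span[..n].Fill((byte)'A');                 // "serialize" in place, no extra copy
        writer.Advance(n);
        remaining -= n;
    }
}

// Option 2: build the whole payload in application memory, then hand it over.
static void WriteBuffered(IBufferWriter<byte> writer, int total)
{
    byte[] payload = new byte[total];              // application-owned buffer
    Array.Fill(payload, (byte)'A');
    ReadOnlySpan<byte> bytes = payload;
    writer.Write(bytes);                           // copies the payload into the writer's memory
}
```

Both produce the same bytes, but option 2 pays for an extra application-side buffer plus a copy into the writer, which is the cost this issue is about.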

@ghost

ghost commented Aug 11, 2021

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to have better expectations regarding different types of issues, you can read our Triage Process.

amcasey added the area-networking label (Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions) and removed the area-runtime label on Aug 24, 2023