
HttpConnection roots user-provided write buffer in AwaitableSocketAsyncEventArgs #67096

Closed
Euclidite opened this issue Mar 24, 2022 · 25 comments

@Euclidite

Euclidite commented Mar 24, 2022

Description

My scenario:

  • I have thousands of small files that need to be uploaded individually to a REST endpoint
  • If I load them all into memory and upload them, there is no issue
  • Unfortunately, they are provided to me one at a time by an external producer, and I have that producer fire off an event when a file is ready for upload
  • I have found that uploading from the delegate creates a memory leak. I've tried wrapping the upload code in a Task.Run inside the delegate, but that leaks as well
  • I would expect the code to behave the same inside an event delegate as it does outside of one

Code to reproduce:

Simple consuming REST endpoint (this can be any endpoint that accepts the payload) (Use deno run --allow-net consume-server.js to run)

import { Application } from "https://deno.land/x/oak/mod.ts";

const app = new Application();

app.use(async (ctx) => {
    // Accept and add some latency to simulate a remote network
    if (ctx.request.method === 'POST' && ctx.request.url.pathname === '/api/bytes') {
        const result = ctx.request.body({ type: 'bytes' });
        await new Promise(res => setTimeout(res, 50));
        const val = await result.value;
        ctx.response.body = `${val.length} bytes received`;
    } else {
        ctx.response.body = "Hello World!";
    }
});

await app.listen({ port: 1777 });

Uploading code:

using System.Net;
using System.Net.Http.Headers;

HttpClient client = new HttpClient {
    BaseAddress = new Uri("http://localhost:1777"),
    DefaultRequestVersion = HttpVersion.Version20,
};

// Used to throttle the upload rate (not overload the destination server & avoid port exhaustion)
SemaphoreSlim uploadLock = new(25);
ServicePointManager.DefaultConnectionLimit = 25;

async Task DoUpload(MemoryStream item)
{
    await uploadLock.WaitAsync().ConfigureAwait(false);
    var buffer = item.GetBuffer();
    using var content = new ByteArrayContent(buffer, 0, (int)item.Length);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    content.Headers.ContentLength = item.Length;
    using var request = new HttpRequestMessage(HttpMethod.Post, "/api/bytes")
    {
        Content = content,
        Version = new Version(2, 0)
    };

    // ResponseHeadersRead because we don't care about the response body - literally just the status code
    var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);
    if (response.IsSuccessStatusCode)
    {
        Console.WriteLine("Success!");
    }
    else
    {
        Console.WriteLine("Something went wrong!");
    }

    response.Dispose();
    await item.DisposeAsync().ConfigureAwait(false);
    uploadLock.Release();
}

async Task DoDummyUpload(int sizeKb) {
    var sizeBytes = 1024 * sizeKb;
    var item = new MemoryStream();
    await item.WriteAsync(new byte[sizeBytes], 0, sizeBytes).ConfigureAwait(false);
    await DoUpload(item);
}

async Task<int> Main(string[] args) {
    var producer = new DummyProducer();
    producer.Received += async (_, sizeKb) => {
        // Upload from within a delegate - source of the leak
        // If this were run outside of a delegate it would not leak
        await DoDummyUpload(sizeKb);
    };

    // Simple prompt for testing - enter "upload 1000 64" to see a spike in memory that won't go down
    Console.WriteLine("Waiting for input...");
    while (true)
    {
        var line = await Console.In.ReadLineAsync().ConfigureAwait(false);
        var cmd = line.Split(' ')[0];

        if (cmd == "exit") break;

        if (cmd == "upload") {
            (int numRequests, int payloadSizeKb) = (int.Parse(line.Split(' ')[1]), int.Parse(line.Split(' ')[2]));
            
            Console.WriteLine("Uploading...");
            foreach (var i in Enumerable.Range(0, numRequests))
            {
                producer.emit(payloadSizeKb);
            }
        }
    }

    return 0;
}

await Main(args).ConfigureAwait(false);

class DummyProducer
{
    public event EventHandler<int> Received = delegate { };

    public void emit(int sizeKb) {
        Received.Invoke(this, sizeKb);
    }
}

Configuration

  • Which version of .NET is the code running on? .NET 6.0
  • What OS and version, and for Linux, what distro? Windows 10 (1909 & 21H1)
  • What is the architecture (x64, x86, ARM, ARM64)? x64
  • Do you know whether it is specific to that configuration? Not that I'm aware of
  • If you're using Blazor, which web browser(s) do you see this issue in? N/A

Regression?

Unsure if regression.

Other information

  • No known workarounds other than not uploading in a delegate or any short-lived thread
  • Taking a snapshot of the memory, I think the leak involves a ConcurrentStack+Node<Byte[]> object. VS indicates that a cycle was detected.
@rzikm rzikm transferred this issue from dotnet/core Mar 24, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net.Http untriaged New issue has not been triaged by the area owner labels Mar 24, 2022
@ghost

ghost commented Mar 24, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: Euclidite
Assignees: -
Labels:

area-System.Net.Http, untriaged

Milestone: -

@MihaZupan
Member

How are you determining that a leak is happening?

Can you share what objects are kept in memory when a leak does occur?
E.g. screenshot of numbers & sizes from a dump.

Taking a snapshot of the memory, I think the leak involves a ConcurrentStack+Node<Byte[]> object.

To my knowledge, there is no usage of ConcurrentStack within networking code.

@Euclidite
Author

When I run the above, I wait until the uploads complete; memory increases as expected while the data is queued for processing. However, after the uploads complete it never goes down (I've waited hours). With the test data above it sits around 260 MB, but increase the test numbers and it can easily hit 1 GB+ of memory that never gets released.

Looking at the memory snapshot:
(memory snapshot screenshot)

When I uncheck "Just My Code", the object with the most memory usage is a byte array. My feeling is that it might not be the networking code itself (it works fine when not in a delegate), but perhaps the async state machine?

@Euclidite
Author

Some additional info:

Process memory when performing the upload from the delegate (the upload completed pretty early and I let it sit there for a while):
(screenshot)

Meanwhile, when I don't await from within a delegate (I await from the main context? not sure what the verbiage is here):
(screenshot)

You can see a GC event a little after each upload completes - this way the application runs at ~80 MB.

@huoyaoyuan
Member

A brief search in https://source.dot.net/#System.Collections.Concurrent/System/Collections/Concurrent/ConcurrentStack.cs,de91cdef3f389289,references shows that ConcurrentStack<T> is only used by JSON serialization.

Your memory snapshot includes more than the minimal repro (Thrift.Protocol etc.). Please confirm where the ConcurrentStack instances come from.

@Euclidite
Author

Sorry about that! I had run the reproduction off our main app repo - here's the same info using just the minimal reproduction. I used upload 1000 500 to simulate a large data set. After running this, the memory sits at 698.9 MB (via Task Manager).

(memory snapshot screenshots)

@Euclidite
Author

Another method of reproduction using Task.Run - this results in the exact same memory buildup and memory snapshot:
(screenshots)

using System.Net;
using System.Net.Http.Headers;

HttpClient client = new HttpClient {
    BaseAddress = new Uri("http://localhost:1777"),
    DefaultRequestVersion = HttpVersion.Version20,
};

// Used to throttle the upload rate (not overload the destination server & avoid port exhaustion)
SemaphoreSlim uploadLock = new(25);
ServicePointManager.DefaultConnectionLimit = 25;

async Task DoUpload(MemoryStream item)
{
    await uploadLock.WaitAsync().ConfigureAwait(false);
    var buffer = item.GetBuffer();
    using var content = new ByteArrayContent(buffer, 0, (int)item.Length);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    content.Headers.ContentLength = item.Length;
    using var request = new HttpRequestMessage(HttpMethod.Post, "/api/bytes")
    {
        Content = content,
        Version = new Version(2, 0)
    };

    // ResponseHeadersRead because we don't care about the response body - literally just the status code
    var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);
    if (response.IsSuccessStatusCode)
    {
        Console.WriteLine("Success!");
    }
    else
    {
        Console.WriteLine("Something went wrong!");
    }

    response.Dispose();
    await item.DisposeAsync().ConfigureAwait(false);
    uploadLock.Release();
}

async Task DoDummyUpload(int sizeKb) {
    var sizeBytes = 1024 * sizeKb;
    var item = new MemoryStream();
    await item.WriteAsync(new byte[sizeBytes], 0, sizeBytes).ConfigureAwait(false);
    await DoUpload(item);
}

async Task<int> Main(string[] args) {
    // Simple prompt for testing - enter "upload 1000 64" to see a spike in memory that won't go down
    Console.WriteLine("Waiting for input...");
    while (true)
    {
        var line = await Console.In.ReadLineAsync().ConfigureAwait(false);
        var cmd = line.Split(' ')[0];

        if (cmd == "exit") break;

        if (cmd == "upload") {
            (int numRequests, int payloadSizeKb) = (int.Parse(line.Split(' ')[1]), int.Parse(line.Split(' ')[2]));

            Console.WriteLine("Uploading...");
            foreach (var i in Enumerable.Range(0, numRequests))
            {
                // Fire and forget Task so that we can do multiple concurrent uploads as the application runs
                Task.Run(async () => await DoDummyUpload(payloadSizeKb));
            }
        }
    }

    return 0;
}

await Main(args).ConfigureAwait(false);

@rzikm
Member

rzikm commented Mar 25, 2022

I tried it locally and on my machine it tops out at around 200 MB. However, I noticed a small problem: if, for example, you don't start the server and the call to

var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);

throws an exception, then no new tasks are allowed to run (SemaphoreSlim.Release is never called) and you get lots of data rooted at the SemaphoreSlim, since the pending uploads are essentially waiting in its queue (see below):

> gcroot 000002ce8df0c150
Thread acb4:
    000000EC18F7DE90 00007FFC4A993CC2 Program+<>c__DisplayClass0_0+<<<Main>$>g__Main|2>d.MoveNext() [C:\source\DotnetSandbox\Program.cs @ 54]
        rbp+38: 000000ec18f7dec8
            ->  000002CE88F54A38 Program+<>c__DisplayClass0_1
            ->  000002CE8000D548 Program+<>c__DisplayClass0_0
            ->  000002CE80017640 System.Threading.SemaphoreSlim
            ->  000002CE802C1460 System.Threading.SemaphoreSlim+TaskNode
           [repeated hundreds of times]
            ->  000002CE8DFE00A0 System.Threading.SemaphoreSlim+TaskNode
            ->  000002CE8DFE40A0 System.Threading.SemaphoreSlim+TaskNode
            ->  000002CE8DFEA070 System.Threading.SemaphoreSlim+TaskNode
            ->  000002CE8DFEA0C8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<>c__DisplayClass0_0+<<<Main>$>g__DoUpload|0>d, DotnetSandbox]]
            ->  000002CE8DFEA128 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<>c__DisplayClass0_0+<<<Main>$>g__DoDummyUpload|1>d, DotnetSandbox]]
            ->  000002CE8DCA0700 Program+<>c__DisplayClass0_0+<<<Main>$>g__DoDummyUpload|1>d
            ->  000002CE8DCA0750 System.IO.MemoryStream
            ->  000002CE8DF0C150 System.Byte[]

@MihaZupan
Member

I believe the issue you are seeing is that the write buffer is not cleared after we write to the socket. The byte[]s continue to be rooted by the AwaitableSocketAsyncEventArgs.

That means the huge byte[]s won't be reclaimed until the connection in the pool expires, or until something else writes to the connection.

In your example, if you follow the upload 1000 500 with an upload 1000 1 and force a GC, you should see the memory being reclaimed again.

In general, you should avoid allocating such buffers and instead stream the data via StreamContent whenever possible.

SemaphoreSlim uploadLock = new(25);
ServicePointManager.DefaultConnectionLimit = 25;

ServicePointManager is no longer used by HttpClient in .NET; setting this property does nothing.
Instead of doing your own locking, set MaxConnectionsPerServer on the handler:

HttpClient client = new HttpClient(new HttpClientHandler
{
    MaxConnectionsPerServer = 25
});
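Putting the two suggestions above together, a minimal sketch might look like the following. Hedged: the endpoint and limit of 25 come from the repro, and `UploadAsync` is a hypothetical helper introduced here for illustration, not part of the original code.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Sketch only: one handler-configured client plus streamed content,
// assuming the repro's local endpoint and a limit of 25 connections.
using var client = new HttpClient(new HttpClientHandler
{
    MaxConnectionsPerServer = 25
})
{
    BaseAddress = new Uri("http://localhost:1777")
};

// Hypothetical helper: StreamContent reads from the stream during the send,
// so HttpClient copies into its own buffers rather than being handed a
// caller-owned byte[] that could stay rooted afterwards.
async Task UploadAsync(Stream payload)
{
    using var content = new StreamContent(payload);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    using var response = await client.PostAsync("/api/bytes", content);
    Console.WriteLine((int)response.StatusCode);
}

await UploadAsync(new MemoryStream(new byte[64 * 1024]));
```

Note this only limits connections and switches the content type; whether it fully avoids the rooted-buffer behavior described above depends on how the content is copied internally.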

@MihaZupan MihaZupan changed the title HttpClient SendAsync leaks when posting ByteArrayContent from a delegate HttpConnection roots user-provided write buffer in AwaitableSocketAsyncEventArgs Mar 25, 2022
@MihaZupan MihaZupan added the bug label Mar 25, 2022
@ManickaP
Member

The example allocates all the buffers at the same time at the beginning of the upload and holds them in memory until it can process them 25 at a time.
And since they are 500 KB, won't they end up on the LOH (I think the threshold is ~85 KB)?

@MihaZupan
Member

MihaZupan commented Mar 25, 2022

A simpler repro:

static async Task<WeakReference<byte[]>> SendRequestAsync(HttpClient client)
{
    var bytes = new byte[1024 * 1024];

    using HttpResponseMessage response = await client.PostAsync("http://localhost:5159", new ByteArrayContent(bytes));
    await response.Content.CopyToAsync(Stream.Null);

    return new WeakReference<byte[]>(bytes);
}

using var client = new HttpClient();

WeakReference<byte[]> bytesReference = await SendRequestAsync(client);

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

Console.WriteLine("Alive: " + bytesReference.TryGetTarget(out _));

@Euclidite
Author

Thank you for looking into this everyone! To address some of the recent responses...

  • @rzikm I'm mainly using the SemaphoreSlim to help avoid quota limits - I do understand that it essentially causes these objects to queue up in memory. I haven't started thinking about whether there's a better alternative because I've been looking into this issue for the last few days.
  • @MihaZupan Thanks for the tip about StreamContent - I think I tried it before and may have dismissed it because I either saw an error (used it wrong) or because I still saw the memory growth. Regardless, I just tried it again and it works now (I prefer this much more) - unfortunately, the memory leak persists :(
  • @MihaZupan I'll see if I can try out a single upload with a forced GC later today. I have gone as far as collecting every single generation after each upload, but the memory never went down. If the mitigation requires some sort of delay, it might not be an option for me, because I have no control over how frequently / when I receive the objects (I could receive 5000 over a few minutes, which would make memory shoot up). Do you have any other suggestions for forcing the byte[] to be reclaimed, to help mitigate this until a fix is available?
  • @MihaZupan About ServicePointManager - I've run into many sites that still advocate it and recommend avoiding using HttpClient to perform the queueing because it can lead to port exhaustion. Is this no longer an issue if I just use MaxConnectionsPerServer? I did a quick test setting this value and disabling my SemaphoreSlim - I can see it starting to upload at a much higher rate, and eventually I start seeing the errors below. This might just be the third-party service we're using kicking us off (I don't want to distract from the main issue here - this is from our prod app, so the output won't match the repro):
System.Net.WebException: An error occurred while sending the request.
 ---> System.Net.Http.HttpRequestException: An error occurred while sending the request.
 ---> System.IO.IOException: The response ended prematurely.
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at System.Net.HttpWebRequest.SendRequest(Boolean async)
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   --- End of inner exception stack trace ---
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   at System.Net.WebRequest.<>c.<GetResponseAsync>b__68_2(IAsyncResult iar)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location ---
   at Object_Processor.Destinations.Uploader.UploadAsync(Job job) in C:\Users\bart\CodeRepo\Object Processor\Object Processor\Destinations\Uploader.cs:line 228
Failed Send.
System.Net.WebException: The SSL connection could not be established, see inner exception.
 ---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.IO.IOException:  Received an unexpected EOF or 0 bytes from the transport stream.
   at System.Net.Security.SslStream.<FillHandshakeBufferAsync>g__InternalFillHandshakeBufferAsync|187_0[TIOAdapter](TIOAdapter adap, ValueTask`1 task, Int32 minSize)
   at System.Net.Security.SslStream.ReceiveBlobAsync[TIOAdapter](TIOAdapter adapter)
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](TIOAdapter adapter, Boolean receiveFirst, Byte[] reAuthenticationData, Boolean isApm)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(HttpRequestMessage request)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at System.Net.HttpWebRequest.SendRequest(Boolean async)
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   --- End of inner exception stack trace ---
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   at System.Net.WebRequest.<>c.<GetResponseAsync>b__68_2(IAsyncResult iar)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location ---
   at Object_Processor.Destinations.Uploader.UploadAsync(Job job) in C:\Users\bart\CodeRepo\Object Processor\Object Processor\Destinations\Uploader.cs:line 228
Failed Send.
  • @ManickaP The way the repro is structured is slightly different from my production code. In production, a 3rd-party system sends me these objects throughout the day (sometimes one every so often, sometimes hundreds in a second). When I receive one, I emit an event and try to upload it in the delegate (the uploads then effectively get queued because of the SemaphoreSlim).

@rzikm
Member

rzikm commented Mar 25, 2022

@rzikm I'm mainly using the SemaphoreSlim to help avoid quota limits - but I do understand that essentially causes these objects to queue up in memory - I haven't started thinking about if there's a better alternative to this because I've been looking into this issue for the last few days

The SemaphoreSlim use is okay; maybe just introduce a try-finally block and put the Release call in the finally block, so that you don't leak those slots in case of exceptions.

Anyway, it seems that @MihaZupan identified a possible root cause that would need to be fixed on our side.
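The try-finally suggestion, applied to the repro's DoUpload, might look like this (a sketch assuming the repro's `client` and `uploadLock`, with the header setup trimmed for brevity; this fixes the slot leak, not the rooted-buffer issue):

```csharp
async Task DoUpload(MemoryStream item)
{
    await uploadLock.WaitAsync().ConfigureAwait(false);
    try
    {
        using var content = new ByteArrayContent(item.GetBuffer(), 0, (int)item.Length);
        using var request = new HttpRequestMessage(HttpMethod.Post, "/api/bytes") { Content = content };
        using var response = await client
            .SendAsync(request, HttpCompletionOption.ResponseHeadersRead)
            .ConfigureAwait(false);
        Console.WriteLine(response.IsSuccessStatusCode ? "Success!" : "Something went wrong!");
    }
    finally
    {
        // Runs even when SendAsync throws, so semaphore slots are never leaked
        // and the stream is always disposed.
        uploadLock.Release();
        await item.DisposeAsync().ConfigureAwait(false);
    }
}
```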

@MihaZupan
Member

about ServicePointManager - I've run into many sites which still advocate this and recommend avoiding using HttpClient to perform the queueing because it can lead to port exhaustion - is this no longer an issue if I just use MaxConnectionsPerServer ?

Whatever you set on ServicePointManager.DefaultConnectionLimit will be completely ignored by HttpClient.
MaxConnectionsPerServer currently applies to any given combination of scheme + host + port. In your example it looks like you are always sending requests to the same endpoint (host) over the same port, so MaxConnectionsPerServer should limit the number of connections as expected.

Thanks for the tip about StreamHttpContent - I think I tried using it before and might have dismissed it because I either saw an error (used it wrong), or because I still saw the memory growth. Regardless, I just tried it again and it works now (I prefer this much more) - unfortunately the memory leak persists :(

Are you able to share the modified code? The way the content is copied is different with StreamContent, so the issue I mentioned about rooting the write buffer shouldn't affect it.

I'll see if I can try out a single upload with a forced GC later today. I have gone as far as trying to collect every single generation after each upload - but the memory never went down. If it does require some sort of delay then this might not be an option for me because I have no control over how frequently / when I receive the objects (I could receive 5000 over a few minutes which would mean memory would shoot up). Do you have any other suggestions how I could force the byte[] to be reclaimed to help mitigate this until a fix is available?

To be clear, I do not recommend you manually force GCs, ever. My example was meant only to demonstrate the existence of an underlying issue.

If what I described above is the issue you are hitting here, there is a way to mitigate it.
As discussed before, StreamContent shouldn't be affected.
Alternatively, you can force HttpClient's connection pool to release any resources held sooner, by setting the PooledConnectionIdleTimeout property.

using var client = new HttpClient(new SocketsHttpHandler
{
    PooledConnectionIdleTimeout = TimeSpan.FromSeconds(10)
});

@CarnaViire
Member

avoiding using HttpClient to perform the queueing because it can lead to port exhaustion

You can get port exhaustion if you are not reusing the HttpClient instance. The connection pool lives inside the HttpClient instance (inside HttpClientHandler/SocketsHttpHandler, to be more specific), so it is advised to have a single static HttpClient per app to leverage pooling and connection reuse. The other thing is that MaxConnectionsPerServer is unlimited by default, but if you set it to a finite value, it will do all the request queueing for you.
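A sketch of that advice (the limit and timeout values are illustrative, not recommendations; the idle timeout is taken from the earlier PooledConnectionIdleTimeout suggestion):

```csharp
using System;
using System.Net.Http;

static class Uploads
{
    // One HttpClient for the whole app, so connections are pooled and reused
    // instead of exhausting ports with short-lived clients.
    // With MaxConnectionsPerServer set to a finite value, excess requests queue
    // inside the handler rather than requiring an external SemaphoreSlim.
    public static readonly HttpClient Client = new(new SocketsHttpHandler
    {
        MaxConnectionsPerServer = 25,                           // illustrative limit
        PooledConnectionIdleTimeout = TimeSpan.FromSeconds(10)  // release idle connections sooner
    });
}
```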

@Euclidite
Author

Euclidite commented Mar 25, 2022

Hi all, sorry for the slow response - it's hard to rip myself away from meetings, and I wanted to test this before bringing it back to you. I've tried the same reproduction code with StreamContent, and unfortunately it still looks like it's leaking something.

(Side note: I'm happy to test out StreamContent, but unfortunately it looks like I can't use it in our application - we're limited to what our 3rd-party vendor supports for uploads: ByteArrayContent)

Maybe I'm missing something from staring at this for too long - but here it is:

Updated reproduction code (ran with: upload 1000 500):

using System.Net;
using System.Net.Http.Headers;

HttpClient client = new HttpClient {
    BaseAddress = new Uri("http://localhost:1777"),
    DefaultRequestVersion = HttpVersion.Version20,
};

// Used to throttle the upload rate (not overload the destination server & avoid port exhaustion)
SemaphoreSlim uploadLock = new(25);
ServicePointManager.DefaultConnectionLimit = 25;

async Task DoUpload(MemoryStream item)
{
    await uploadLock.WaitAsync().ConfigureAwait(false);
    //var buffer = item.GetBuffer();
    //using var content = new ByteArrayContent(buffer, 0, (int)item.Length);
    using var content = new StreamContent(item); // New!
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    content.Headers.ContentLength = item.Length;
    using var request = new HttpRequestMessage(HttpMethod.Post, "/api/bytes")
    {
        Content = content,
        Version = new Version(2, 0)
    };

    // ResponseHeadersRead because we don't care about the response body - literally just the status code
    var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);
    if (response.IsSuccessStatusCode)
    {
        Console.WriteLine("Success!");
    }
    else
    {
        Console.WriteLine("Something went wrong!");
    }

    response.Dispose();
    await item.DisposeAsync().ConfigureAwait(false);
    uploadLock.Release();
}

async Task DoDummyUpload(int sizeKb) {
    var sizeBytes = 1024 * sizeKb;
    var item = new MemoryStream();
    await item.WriteAsync(new byte[sizeBytes], 0, sizeBytes).ConfigureAwait(false);
    await DoUpload(item);
}

async Task<int> Main(string[] args) {
    // Simple prompt for testing - enter "upload 1000 64" to see a spike in memory that won't go down
    Console.WriteLine("Waiting for input...");
    while (true)
    {
        var line = await Console.In.ReadLineAsync().ConfigureAwait(false);
        var cmd = line.Split(' ')[0];

        if (cmd == "exit") break;

        if (cmd == "upload") {
            (int numRequests, int payloadSizeKb) = (int.Parse(line.Split(' ')[1]), int.Parse(line.Split(' ')[2]));

            Console.WriteLine("Uploading...");
            foreach (var i in Enumerable.Range(0, numRequests))
            {
                // Fire and forget Task so that we can do multiple concurrent uploads as the application runs
                Task.Run(async () => await DoDummyUpload(payloadSizeKb));
            }
        }
    }

    return 0;
}

await Main(args).ConfigureAwait(false);

Updated "server" code to handle the stream:

import { Application } from "https://deno.land/x/oak/mod.ts";

const app = new Application();

app.use(async (ctx) => {
    // Accept and add some latency to simulate a remote network
    if (ctx.request.method === 'POST' && ctx.request.url.pathname === '/api/bytes') {
        //const result = ctx.request.body({ type: 'bytes' });
        const result = ctx.request.body({ type: 'stream' });
        await new Promise(res => setTimeout(res, 50));
        const val = await result.value;
        ctx.response.body = `${val.length} bytes received`;
    } else {
        ctx.response.body = "Hello World!";
    }
});

await app.listen({ port: 1777 });

Screengrabs:
[screenshots: memory diagnostics]

@Euclidite
Author

@MihaZupan I've tried your recommendation with:

HttpClient client = new HttpClient(new SocketsHttpHandler
{
    PooledConnectionIdleTimeout = TimeSpan.FromSeconds(10)
}) {
    BaseAddress = new Uri("http://localhost:1777"),
    DefaultRequestVersion = HttpVersion.Version20,
};

Unfortunately, it still retains the memory (I tested this by adding PooledConnectionIdleTimeout to the previous reproduction code):
[screenshot: memory diagnostics]

@MihaZupan
Member

[screenshot: memory diagnostics]
It looks like your DoUpload is not completing at all here.

Please apply the change Radek recommended above and move the uploadLock.Release() to a finally block.

await uploadLock.WaitAsync().ConfigureAwait(false);
try
{
    // Do the upload
}
finally
{
    uploadLock.Release();
}

(Side note - I'm happy to test out StreamContent, but unfortunately it looks like I can't use it with our application - we're limited by what our 3rd party vendor supports for uploads - ByteArrayContent)

You are limited in what the server will accept, or does the 3rd party control the APIs around HttpClient and literally only accepts a ByteArrayContent instance?

@Euclidite
Author

To answer your question @MihaZupan - I'm limited to what the server will accept - their endpoint won't accept stream content (it returns an error code right away when I try to do so).

I made the recommended change and re-ran it for much longer - I thought the "upload" was completed (I'm testing this against my dummy local endpoint still). With StreamContent some things of note:

  • I've actually run this much longer than 23s, however during this upload it seems the diagnostic tools keep locking up until the upload is done - this might be an environmental issue (but I want to share everything I know)
  • The memory only drops to 436 MB - I'd expect it to drop much more
  • From the memory snapshot, I can't exactly see what might be taking up the memory - maybe it's already pending GC in the future? (I'm no expert here, but I feel like that's a lot of memory to wait on)
    [screenshot: memory snapshot]

With ByteArrayContent my observations:

  • Same issue with the diagnostic tool - again could just be an environmental issue
  • The memory never drops. I took two snapshots - one right after the upload was done, and then another a couple of minutes later. You can see memory drops by only 15 KB
  • Again, looking at the memory snapshot it's not obvious to me what would even be taking up the memory
    [screenshot: memory snapshot]

My test code (I just comment out / uncomment the StreamContent vs ByteArrayContent lines)

using System.Net;
using System.Net.Http.Headers;

HttpClient client = new HttpClient(new SocketsHttpHandler
{
    PooledConnectionIdleTimeout = TimeSpan.FromSeconds(10)
}) {
    BaseAddress = new Uri("http://localhost:1777"),
    DefaultRequestVersion = HttpVersion.Version20,
};

// Used to throttle the upload rate (not overload the destination server & avoid port exhaustion)
SemaphoreSlim uploadLock = new(25);

async Task DoUpload(MemoryStream item)
{
    await uploadLock.WaitAsync().ConfigureAwait(false);
    try
    {
        using var content = new ByteArrayContent(item.GetBuffer(), 0, (int)item.Length);
        //using var content = new StreamContent(item);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        content.Headers.ContentLength = item.Length;
        using var request = new HttpRequestMessage(HttpMethod.Post, "/api/bytes")
        {
            Content = content,
            Version = new Version(2, 0)
        };

        // ResponseHeadersRead because we don't care about the response body - literally just the status code
        using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);
        Console.WriteLine(response.IsSuccessStatusCode ? "Success!" : "Something went wrong!");
    } finally
    {
        await item.DisposeAsync().ConfigureAwait(false);
        uploadLock.Release();
    }
    
}

async Task DoDummyUpload(int sizeKb) {
    var sizeBytes = 1024 * sizeKb;
    var item = new MemoryStream();
    await item.WriteAsync(new byte[sizeBytes], 0, sizeBytes).ConfigureAwait(false);
    await DoUpload(item);
}

async Task<int> Main(string[] args) {
    // Simple prompt for testing - enter "upload 1000 64" to see a spike in memory that won't go down
    Console.WriteLine("Waiting for input...");
    while (true)
    {
        var line = await Console.In.ReadLineAsync().ConfigureAwait(false);
        var cmd = line.Split(' ')[0];
        if (cmd == "exit") break;
        if (cmd == "upload") {
            (int numRequests, int payloadSizeKb) = (int.Parse(line.Split(' ')[1]), int.Parse(line.Split(' ')[2]));

            Console.WriteLine("Uploading...");
            foreach (var i in Enumerable.Range(0, numRequests))
            {
                // Fire and forget Task so that we can do multiple concurrent uploads as the application runs
                Task.Run(async () => await DoDummyUpload(payloadSizeKb));
            }
        }
    }

    return 0;
}

await Main(args).ConfigureAwait(false);

Please let me know if there's anything else you'd like me to try. Trying to resolve this, or find a viable workaround is my priority now.

@MihaZupan
Member

The GC generally won't do work unless it has to. If the system has a ton of memory to spare and the process is sitting idle, it has no need to do anything.

Sitting at 400 MB isn't a memory leak in itself. To get more interesting numbers about how much memory is actually being kept alive, try adding this at the start of your test:

_ = Task.Run(async () =>
{
    while (true)
    {
        Console.Title = $"{GC.GetTotalMemory(forceFullCollection: true) / 1024f / 1024f:N1} MB";
        await Task.Delay(1000);
    }
});

Looking into it a bit more, StreamContent will have the same problem if you are feeding it a MemoryStream.

The aim is to never allocate the byte[] in the first place. It's hugely inefficient. If your input is a Stream, use that in StreamContent directly instead of buffering everything in MemoryStream.
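For example, when the payload originates on disk, the file stream can be handed to StreamContent directly - a sketch, where the path, client, and endpoint are hypothetical placeholders:

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Sketch: stream a file to the server without buffering it into a byte[].
static async Task UploadFileAsync(HttpClient client, string path)
{
    await using var fileStream = File.OpenRead(path);
    using var content = new StreamContent(fileStream);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    using var response = await client.PostAsync("/api/bytes", content);
    response.EnsureSuccessStatusCode();
}
```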

their endpoint won't accept stream content (it returns an error code right away when I try to do so).

Your StreamContent test is currently broken. The MemoryStream you create in DoDummyUpload won't return any content when read since you never rewind it after the WriteAsync.
Create it like this instead:

async Task DoDummyUpload(int sizeKb) {
    var sizeBytes = 1024 * sizeKb;
    var item = new MemoryStream(new byte[sizeBytes]);
    await DoUpload(item);
}

If you have to buffer everything for some reason, try using a CustomStreamContent.cs instead.

@Euclidite
Author

Euclidite commented Mar 28, 2022

The GC generally won't do work unless it has to. If the system has a ton of memory to spare and the process is sitting idle, it has no need to do anything.
However, when I call await without using Task.Run or a Delegate, I can see that memory usage stays low, and I can see regular GC events clearing the memory - should there be a discrepancy? (Let me know if you'd like some reproduction code)

I've added your piece of memory monitoring code to the last code I posted (ByteArrayContent), and here's a summary of the memory:

  • Before I start running the upload
    [screenshot: memory before upload]

  • During upload - I can see the memory reported on the console going down slowly. The diagnostic tools stay frozen, and memory in task manager never goes down. The task manager memory only suddenly drops after the upload is complete
    [screenshot: memory during upload]

  • After the upload - The memory drops down, but nowhere close to before the uploads (I understand that a bit higher is expected) - however the memory doesn't drop down nearly as much as if I just did an await from main (without Delegates or Task.Run)
    [screenshot: memory after upload]

Thank you for pointing out my error with using StreamContent - I embarrassingly completely missed rewinding it (I thought StreamContent would do it for me for some reason). I am now able to successfully use StreamContent directly 😊

The way I structured the reproduction code was to provide something similar to how our production app works, if you're interested in the context:

  • We use a 3rd party library which provides us objects
  • These objects have a SaveAsync() method that can only write out to a memory stream
  • I was using GetBuffer() because I thought StreamContent wasn't working with our service - I've now removed these calls thanks to your tip about rewinding the buffer

Given the above, I have tried out your CustomStreamContent example (thank you!), I am able to do a SaveAsync(stream) inside the SerializeToStreamAsync method, however it oddly causes the CPU usage to skyrocket to 50%+, and limits it to sequential uploads. If you want I can provide more reproduction code for this - however I don't want to hijack this issue with something that may be unrelated.

I don't suppose you have any alternate ideas/suggestions to release the memory, since both ByteArrayContent and StreamContent have this issue? I've tried disposing the HttpClient (I know this is inadvisable) between calls - but no luck.

@MihaZupan
Member

If you have a method like Task SaveAsync(Stream destination, CancellationToken ct), then using a custom HttpContent that does the following would be best performance-wise:

protected override async Task SerializeToStreamAsync(Stream stream, TransportContext? context, CancellationToken cancellationToken)
{
    await _yourSourceObject.SaveAsync(stream, cancellationToken);
}

I am able to do a SaveAsync(stream) inside the SerializeToStreamAsync method, however it oddly causes the CPU usage to skyrocket to 50%+, and limits it to sequential uploads

If what you were trying out is different than the above, feel free to share the code. This approach being more expensive than pre-buffering everything into a byte[] isn't expected.
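A minimal version of such a push-style content type might look like the sketch below. The Func<Stream, CancellationToken, Task> delegate stands in for the vendor object's SaveAsync; that shape is an assumption, not taken from this thread.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Sketch of an HttpContent that writes straight to the request stream,
// avoiding any intermediate byte[] or MemoryStream.
public sealed class SaveAsyncContent : HttpContent
{
    private readonly Func<Stream, CancellationToken, Task> _save;

    public SaveAsyncContent(Func<Stream, CancellationToken, Task> save) => _save = save;

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context)
        => _save(stream, CancellationToken.None);

    protected override bool TryComputeLength(out long length)
    {
        length = 0;
        return false; // length unknown up front; the request body will be chunked
    }
}
```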

however the memory doesn't drop down nearly as much as if I just did an await from main (without Delegates or Task.Run)

I don't know what your approach was with "just an await from main", but my guess is that you were running all uploads serially, so you were only allocating one buffer at a time, and only sending one request at a time.
From the screenshot it looks like the GC was able to collect all the memory from the process. The GC will opportunistically avoid immediately handing unused pages back to the OS to improve performance. It doesn't look like the process is consuming unreasonable amounts of memory, so this part seems as a non-issue to me right now.

@karelz
Member

karelz commented Mar 29, 2022

Triage:

  • The scenario could be handled in better way - via streaming
  • We could still choose to optimize Sockets to null the write buffer after its use. However, it will benefit only scenarios which are already suboptimal. Not worth the effort, closing.

@karelz karelz closed this as completed Mar 29, 2022
@karelz karelz added this to the 7.0.0 milestone Mar 29, 2022
@Euclidite
Author

@MihaZupan @karelz

I'm not sure why this issue was closed. A few additional notes I'd like to point out:

  • I'd like to point out that after being provided with the example to properly use StreamContent - I have been using it and I still see a large usage of memory.
  • My system has 32 GB of RAM. When I run my test code with upload 1000 9999 - the RAM usage jumps to ~15 GB (as expected since the bytes are all in memory), and then only goes down to 11 GB in Task Manager (~5 GB shown by Console.Title = $"{GC.GetTotalMemory(forceFullCollection: false) / 1024f / 1024f:N1} MB";)
    [screenshot: memory after upload]
  • When I use Console.Title = $"{GC.GetTotalMemory(forceFullCollection: true) / 1024f / 1024f:N1} MB"; (effectively forcing GC), the process memory still jumps to ~15GB (expected), but then goes all the way back down to 40-170mb
    [screenshot: memory after forced GC]

My concern here is:

  • This issue is effectively preventing a good use of resources if we're trying to upload files/content in parallel
  • On resource-constrained systems (or in containers), where we might be billed for retaining memory for longer periods, and where it makes it difficult to know when to spin up more instances to handle load - the GC retaining huge amounts of memory doesn't make sense.

Are there any good alternatives here? Is it possible to null the buffer manually from my code?

@MihaZupan I'll try to get back to you about saving to the network stream directly afterwards if you're willing to take a quick look. I'd like any effort right now to go into the above...

@MihaZupan
Member

MihaZupan commented Mar 30, 2022

(effectively forcing GC), the process memory still jumps to ~15GB (expected), but then goes all the way back down to 40-170mb

So if you force a full GC, effectively all of the memory is collected.
This means you are no longer hitting the issue discussed here ("HttpConnection roots user-provided buffer"), but rather an issue with how aggressive the GC is about reclaiming memory in this case.

This issue is effectively preventing a good use of resources if we're trying to upload files/content in parallel

I think the example here is a bit contrived when looking at memory usage. Buffering 15 GB worth of content in the process before sending it is not a good use of resources and is something you should avoid in production code.

I'll try to get back to you about saving to the network stream directly afterwards if you're willing to take a quick look. I'd like any effort right now to go into the above...

I think this is the important discussion here. What does the memory usage look like in your real app if you avoid allocating the buffers and instead stream the content directly? Does it retain unreasonable amounts of memory without forcing a full GC?

On resource constrained systems (or in containers) where we might be billed by retaining memory for longer periods / it makes it difficult to know when to spin up more instanced to handle load - the GC retaining huge amounts of memory doesn't make sense.

Please take a look at #50902 which is discussing how you may tell the GC to prefer low memory usage over collection times.

My guess is that the memory is being retained because you are using huge buffers that immediately go to the LOH (large object heap), and won't be collected unless there is a good reason to do so (e.g. running out of memory).

That is, your scenario is aggressively allocating short-lived buffers that exceed the LOH threshold. This is a performance anti-pattern and it won't be fast.
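If buffering is unavoidable, renting buffers from ArrayPool<byte> and returning them after the upload is one common way to stop churning the LOH - a sketch, where sizeBytes is a placeholder value:

```csharp
using System.Buffers;

// Sketch: reuse a pooled buffer instead of allocating a fresh byte[]
// per upload, so large short-lived arrays don't pile up on the LOH.
int sizeBytes = 512 * 1024;
byte[] buffer = ArrayPool<byte>.Shared.Rent(sizeBytes);
try
{
    // Fill and use buffer[0..sizeBytes) here. Note that Rent may hand
    // back an array larger than requested.
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer);
}
```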

If you for some reason can't avoid allocating such buffers (you most likely can), then you can look at tweaking GC settings like the LOH threshold or changing the conserve-memory flag.
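Those knobs can be set in runtimeconfig.json - the values below are illustrative only, not recommendations:

```json
{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.ConserveMemory": 5,
      "System.GC.LOHThreshold": 120000
    }
  }
}
```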

You can also file a new issue to discuss the performance of such scenarios and get input from people with more expertise in the area (label area-GC-coreclr in this repo).

If your application is not otherwise under high CPU load and all you want is to minimize memory usage, you could also consider inducing a full GC yourself. This is not something we would generally recommend, but may be effective in your scenario.
E.g.

Task.Run(async () =>
{
    // Threshold is illustrative - pick something reasonable for your scenario
    const long ThresholdBytes = 2L * 1024 * 1024 * 1024;
    while (true)
    {
        if (Environment.WorkingSet > ThresholdBytes)
        {
            GC.Collect();
        }

        await Task.Delay(TimeSpan.FromMinutes(1));
    }
});

@ghost ghost locked as resolved and limited conversation to collaborators Apr 29, 2022
@jeffhandley jeffhandley removed the untriaged New issue has not been triaged by the area owner label May 6, 2022

8 participants