Dotnet core consuming lot of memory? #79287
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/gc
|
cc @richlander |
Did you try Server GC setting ? |
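A quick way to confirm which GC flavor a pod is actually running with is to log it from inside the process. The sketch below is only an illustration: the class name `GcModeReport` is hypothetical, while `GCSettings.IsServerGC`, `GCSettings.LatencyMode` and `Environment.ProcessorCount` are standard .NET APIs.

```csharp
using System;
using System.Runtime;

internal static class GcModeReport
{
    // Builds a one-line summary of the GC flavor this process is running with.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "Server GC" : "Workstation GC";
        return $"{flavor}, latency mode {GCSettings.LatencyMode}, " +
               $"{Environment.ProcessorCount} logical processors visible";
    }
}
```

Calling `Console.WriteLine(GcModeReport.Describe());` at startup on both the 3.1 and the 6.0 build would show whether the two are even running the same GC mode before comparing their memory usage.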
the first step is always to capture a trace to see why your memory grew. if you could share a trace for 3.1 for comparison purpose that'd be best. you can use the dotnet trace tool -
this should include the history that's leading up to the memory growth. so if you can repro this fairly quickly, you can just start the command right before you start your process. this trace contains no PII. how to capture a trace in containers is described here. please share the resulting trace. |
@Maoni0 , Thank you so much for helping. Please correct me if I'm wrong. You want the .net trace of any of our custom applications running as containers in the cluster, with both the .net 6 and the .net 3.1 build. Am I correct? If yes, then what should be the duration? I ask because the screenshot I have shared is of processes (dotnet runtime) running in Azure kubernetes nodes after we migrated all our pods to .net 6. Kind Regards, |
@smartaquarius10 you are welcome. what I meant was if it's possible for you to run the same workload with 3.1 and 6.0 and capture a trace for each, that would give us a fair comparison. you mentioned you have 27 .net based pods, I am not sure if these all run the same thing or not, and/or if all 27 pods started seeing mem increase. but let's say one of your 27 pods always runs task X and you observe its memory usage increase the most when you upgraded from 3.1 to 6, would it be possible to run that pod (with the same workload) with 3.1 and 6? if it's too difficult to revert it back to 3.1, it's fine. a trace on 6.0 would be helpful too. it would help us understand your 6.0 memory behavior, but we can't tell things like "why in 3.1 it consumed less memory" without a 3.1 trace. when you upgrade from one major version to the next, there are a ton of things that change in the libraries and the runtime (in the runtime aside from GC changes, of course a ton of other things have changed). so we can't tell if the higher memory usage is because there's simply higher demand in memory from the libraries and the rest of the runtime, or if it's because of changes in the GC. |
@Maoni0 , Good morning. Hope you are doing well. I have created a new repository and uploaded a zip file with the traces to it. The duration is 10 seconds. As I'm unsure about the security implications, I created a private repo and gave you full access to it. It's awaiting your confirmation. I would be grateful if you could approve the request and download it from here. Thank you so so much once again for helping us. Really appreciate it. Kind Regards, |
hi Tanul, thanks for your traces. unfortunately these didn't capture any GC events. the 3.1 one shows this in EventStats view in Perfview - could you please share the commandline you used and when you started it and how long you ran it for? also CC-ing @mangod9 as I'll be OOF starting tomorrow. |
@Maoni0, Hey, thank you so much for analyzing the traces. This is the procedure I followed:
Enjoy your holiday and have a great weekend 😃 @mangod9 , Hope you're doing well. I have given you access to the repository as well. It's awaiting your confirmation 😃 Kind Regards, |
@mangod9 , Hello Manish, Good morning. Could you please help us with this issue? We would be grateful. |
hi @smartaquarius10, could you please add @cshung to the traces as well? thanks |
I have seen the traces and I agree with @Maoni0's assessment. The traces don't have any GC-related events there and are not useful for our investigation. The command line looks correct though - I wonder why that happened - maybe a GC is not happening within 10 seconds? Generally, a GCCollect-only trace is very lightweight, it emits only a handful of events per GC, so you can leave it on for a much longer period of time than just 10 seconds. Can you try to leave it on for an extended period of time? Ideally, it would be nice to capture them side-by-side when the heap size grows. That way we can investigate what caused the growth. |
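One way to check the "maybe a GC is not happening within 10 seconds" hypothesis without any tracing tool is to sample `GC.CollectionCount` from inside the application. This is a minimal sketch, assuming you can run a few lines of code in the affected process; `GcActivityProbe` and `CountCollections` are hypothetical names, while `GC.CollectionCount` and `GC.MaxGeneration` are standard .NET APIs.

```csharp
using System;
using System.Threading;

internal static class GcActivityProbe
{
    // Returns, per generation, how many collections completed during the window.
    // Meant to be called from a background thread inside the application.
    public static int[] CountCollections(TimeSpan window)
    {
        int generations = GC.MaxGeneration + 1;

        int[] before = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            before[gen] = GC.CollectionCount(gen);

        Thread.Sleep(window);

        int[] delta = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            delta[gen] = GC.CollectionCount(gen) - before[gen];

        return delta;
    }
}
```

If `CountCollections(TimeSpan.FromSeconds(10))` returns all zeros under the normal workload, a 10-second trace would be expected to contain no GC events, and a longer capture window is the simplest fix.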
Hmm, I suspect something might be wrong with our trace capture tools (or maybe the trace capturing process), but I don't know what it is yet. In the .net 6 traces, we are capturing a few GC finalizer events, meaning GC is happening, but we aren't capturing any statistics about the GC heap, which doesn't look right. |
@cshung, oh ok. Is there any other tool available to capture these traces? |
I spent some time today trying to figure out what is going on with

    // Licensed to the .NET Foundation under one or more agreements.
    // The .NET Foundation licenses this file to you under the MIT license.
    using System;
    using System.Reflection;
    using System.Runtime;
    using System.Diagnostics;

    namespace CoreLab
    {
        internal static class Program
        {
            private static void Main()
            {
                // This ensures I have time to attach dotnet-trace to it.
                Console.ReadLine();

                // Here is how I can trigger the FitBucket event
                GC.Collect(
                    /* generation = */ 1,
                    /* mode = */ GCCollectionMode.Forced,
                    /* blocking = */ true);

                // Here is how I can trigger the GCLOHCompact event
                GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
                GC.Collect();
            }
        }
    }

I ran this program under the latest dotnet runtime that I built out of main, and then I attached dotnet-trace to it with these arguments:
For the dotnet-trace, I used the one in the
With that, I am able to generate a trace with the appropriate events:
This works on both Windows and Ubuntu Linux. I had a hard time getting this to work on Alpine, so I haven't tested Alpine specifically, mostly because I am unfamiliar with that platform myself. If you can try out the experiment I outlined above on Alpine (you can use whatever build of the software you want, no need to build it yourself) and see if dotnet-trace can capture GC events when you know for sure that a GC does happen, that would be great. |
@smartaquarius10, yes, please. I don't know your app, but for the C# code I showed, there must be a GC because I forced it, so we can certainly rule out the possibility that a GC didn't happen while you captured the trace. |
@cshung, Here are the traces with GC code. |
The latest traces are good, and the GC events are there, showing the tool is working just fine. |
hi @smartaquarius10, I've added your case to mem-doc in the 1st FAQ "I didn't change my code at all, why am I seeing a regression in memory when I upgrade my .NET version?". could you tell me if that's helpful? |
@smartaquarius10 -- We can also setup a call to work through some of this in real time together. You can contact me at rlander@ms if you want to discuss that. |
@Maoni0 , Thank you so much. Will go through that. @richlander, Thanks a lot for helping. I will go through the details which Maoni shared and ping you on Teams after that. Thank you so much once again. Really appreciate all the help and support :) Kind Regards, |
@Maoni0 @richlander @cshung , |
the cgroup v2 support is in 6.0: https://github.com/dotnet/runtime/blob/release/6.0/src/coreclr/pal/src/misc/cgroup.cpp#L39. do you happen to have a dump? if so you could check if the hard limit is set. it's gc_heap::heap_hard_limit. |
@Maoni0 , Thank you for the prompt reply. Sorry, I don't know how to get that 😢 Could you please guide me through the process of collecting the dump or these values? I would be grateful. Thank you. Kind Regards, |
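Short of taking a dump, a rough in-process approximation of this check is to print what the GC itself reports as its available memory. This is only a sketch and not a substitute for inspecting `gc_heap::heap_hard_limit`: it assumes that `GCMemoryInfo.TotalAvailableMemoryBytes` reflects the container limit or hard limit the runtime detected, and `GcLimitReport` is a hypothetical name. `GC.GetGCMemoryInfo()` and `GC.GetTotalMemory` are standard .NET APIs.

```csharp
using System;

internal static class GcLimitReport
{
    // Summarizes the memory budget the GC believes it has, in MiB.
    public static string Describe()
    {
        const double MiB = 1024.0 * 1024.0;
        GCMemoryInfo info = GC.GetGCMemoryInfo();

        return
            $"TotalAvailableMemoryBytes: {info.TotalAvailableMemoryBytes / MiB:F0} MiB\n" +
            $"HighMemoryLoadThresholdBytes: {info.HighMemoryLoadThresholdBytes / MiB:F0} MiB\n" +
            $"HeapSizeBytes (at last GC): {info.HeapSizeBytes / MiB:F0} MiB\n" +
            $"Managed heap now: {GC.GetTotalMemory(false) / MiB:F0} MiB";
    }
}
```

If the first value is close to the node's physical memory rather than the pod's memory limit, that would suggest the runtime is not seeing the cgroup limit, which is the situation this comment is trying to rule out.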
I searched for ".net core dump" on bing and this is the 2nd link that came up, can you see if this is helpful? https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dumps if not, we should improve our docs. |
Any update on this issue? I'm experiencing the same issue - 40% increase of memory footprint after migration from .net 3.1 to 6.0. |
If it's running in AKS, maybe this issue can help |
Is there an existing issue for this?
Describe the bug
Team,
We are running 27 dotnet core based pods in our Azure kubernetes environment. The memory usage was fine until we were on .net core 3.x, but since the day we've migrated to .net 6.0.9, the memory consumption of the dotnet process across nodes is huge.
As shown in the screenshot, I ran the top command on one AKS node and then sorted the processes in decreasing order of RAM. All the dotnet processes sorted to the top. The same situation exists on all 8 nodes in our AKS node pool.
Could you please suggest some tips and strategies to reduce the memory consumption of .net core?
Please help.. Thank you.
Expected Behavior
Less memory consumption
Steps To Reproduce
top command
Shift+M to sort
Exceptions (if any)
NA
.NET Version
6.0.9
Anything else?
No response