.NET 6 Memory Leak/Issue #38722
Comments
Out of curiosity, is the memory released if you force a GC? I believe dotMemory has an option to do that.
@pranavkm The app is running in release/production mode and dotMemory is not available on the server. Although it's not advised, would running System.GC.Collect() (https://docs.microsoft.com/en-us/dotnet/api/system.gc.collect?view=net-6.0#System_GC_Collect) have the effect you're after?
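For concreteness, this is the sort of thing I could run (standard BCL calls; purely a diagnostic aid, not a fix):

```csharp
using System;
using System.Runtime;

// Diagnostic aid only: force a full, blocking, compacting collection,
// including the large object heap, then report the managed heap size.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);
GC.WaitForPendingFinalizers();
GC.Collect();
Console.WriteLine($"Managed heap after forced GC: {GC.GetTotalMemory(false):N0} bytes");
```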
Is the leak coming from those byte[] allocations?
@davidfowl I believe so... The main byte[] that is taking up 5 gigs is pinned via "OverlappedData" from FileSystemWatcher. See screenshot. I'm kind of new to dotMemory, so if you tell me how to navigate to what you're looking for I can do so (I can also Zoom/Teams/screen share, whatever is easiest for your team).
That memory is pinned, so it's not gonna go away with a GC. We need to figure out who is creating those. The best way to go about this would be to capture an allocation trace in production; it'll show you the stack that is causing these allocations. https://docs.microsoft.com/en-us/dotnet/core/diagnostics/debug-memory-leak (though this is just the dump). Where is this file system watcher rooted?
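To illustrate the pattern being hunted here (not the app's actual code): every watcher that starts raising events pins a buffer for overlapped I/O, so watchers that are created repeatedly and never disposed accumulate exactly this kind of pinned byte[]/OverlappedData population:

```csharp
using System.IO;

// Illustrative leak pattern only, not code from the app in question.
// Each started FileSystemWatcher pins a native I/O buffer (visible in a
// dump as OverlappedData rooting a byte[]) until the watcher is disposed.
for (var i = 0; i < 1_000; i++)
{
    var watcher = new FileSystemWatcher(@"C:\some\watched\path") // placeholder path
    {
        IncludeSubdirectories = true,
    };
    watcher.EnableRaisingEvents = true; // allocates and pins the buffer
    // watcher.Dispose() is never called, so pinned buffers accumulate
}
```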
@davidfowl I read your first link. I've put PerfView on the machine. If I'm understanding this correctly, I should be able to just run the collection command on the production server to capture the allocation trace? (I ran a quick sample and it generates a PerfViewGCCollectOnly.etl.zip file.) I don't need to target the process ID (w3wp.exe) or anything? In regards to the dump, is dotnet-dump preferred over a normal Windows memory dump?
No, it collects for the entire machine. If you use dotnet-trace it'll require the PID.
Yes, you're trying to figure out where the allocations are coming from.
Either is fine.
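For reference, the CLI side of both of those looks roughly like this (the PID, dump file name, and object address are placeholders):

```text
# allocation trace scoped to one process (needs the PID)
dotnet-trace collect --process-id <w3wp-pid> --profile gc-verbose

# or capture a dump and inspect the heap
dotnet-dump collect -p <w3wp-pid>
dotnet-dump analyze <dump-file>
> dumpheap -stat
> dumpheap -type System.Threading.OverlappedData
> gcroot <object-address>
```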
Unfortunately, the 12-hour PerfView collection I had didn't go well. When I opened it up in PerfView, it was missing the "GC Heap Alloc Ignore Free (Coarse Sampling)" section. I don't know if that was because of the length of the recording or the file size. When I sample for, say, 30 minutes, it's available. I did notice that after 7 hours the size stayed at exactly "6,300 MB", which was strange, but I let it keep running. The process grew from 2 gigs to about 11.5 gigs over that 12-hour span during the night (theoretically our lower-traffic time). I've started a trace again, this time using PerfView64 just in case, with the /FocusProcess= argument to focus collection on the process. I'll report back in a few hours. I didn't want to do an app pool reset yet, so I've started the recording while the current process is sitting at 12 gigs.
Started the PerfView monitoring at ~9:30 AM EST, when the process (w3wp.exe) was already at 12 GB (it had grown from 2 GB to 12 GB overnight). I recorded for 2.5 hours with the following command.
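A PerfView invocation matching that description would look something like this (flags shown are illustrative, not the verbatim command; /GCOnly enables the coarse allocation sampling behind the "GC Heap Alloc Ignore Free" view, and /FocusProcess scopes collection to one process):

```text
PerfView64.exe /GCOnly /FocusProcess=w3wp.exe /AcceptEULA /NoGui /MaxCollectSec:9000 collect
```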
Breakdown
dotnet-dump (high level; I have more screenshots if you need them)
Thoughts
I decided to experiment with various endpoints on our staging website to see if I could simulate the leak in a more controlled environment. I made a basic controller with actions to exercise various resources. I used Locust (https://locust.io/) with 3 workers to simulate 200 users making requests simultaneously. The staging application is identical to the production one, on the same server with the same IIS settings (just a different domain). I first tested on my local dev machine, and then the staging website (after each action I would reset the app pool, just in case). The controller for the requests was the following; there were no special action filters, pre-controller processing, etc.
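The controller looked something like the sketch below (a reconstruction for illustration; AppDbContext, Items, and the background queue type are placeholders, not the real types):

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.EntityFrameworkCore;

[ApiController]
[Route("api/[controller]/[action]")]
public class LoadTestController : ControllerBase
{
    private readonly AppDbContext _db;                       // placeholder EF Core context
    private readonly MySingletonThatAddsToBackground _queue; // singleton described below

    public LoadTestController(AppDbContext db, MySingletonThatAddsToBackground queue)
    {
        _db = db;
        _queue = queue;
    }

    [HttpGet]
    public IActionResult TestNormalResponse() => Ok("hello"); // no I/O at all

    [HttpGet]
    public IActionResult TestBgProcess()
    {
        // hand a unit of work to the hosted service and return immediately
        _queue.Enqueue((sp, ct) => Task.Delay(100, ct));
        return Ok();
    }

    [HttpGet]
    public async Task<IActionResult> TestDbQuery()
    {
        var rows = await _db.Items.AsNoTracking().Take(100).ToListAsync();
        return Ok(rows.Count);
    }

    [HttpGet]
    public async Task<IActionResult> TestDbQueryAndDto()
    {
        var dtos = await _db.Items.AsNoTracking()
            .Select(i => new { i.Id, i.Name }) // project to a lightweight DTO
            .Take(100)
            .ToListAsync();
        return Ok(dtos);
    }
}
```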
Round 1

Environment: Windows 10 Pro, SQL Server 2019 Web Edition, Intel i7, 32 GB RAM

Actions: TestNormalResponse, TestBgProcess, TestDbQuery, TestDbQueryAndDto
Memory usage caps out at around 380 MB with all endpoints being blasted. I attached dotMemory and watched the GC kick in pretty regularly.
Locust note: in debug mode on my local dev machine the app was only handling ~30 requests per second (RPS).

Environment: Windows Server 2019, SQL Server 2019 Web Edition, Intel Xeon Gold 6226R x2, 64 GB RAM

Action: TestNormalResponse
Memory quickly spiked and capped out at around ~1,210 MB; periodically the GC would kick in and bring it back down to ~1,175 MB. Ran for about 5 minutes. I stopped the requests and waited about 10 minutes; memory never went back down below 1,170 MB.
Locust note: the server was able to handle about ~2,300 RPS.

Action: TestBgProcess
Memory quickly spiked and capped out at around ~1,200 MB; periodically the GC would kick in and bring it back down to ~1,180 MB. Ran for about 5 minutes. I stopped the requests and waited about 10 minutes; memory never went back down below 1,160 MB.
Locust note: the server was handling about ~2,200 RPS.

Action: TestDbQuery (here is where things got interesting)
Memory quickly spiked to 3,100 MB and then continuously went up 70-100 MB per second (!). Once the app got to about 7,500 MB it continued to grow 5-25 MB per second (with random spikes in between) and would not let up. After about 2 minutes it grew to 16,000 MB and I eventually had to stop because I didn't want to overload the server. I waited 30 minutes hoping the GC would kick in and bring it way back down, but it never did. I took a memory dump to see what was going on. I eventually called System.GC.Collect() via code (GC.GetTotalMemory(false) before: 5,308,489,584; GC.GetTotalMemory(true) after: 3,764,229,944). No effect; no movement in memory after 10 minutes.
Locust note: the server was able to handle about ~1,500 RPS, a pretty significant drop once DB / EF Core usage was introduced, which was expected.

Action: TestDbQueryAndDto
Didn't even bother testing, as the DB endpoint was enough to cause issues.

Round 2

Wanted to see what would happen if I switched to hostingModel="OutOfProcess".

Environment: Windows Server 2019, SQL Server 2019 Web Edition, Intel Xeon Gold 6226R x2, 64 GB RAM

Action: TestDbQuery
The process now showed up as dotnet.exe (expected). Memory quickly spiked to 4,100 MB and then continuously went up 50-100 MB per second. Once the app got to about 6,000 MB it continued to grow 5-50 MB per second (with random spikes in between) and would not let up. It did go up noticeably slower than the InProcess version, but after about 4 minutes it grew to 15,200 MB and I eventually had to stop because I didn't want to overload the server. I waited 30 minutes hoping the GC would kick in and bring it way back down, but it never did.
Locust note: the server was able to handle about ~1,400 RPS (slightly less than the InProcess version).

Round 3

Wanted to see what would happen if I turned ServerGarbageCollection to false (Workstation GC; see the project-file setting sketched after this comment).

Environment: Windows Server 2019, SQL Server 2019 Web Edition, Intel Xeon Gold 6226R x2, 64 GB RAM

Action: TestDbQuery
Memory went to about 700 MB after 30 seconds (significantly less). Once the app got to about 1,200 MB it continued to grow 2-10 MB per second (with random spikes in between). I let the test run for a solid 6 minutes at full blast and it "only" got to 5,000 MB. That said, it was still growing (although slowly), so had I let it run longer it probably would have continued to go up.
After I stopped the requests I let it sit to see if the GC would recover the memory, but it never did.
Locust note: the server was able to handle about ~470 RPS (a significant drop, but to be expected since Server GC allows for higher throughput).

Thoughts:
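(Back to Round 3 for a moment: Workstation GC is typically toggled with this standard project-file setting, or the equivalent "System.GC.Server": false in runtimeconfig.json; shown here for completeness.)

```xml
<!-- in the .csproj: opt out of Server GC (Workstation GC is then used) -->
<PropertyGroup>
  <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>
```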
For what it's worth, I did apply both DOTNET_GCHighMemPercent (65%) and DOTNET_GCHeapHardLimitPercent (68%) while performing another blast test. Once the box got to about 60% of physical memory I noticed the server was processing significantly fewer requests per second (it went from 2,500 RPS to about 400), which I could only assume was the GC kicking in (good). It did slow the growth, but I let it go for an extra 10 minutes and it still went through both of those limits and got up to 75% memory before I stopped.
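One caveat worth flagging with those knobs (a general .NET behavior, not something verified in this thread): when GC settings are supplied as DOTNET_/COMPlus_ environment variables, the runtime parses the values as hexadecimal, so a literal 65 is read as 0x65 = 101 percent, which would make the limits effectively never fire. The decimal intents would be written as:

```text
:: hex values: 0x41 = 65 decimal, 0x44 = 68 decimal
set DOTNET_GCHighMemPercent=41
set DOTNET_GCHeapHardLimitPercent=44
```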
If you have the time, I'd recommend reading through https://github.com/Maoni0/mem-doc/blob/master/doc/.NETMemoryPerformanceAnalysis.md because it explains how to analyze managed memory and allocations, and might explain the behavior you're seeing.
Thank you for contacting us. Due to a lack of activity on this discussion issue, we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework that hasn't been addressed yet, please file a new issue. This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after that, please create a new issue!
This was resolved offline.
@davidfowl We are having a similar issue; in our case it's a .NET 5 application. Could you possibly post what the solution was?
Could someone please point to the resolution of this? We are running into a similar memory leak issue, especially when using reflection, on a .NET 5 application. It may not be related, but it would be good to know how you resolved your issue @davidfowl. Thanks much; your help is appreciated.
Describe the bug
We're noticing a memory issue running a typical ASP.NET Core 6 website behind IIS on Windows Server. After a few days the worker process (w3wp.exe) memory consumption grows from 2 gigs up to 25 gigs. Performing an IIS stop/start "fixes" the issue in the interim. I believe the issue is related to a FileSystemWatcher that is not freeing up memory. We've had the issue since .NET 5 and noticed that someone else had a very similar issue (but not necessarily the same), found here #31125 and here #31219. It was supposedly fixed in .NET 6, which is why we upgraded, but we are still having the issue.
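(For context, the workaround discussed in those issues amounted to clearing the default configuration sources, or re-adding them with reloadOnChange: false, so the default builder's file watchers never start. A sketch of that shape, illustrative rather than our actual code:)

```csharp
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;

public static class ProgramSketch
{
    // Illustrative only: rebuild the default configuration without
    // reload-on-change file watchers, per the workaround in #31125/#31219.
    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureAppConfiguration((ctx, config) =>
            {
                config.Sources.Clear();
                config.AddJsonFile("appsettings.json",
                    optional: false, reloadOnChange: false);
                config.AddJsonFile($"appsettings.{ctx.HostingEnvironment.EnvironmentName}.json",
                    optional: true, reloadOnChange: false);
                config.AddEnvironmentVariables();
            })
            .ConfigureWebHostDefaults(web => web.UseStartup<Startup>());
}
```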
Here are some screenshots of dotMemory on the memory dump taken when the production server got to 26 gigs.
Top level snapshot.
High level inspection page.
Drill down to the Byte[] array section (Similar Retention Section).
Drill down to the Byte[].
Drill down to the OverlappedData section (Instances).
Drill down to an individual OverlappedData.
To Reproduce
I have yet to be able to reproduce this on any of our team's development boxes. It only occurs in production, which is making it hard to pinpoint.
Exceptions (if any)
None
Further technical details
dotnet --info
Including screenshots of the IIS config so we are on the same page, just in case there is something misconfigured that we are unaware of.
Discussion / Side Notes
CreateDefaultBuilder (https://github.com/dotnet/aspnetcore/blob/main/src/DefaultBuilder/src/WebHost.cs): similar to issues Memory leak when restarting host on aspnet core 5 #31125 and IHost Memory leak with CreateDefaultBuilder #31219, where they cleared the configuration sources and everything was fine.
Code Snippets
Web.config
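(A representative ASP.NET Core module web.config of the shape involved, with the app DLL name as a placeholder; Round 2 above flipped hostingModel to "OutOfProcess":)

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <handlers>
      <add name="aspNetCore" path="*" verb="*"
           modules="AspNetCoreModuleV2" resourceType="Unspecified" />
    </handlers>
    <aspNetCore processPath="dotnet" arguments=".\MyApp.dll"
                hostingModel="InProcess" stdoutLogEnabled="false" />
  </system.webServer>
</configuration>
```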
Program.cs
Startup.cs (only including this so you can see how we utilize a background hosted service)
MySingletonThatAddsToBackground.cs
ScopedWorker.cs
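(A sketch of the shape those two file names suggest: a singleton queue drained by a hosted service. All types are hypothetical, matching the controller sketch earlier:)

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Hypothetical sketch: a singleton that queues work for a hosted service.
public class MySingletonThatAddsToBackground
{
    private readonly Channel<Func<IServiceProvider, CancellationToken, Task>> _queue =
        Channel.CreateUnbounded<Func<IServiceProvider, CancellationToken, Task>>();

    public void Enqueue(Func<IServiceProvider, CancellationToken, Task> work) =>
        _queue.Writer.TryWrite(work);

    public IAsyncEnumerable<Func<IServiceProvider, CancellationToken, Task>> DequeueAllAsync(
        CancellationToken ct) => _queue.Reader.ReadAllAsync(ct);
}

// Hypothetical sketch: drains the queue, one DI scope per work item so
// scoped services (e.g. a DbContext) are disposed after each item.
public class ScopedWorker : BackgroundService
{
    private readonly MySingletonThatAddsToBackground _queue;
    private readonly IServiceScopeFactory _scopeFactory;

    public ScopedWorker(MySingletonThatAddsToBackground queue, IServiceScopeFactory scopeFactory)
    {
        _queue = queue;
        _scopeFactory = scopeFactory;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var work in _queue.DequeueAllAsync(stoppingToken))
        {
            using var scope = _scopeFactory.CreateScope();
            await work(scope.ServiceProvider, stoppingToken);
        }
    }
}
```

In Startup.ConfigureServices this pair would be registered with services.AddSingleton&lt;MySingletonThatAddsToBackground&gt;() and services.AddHostedService&lt;ScopedWorker&gt;().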
Questions