
implementing hard limit for GC heap #22180

Merged
merged 1 commit into from Jan 29, 2019

Conversation

@Maoni0 (Member) commented Jan 24, 2019

To support container scenarios, two HardLimit configs are added -

GCHeapHardLimit - specifies a hard limit for the GC heap
GCHeapHardLimitPercent - specifies a percentage of the physical memory this process is allowed to use

If both are specified, GCHeapHardLimit is checked first; GCHeapHardLimitPercent is only
checked when GCHeapHardLimit is not specified.

If neither is specified but the process is running inside a container with a memory
limit specified, we will take this as the hard limit:

max (20mb, 75% of the memory limit on the container)
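A minimal sketch of that selection order (the config names come from this PR; the function and parameter names below are hypothetical illustrations, not actual coreclr APIs):

#include <cstdint>
#include <algorithm>

const uint64_t MB = 1024 * 1024;

// Hypothetical illustration of the selection order described above;
// not the actual coreclr implementation.
uint64_t DetermineHardLimit(uint64_t configHardLimit,        // GCHeapHardLimit, 0 if unset
                            uint32_t configHardLimitPercent, // GCHeapHardLimitPercent, 0 if unset
                            uint64_t physicalMemory,         // physical memory available to the process
                            uint64_t containerLimit)         // container memory limit, 0 if none
{
    // GCHeapHardLimit is checked first.
    if (configHardLimit != 0)
        return configHardLimit;

    // Only when it's not specified do we check GCHeapHardLimitPercent.
    if (configHardLimitPercent != 0)
        return physicalMemory * configHardLimitPercent / 100;

    // Neither specified, but running in a container with a memory limit:
    // default to max(20MB, 75% of the container limit).
    if (containerLimit != 0)
        return std::max(20 * MB, containerLimit * 3 / 4);

    return 0; // no hard limit
}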

If one of the HardLimit configs is specified and the process is running inside a container
with a memory limit, the GC heap usage will not exceed the HardLimit. The total available
memory is still the memory limit on the container, so when we calculate the memory load it
is based on the container memory limit.

An example:

the process is running inside a container with a 200mb limit
the user also specified GCHeapHardLimit as 100mb.

if 50mb out of the 100mb is used for GC, and 100mb is used for other things, the memory load
is (50 + 100)/200 = 75%.
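That arithmetic, spelled out (an illustrative snippet; the variable names are mine, not the GC's):

#include <cstdint>
#include <cstdio>

int main()
{
    const uint64_t MB = 1024 * 1024;

    uint64_t containerLimit = 200 * MB; // container memory limit
    uint64_t gcCommitted    = 50 * MB;  // GC heap usage, out of a 100mb GCHeapHardLimit
    uint64_t otherCommitted = 100 * MB; // everything else in the process

    // Memory load is computed against the container limit, not the hard limit:
    // (50 + 100) / 200 = 75%
    unsigned memoryLoad = (unsigned)((gcCommitted + otherCommitted) * 100 / containerLimit);
    printf("memory load: %u%%\n", memoryLoad); // prints: memory load: 75%
    return 0;
}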

Some notes on these configs -

  • The limit is the commit size.

  • This is only supported on 64-bit.

  • For Server GC the minimum reserved segment size is 16mb per heap. This is to avoid the
    scenario where the hard limit is small but the process can use many procs, which would
    leave us with tiny segments that don't make sense (see the sketch after these notes).
    We then keep track of the committed size on the segments so the total does not exceed
    the hard limit.
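To make that Server GC note concrete, here is a rough sketch of a 16mb-per-heap floor on the reserved segment size (the 16mb minimum and the commit accounting are from this PR; the function and names are hypothetical):

#include <cstdint>
#include <algorithm>

const uint64_t MB = 1024 * 1024;
const uint64_t MIN_SEGMENT_RESERVE = 16 * MB; // minimum reserved segment size per heap

// Hypothetical illustration: with a small hard limit and many procs, naive
// division would yield tiny per-heap segments, so a 16mb floor is applied.
uint64_t ReservedSegmentSizePerHeap(uint64_t hardLimit, unsigned numHeaps)
{
    return std::max(MIN_SEGMENT_RESERVE, hardLimit / numHeaps);
}

// Note the reserve may exceed the hard limit; the limit applies to commit,
// and the committed size on the segments is tracked so its total never
// exceeds the hard limit.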

@Maoni0 changed the title from "[WIP] implementing hard limit for GC heap" to "implementing hard limit for GC heap" on Jan 25, 2019
@Maoni0 (Member, Author) commented Jan 25, 2019

FYI @andy-ms @jkotas @richlander @davidwrighton @janvorli

@andy-ms and I are still working on some further perf tuning on Linux, e.g., making the default number of heaps better for Server GC based on the limit.

@@ -1877,7 +1877,7 @@ size_t GetCacheSizePerLogicalCpu(BOOL bTrueSize)
}
}

#if defined(_TARGET_AMD64_) || defined (_TARGET_X86_)
Member

Is there a reason why it is ok to delete this fallback for AMD64, but we still need it for x86?

Member Author

the only reason is that I have not tested on x86, so I am reluctant to remove it.

@@ -1877,7 +1877,7 @@ size_t GetCacheSizePerLogicalCpu(BOOL bTrueSize)
}
}

-#if defined(_TARGET_AMD64_) || defined (_TARGET_X86_)
+#if defined (_TARGET_X86_)
DefaultCatchFilterParam param;
Member

If you are removing _AMD64_ here, you can also remove #ifdef _WIN64 inside this block.

Member Author

right... I haven't because this is for preview, and there is a chance we might still want to do this after our perf testing. If the perf testing shows there's no need for it, I'll do some code cleanup then.

@PSanetra
@Maoni0 why was 75% of the container memory limit chosen as the default limit? Is it expected that there is some other process running inside the same container?

@Maoni0 (Member, Author) commented Jun 17, 2020
yes, we do expect that normally you have some native memory usage or some other processes running.

@davidwrighton (Member)
Be aware that this control is for the managed heap. Even in a managed process there are other consumers of heap memory such as the runtime itself, various native allocations performed by the OS to support file and socket i/o, etc. 75% is a fairly conservative number, and especially with larger containers, a higher percentage is probably achievable safely.

@richlander (Member)
Previously, the value (through lack of having an algorithm here) was 100%. People reported OOMs. We picked a conservative value and we no longer see those reports. It's likely that many apps could tolerate a higher value, but we thought that this value would prevent the vast majority of apps from seeing OOMs, wouldn't require much validation on our part, and enables people to configure higher or lower values depending on their needs. I'm still feeling good about this set of choices.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…t/coreclr#22180)


Commit migrated from dotnet/coreclr@ed52a00
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
This is based on a perf test with 100% survival in a container, before
and after dotnet/coreclr#22180. GC pause times were greater after that commit.
Debugging showed that the reason was that, after that change, we were always doing
compacting GCs, and objects were staying in generation 1 and not making it
to generation 2. The reason was that in the "after" build,
`should_compact_loh()` was always returning true if heap_hard_limit was
set; currently if we do an LOH compaction, we compact all other
generations too. As the comment indicates, we should decide that
automatically, not just set it to true all the time.


Commit migrated from dotnet/coreclr@cc14e6c
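A hedged sketch of the behavior that commit message describes, before and after the fix (`should_compact_loh` and `heap_hard_limit` are named in the message; the heuristic below is purely illustrative, not the actual gc.cpp logic):

#include <cstddef>

// Before the fix (illustrative): LOH compaction was forced whenever a hard
// limit was set, which forced compacting GCs and kept objects from aging
// into generation 2.
bool should_compact_loh_before(size_t heap_hard_limit)
{
    return heap_hard_limit != 0; // always true under a hard limit
}

// After the fix (illustrative): decide automatically, e.g., from LOH
// fragmentation, instead of compacting unconditionally under a hard limit.
bool should_compact_loh_after(size_t loh_fragmentation, size_t loh_size)
{
    // Hypothetical heuristic: compact only when fragmentation is a large
    // fraction of the LOH.
    return loh_size != 0 && loh_fragmentation * 4 >= loh_size;
}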