
implementing hard limit for GC heap #22180

Merged
merged 1 commit into from Jan 29, 2019

Conversation

@Maoni0 (Member) commented Jan 24, 2019

To support container scenarios, two HardLimit configs are added -

GCHeapHardLimit - specifies a hard limit for the GC heap
GCHeapHardLimitPercent - specifies a percentage of the physical memory this process is allowed to use

If both are specified, GCHeapHardLimit is checked first; GCHeapHardLimitPercent is only
checked when GCHeapHardLimit is not specified.

If neither is specified but the process is running inside a container with a memory
limit specified, we will take this as the hard limit:

max (20mb, 75% of the memory limit on the container)
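A minimal sketch of that selection order (the config names come from this PR; the function and parameter names below are hypothetical illustrations, not actual coreclr APIs):

#include <cstdint>
#include <algorithm>

const uint64_t MB = 1024 * 1024;

// Hypothetical illustration of the selection order described above;
// not the actual coreclr implementation.
uint64_t DetermineHardLimit(uint64_t configHardLimit,        // GCHeapHardLimit, 0 if unset
                            uint32_t configHardLimitPercent, // GCHeapHardLimitPercent, 0 if unset
                            uint64_t physicalMemory,         // physical memory available to the process
                            uint64_t containerLimit)         // container memory limit, 0 if none
{
    // GCHeapHardLimit is checked first.
    if (configHardLimit != 0)
        return configHardLimit;

    // Only when it's not specified do we check GCHeapHardLimitPercent.
    if (configHardLimitPercent != 0)
        return physicalMemory * configHardLimitPercent / 100;

    // Neither specified, but running in a container with a memory limit:
    // default to max(20MB, 75% of the container limit).
    if (containerLimit != 0)
        return std::max(20 * MB, containerLimit * 3 / 4);

    return 0; // no hard limit
}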

If one of the HardLimit configs is specified and the process is running inside a container
with a memory limit, the GC heap usage will not exceed the HardLimit. The total available
memory is still the memory limit on the container, so when we calculate the memory load it
is based on the container memory limit.

An example:

the process is running inside a container with a 200mb limit
the user also specified GCHeapHardLimit as 100mb.

if 50mb out of the 100mb is used for GC, and 100mb is used for other things, the memory load
is (50 + 100)/200 = 75%.
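That arithmetic, spelled out (an illustrative snippet; the variable names are mine, not the GC's):

#include <cstdint>
#include <cstdio>

int main()
{
    const uint64_t MB = 1024 * 1024;

    uint64_t containerLimit = 200 * MB; // container memory limit
    uint64_t gcCommitted    = 50 * MB;  // GC heap usage, out of a 100mb GCHeapHardLimit
    uint64_t otherCommitted = 100 * MB; // everything else in the process

    // Memory load is computed against the container limit, not the hard limit:
    // (50 + 100) / 200 = 75%
    unsigned memoryLoad = (unsigned)((gcCommitted + otherCommitted) * 100 / containerLimit);
    printf("memory load: %u%%\n", memoryLoad); // prints: memory load: 75%
    return 0;
}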

Some notes on these configs -

  • The limit is the commit size.

  • This is only supported on 64-bit.

  • For Server GC the minimum reserved segment size is 16mb per heap. This is to avoid the
    scenario where the hard limit is small but the process can use many procs, which would
    leave us with tiny segments that don't make sense (see the sketch after these notes).
    We then keep track of the committed size on the segments so the total does not exceed
    the hard limit.
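To make that Server GC note concrete, here is a rough sketch of a 16mb-per-heap floor on the reserved segment size (the 16mb minimum and the commit accounting are from this PR; the function and names are hypothetical):

#include <cstdint>
#include <algorithm>

const uint64_t MB = 1024 * 1024;
const uint64_t MIN_SEGMENT_RESERVE = 16 * MB; // minimum reserved segment size per heap

// Hypothetical illustration: with a small hard limit and many procs, naive
// division would yield tiny per-heap segments, so a 16mb floor is applied.
uint64_t ReservedSegmentSizePerHeap(uint64_t hardLimit, unsigned numHeaps)
{
    return std::max(MIN_SEGMENT_RESERVE, hardLimit / numHeaps);
}

// Note the reserve may exceed the hard limit; the limit applies to commit,
// and the committed size on the segments is tracked so its total never
// exceeds the hard limit.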

@Maoni0 changed the title from "[WIP] implementing hard limit for GC heap" to "implementing hard limit for GC heap" on Jan 25, 2019
@Maoni0 (Member, Author) commented Jan 25, 2019

FYI @andy-ms @jkotas @richlander @davidwrighton @janvorli

@andy-ms and I are still working on some further perf tuning on Linux, e.g., making the default number of heaps better for Server GC based on the limit.

@@ -1877,7 +1877,7 @@ size_t GetCacheSizePerLogicalCpu(BOOL bTrueSize)
}
}

#if defined(_TARGET_AMD64_) || defined (_TARGET_X86_)
Member

Is there a reason why it is ok to delete this fallback for AMD64, but we still need it for x86?

Member Author

the only reason is that I have not tested on x86, so I am reluctant to remove it.

@@ -1877,7 +1877,7 @@ size_t GetCacheSizePerLogicalCpu(BOOL bTrueSize)
}
}

-#if defined(_TARGET_AMD64_) || defined (_TARGET_X86_)
+#if defined (_TARGET_X86_)
DefaultCatchFilterParam param;
Member

If you are removing _AMD64_ here, you can also remove #ifdef _WIN64 inside this block.

Member Author

right... I haven't because this is for preview, and there is a chance we might still want to do this after our perf testing. If the perf testing shows there's no need for it, I'll do some code cleanup then.

@PSanetra
@Maoni0 why was 75% of the container memory limit chosen as the default limit? Is it expected that there is some other process running inside the same container?

@Maoni0 (Member, Author) commented Jun 17, 2020
yes, we do expect that normally you have some native memory usage or some other processes running.

@davidwrighton (Member)
Be aware that this control is for the managed heap. Even in a managed process there are other consumers of heap memory such as the runtime itself, various native allocations performed by the OS to support file and socket i/o, etc. 75% is a fairly conservative number, and especially with larger containers, a higher percentage is probably achievable safely.

@richlander (Member)
Previously, the value (through lack of having an algorithm here) was 100%. People reported OOMs. We picked a conservative value and we no longer see those reports. It's likely that many apps could tolerate a higher value, but we thought that this value would prevent the vast majority of apps from seeing OOMs, wouldn't require much validation on our part, and enables people to configure higher or lower values depending on their needs. I'm still feeling good about this set of choices.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…t/coreclr#22180)


Commit migrated from dotnet/coreclr@ed52a00
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
This is based on a perf test with 100% survival in a container, before
and after dotnet/coreclr#22180. GC pause times were greater after that commit.
Debugging showed that the reason was that, after that change, we were always doing
compacting GCs, and objects were staying in generation 1 and not making it
to generation 2. The reason was that in the "after" build,
`should_compact_loh()` was always returning true if heap_hard_limit was
set; currently if we do an LOH compaction, we compact all other
generations too. As the comment indicates, we should decide that
automatically, not just set it to true all the time.


Commit migrated from dotnet/coreclr@cc14e6c
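A hedged sketch of the behavior that commit message describes, before and after the fix (`should_compact_loh` and `heap_hard_limit` are named in the message; the heuristic below is purely illustrative, not the actual gc.cpp logic):

#include <cstddef>

// Before the fix (illustrative): LOH compaction was forced whenever a hard
// limit was set, which forced compacting GCs and kept objects from aging
// into generation 2.
bool should_compact_loh_before(size_t heap_hard_limit)
{
    return heap_hard_limit != 0; // always true under a hard limit
}

// After the fix (illustrative): decide automatically, e.g., from LOH
// fragmentation, instead of compacting unconditionally under a hard limit.
bool should_compact_loh_after(size_t loh_fragmentation, size_t loh_size)
{
    // Hypothetical heuristic: compact only when fragmentation is a large
    // fraction of the LOH.
    return loh_size != 0 && loh_fragmentation * 4 >= loh_size;
}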