Approximate the heap_hard_limit #38178

Merged

Conversation

cshung
Member

@cshung cshung commented Jun 20, 2020

It appears to me that the field gc_heap::heap_hard_limit is also used as a heuristic input to determine which generation to condemn. If we set it to 1, the heuristic thinks we are about to run out of memory and keeps triggering gen 2 GCs.

This change approximates the heap_hard_limit value by summing the hard limits of the individual object heaps. It is an approximation because it does not account for the memory used by auxiliary structures (such as the card table).
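
A minimal sketch of the approximation described above. This is an illustrative Python model, not the runtime's C++ code; the function name and parameters are hypothetical, and the sizes assume binary units (1G = 1024^3 bytes).

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def approximate_heap_hard_limit(soh_limit: int, loh_limit: int, poh_limit: int) -> int:
    """Approximate the overall hard limit as the sum of the per-heap limits.

    Memory for auxiliary structures (for example the card table) is not
    included, which is why the result is only an approximation.
    """
    return soh_limit + loh_limit + poh_limit


# Hypothetical per-heap limits: 1G for SOH, 2G for LOH, 500M for POH.
total = approximate_heap_hard_limit(1 * GIB, 2 * GIB, 500 * MIB)
```

The point of the sketch is only that the overall value becomes a realistic number of bytes instead of a placeholder such as 1.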

@dotnet/gc

@cshung cshung requested a review from Maoni0 June 20, 2020 00:06
@ghost

ghost commented Jun 20, 2020

Tagging subscribers to this area: @Maoni0
Notify danmosemsft if you want to be subscribed.

Member

@Maoni0 Maoni0 left a comment

:shipit:

@cshung cshung merged commit f997ed7 into dotnet:master Jun 25, 2020
@mjsabby
Contributor

mjsabby commented Jun 26, 2020

Is this a regression? What is the impact of this change for someone using Large Pages + hard limit?

@cshung
Member Author

cshung commented Jun 26, 2020

> Is this a regression? What is the impact of this change for someone using Large Pages + hard limit?

@mjsabby In short: yes, it is, and this change will make Large Pages + hard limit better.

The story started with your discovery that the 3x commit introduced by POH is a blocker for you.

The root cause of the 3x commit is that, when given a single hard limit, we do not know which object heap the application is going to use it for, so we assumed the worst.

For the normal scenario without large pages, that is okay because we just reserved the memory without committing it.

With large pages, the implementation forces us to commit upfront, so we committed 3x the hard limit, and that is not okay.
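
The arithmetic behind the 3x figure can be sketched as follows. This is a toy Python model under the assumptions stated in this comment (three object heaps, each sized for the worst case); the function names are hypothetical and do not correspond to runtime code.

```python
OBJECT_HEAPS = ("SOH", "LOH", "POH")


def worst_case_reservation(hard_limit: int) -> dict:
    """With a single limit, any one heap might end up using all of it,
    so each heap gets address space for the full limit."""
    return {heap: hard_limit for heap in OBJECT_HEAPS}


def committed_upfront(hard_limit: int, large_pages: bool) -> int:
    """Bytes committed at startup in this toy model.

    Without large pages the memory is only reserved, so nothing is
    committed upfront. With large pages the whole reservation is committed.
    """
    reserved = sum(worst_case_reservation(hard_limit).values())
    return reserved if large_pages else 0


limit = 1024 ** 3  # hypothetical single hard limit of 1G
```

In this model `committed_upfront(limit, large_pages=True)` is three times `limit`, while the non-large-pages case commits nothing at startup.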

Therefore I introduced a new way to specify the hard limit per object heap, which eliminates the guesswork the runtime was doing:

When you specify the hard limit for each individual object heap, the runtime commits exactly what is specified.

As an example, if you wish to have 1G in SOH, 2G in LOH, and 500M in POH, you would specify:

COMPLUS_GCHeapHardLimitSOH=1G (in hex)
COMPLUS_GCHeapHardLimitLOH=2G (in hex)
COMPLUS_GCHeapHardLimitPOH=500M (in hex)

The runtime will reserve (or, in the large pages case, commit) 3.5G upfront and distribute it as instructed.
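
For reference, the hex values for the example above can be computed with a short Python helper. This is a sketch under two assumptions that are not stated in this thread: that "G" and "M" mean binary units (1024^3 and 1024^2 bytes), and that the settings take bare hex digits without a `0x` prefix. The helper name is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Per-heap limits from the example: 1G SOH, 2G LOH, 500M POH.
limits = {"SOH": 1 * GIB, "LOH": 2 * GIB, "POH": 500 * MIB}


def env_settings(per_heap_limits: dict) -> dict:
    """Map each object heap to its environment variable and hex-encoded byte count."""
    return {
        f"COMPLUS_GCHeapHardLimit{heap}": format(size, "X")
        for heap, size in per_heap_limits.items()
    }


settings = env_settings(limits)
for name, value in settings.items():
    print(f"{name}={value}")
```

With binary units the three limits add up to 3,745,513,472 bytes, which is the "3.5G" total rounded.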

Just to be clear, if the application happens to allocate more than 1G in SOH, then it will OOM, regardless of whether or not we still have memory in LOH or POH.
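
The per-heap behavior can be illustrated with a toy Python model. The class and its methods are hypothetical and only mimic the rule stated above: each heap is checked against its own limit, and spare room in other heaps does not help.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


class PerHeapBudget:
    """Toy model: each object heap has its own independent hard limit."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.used = {heap: 0 for heap in limits}

    def allocate(self, heap, size):
        # Only the target heap's own limit is consulted.
        if self.used[heap] + size > self.limits[heap]:
            raise MemoryError(f"{heap} hard limit exceeded")
        self.used[heap] += size


budget = PerHeapBudget({"SOH": 1 * GIB, "LOH": 2 * GIB, "POH": 500 * MIB})
budget.allocate("SOH", 1 * GIB)  # fills SOH exactly

try:
    budget.allocate("SOH", 1)    # one more byte in SOH
    soh_overflowed = False
except MemoryError:
    soh_overflowed = True        # fails even though LOH and POH are empty
```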

After that, I worked on testing it. After #37725, the code worked functionally, but it had a performance bug. When individual heap hard limits were provided, the heuristic that determines which generation to condemn was broken. In particular, it always chose to perform a gen 2 background GC.

This was caused by my ignorance. I thought the field heap_hard_limit was used only to check whether or not we exceed the limit. It turns out it is also used to determine the available memory, and thus affects the choices the heuristic makes.

By setting heap_hard_limit to approximately what it should be, this change fixes the heuristic, so it is again free to choose whichever generation to condemn it sees fit.
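
To illustrate why a placeholder limit breaks this kind of heuristic, here is a deliberately simplified Python sketch. It is not the GC's actual logic: the function name, the 0.9 threshold, and the single "memory pressure" check are all assumptions made for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def condemns_gen2(committed_bytes: int, heap_hard_limit: int, threshold: float = 0.9) -> bool:
    """Toy heuristic: condemn gen 2 when usage is close to the hard limit."""
    return committed_bytes >= threshold * heap_hard_limit


committed = 100 * MIB  # hypothetical amount of memory currently in use

# With a placeholder limit of 1, any real usage looks like memory exhaustion.
with_placeholder = condemns_gen2(committed, heap_hard_limit=1)

# With the limit approximated as the sum of the per-heap limits
# (1G + 2G + 500M here), the same usage is nowhere near the limit.
approx_limit = 1 * GIB + 2 * GIB + 500 * MIB
with_approximation = condemns_gen2(committed, heap_hard_limit=approx_limit)
```

In this model the placeholder value makes the check fire on every GC, which matches the "always gen 2" symptom described above, while the approximated limit lets the check behave normally.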

So, to summarize the longer story:

  • The 3x commit issue is solved.
  • My implementation had a couple of bugs, but they are fixed now.

@cshung cshung deleted the public/dev/andrewau/fix-limit-heuristics branch June 26, 2020 17:27
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020