Dynamic heap count #86245
…e a bit of testing.
- park extra threads on gc_idle_thread_event
- update thread count in join
- null free lists and have background GC rebuild them
- have redistribute_regions call fix_allocation_contexts
- distribute free regions as well
- fix up ephemeral_heap_segment and generation_allocation_segment
Add hack to dynamically enable heap verify when we change the heap count.
- move finalization data between heaps
- update free list space per heap when rethreading free lists
- update allocation contexts so they don't reference decommissioned heaps
- allow redistribute_regions to fail
- allow enter_spin_lock to fail when called from try_allocate_more_space
- don't decommission heaps with a taken more space lock
- poison dynamic data and generation table for decommissioned heaps
- Add instrumentation
- Be more careful regarding signed vs. unsigned types to make GCC happy.
…ecking for containing heap to verify_free_lists, adding checking decommissioned heaps to verify_heap.
…ntly cannot handle that
Tagging subscribers to this area: @dotnet/gc
- rename config to GCDynamicAdaptation
- change dynamic heap count dprintf level so we can print just those messages
- aim for a percentage overhead reading between 1 and 5%
- if above 10%, ramp up aggressively; if above 5%, ramp up a step; if below 1% and significant space gains are possible, ramp down a step
- make space cost computation per heap more realistic
- use min gen 0 budget
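The thresholds in this commit message can be read as a small decision function. The following is only a sketch of that reading, not the code in this PR: `HeapCountStep`, `choose_step`, and both parameters are hypothetical names, and what counts as "aggressively" or "significant space gains" is left to the caller because the commit message does not say.

```cpp
// Hypothetical illustration of the overhead thresholds described above.
// None of these names come from the runtime sources.
enum class HeapCountStep
{
    RampUpAggressively,
    RampUpOneStep,
    RampDownOneStep,
    NoChange
};

// overhead_pct: the percentage overhead reading (assumed to be already filtered).
// significant_space_gain_possible: assumed to be computed elsewhere.
inline HeapCountStep choose_step(double overhead_pct,
                                 bool significant_space_gain_possible)
{
    if (overhead_pct > 10.0)
        return HeapCountStep::RampUpAggressively;
    if (overhead_pct > 5.0)
        return HeapCountStep::RampUpOneStep;
    if (overhead_pct < 1.0 && significant_space_gain_possible)
        return HeapCountStep::RampDownOneStep;
    // Between 1% and 5% is the target band, so leave the heap count alone.
    return HeapCountStep::NoChange;
}
```

In this sketch a reading below 1% without significant space gains also results in no change, which matches the wording of the commit message.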
…ze to 2.5 MB if COMPLUS_GCDynamicAdaption is enabled.
…mic_data - we had moved regions to other heaps, so total_gen_size became 0. We had adjusted generation_free_list_space for this when rethreading the free lists, but not generation_free_obj_space. So dd_current_size became a large positive number as a result.
@PeterSolMS and I talked about this, and we do want to get this in for Preview 5 to get more testing. We've already gone through the changes offline.
… for the assert, which is likely some faulty bookkeeping.
Regarding the assert failures in
Is this something which needs to be fixed before merging?
No, this is a lurking issue that has nothing to do with dynamic heap count. It happened without the feature active, and all the cases I looked at were on WKS GC, and were benign in the sense that there was no impact on the budget computation or other correctness aspects.
Looks like the failures are known per Build-Analysis. Should be ok to merge.
Agree. Note that I have a PR out for the assert failure.
This is an initial implementation for changing the GC heap count dynamically in response to changing load conditions.
Using more heaps increases memory footprint, but in most cases also improves throughput, because more GC work is parallelized and lock contention on the allocation code path is reduced by spreading the load.
The algorithm makes this tradeoff explicit by comparing the estimated percentage increase in throughput with the estimated percentage increase in memory footprint. It increases the heap count if throughput is estimated to increase by at least one percentage point more than the memory footprint, and decreases the heap count if the estimated reduction in memory footprint is at least one percentage point more than the decrease in throughput.
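As a rough illustration of the rule just described, the comparison could look like the sketch below. The names (`HeapCountDecision`, `decide_heap_count_change`, and the four parameters) are hypothetical and do not come from the GC sources. Checking the increase condition before the decrease condition is also an assumption, since the description does not say which wins if both hold.

```cpp
// Hypothetical sketch of the one-percentage-point rule; not the runtime's code.
enum class HeapCountDecision { Increase, Decrease, Keep };

// All inputs are estimated percentage changes, given as non-negative magnitudes.
inline HeapCountDecision decide_heap_count_change(
    double throughput_gain_pct_if_more,    // throughput increase with more heaps
    double footprint_cost_pct_if_more,     // footprint increase with more heaps
    double footprint_saving_pct_if_fewer,  // footprint reduction with fewer heaps
    double throughput_loss_pct_if_fewer)   // throughput decrease with fewer heaps
{
    const double margin_pct = 1.0;  // "at least one percentage point more"

    // Assumption: the increase check takes priority over the decrease check.
    if (throughput_gain_pct_if_more - footprint_cost_pct_if_more >= margin_pct)
        return HeapCountDecision::Increase;

    if (footprint_saving_pct_if_fewer - throughput_loss_pct_if_fewer >= margin_pct)
        return HeapCountDecision::Decrease;

    return HeapCountDecision::Keep;
}
```

For example, an estimated 5% throughput gain against a 2% footprint increase would raise the heap count, while a 3% gain against a 2.5% increase would not clear the one-point margin.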
Because the input data (GC pause, etc.) are quite noisy, we apply a median-of-3 filter before the data is used to make decisions. Preliminary data suggests this is effective, but probably not enough.
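For readers unfamiliar with the technique, a median-of-3 filter replaces each sample with the median of the three most recent samples, so a single outlier is suppressed. The sketch below is a generic version under stated assumptions: `median_of_3` and `MedianOf3Filter` are hypothetical names, and passing samples through unfiltered until three are available is a choice made here, not something the PR specifies.

```cpp
#include <algorithm>
#include <cstddef>

// Median of three values using only min/max comparisons.
inline double median_of_3(double a, double b, double c)
{
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Hypothetical helper that remembers the last three samples.
class MedianOf3Filter
{
public:
    // Records a sample and returns the filtered value.
    double add_sample(double sample)
    {
        samples_[next_] = sample;
        next_ = (next_ + 1) % 3;
        if (count_ < 3)
            ++count_;
        // Assumption: with fewer than three samples, return the raw sample.
        if (count_ < 3)
            return sample;
        return median_of_3(samples_[0], samples_[1], samples_[2]);
    }

private:
    double samples_[3] = {0.0, 0.0, 0.0};
    std::size_t next_ = 0;   // slot that the next sample overwrites
    std::size_t count_ = 0;  // number of samples seen, capped at 3
};
```

A filter of this width removes an isolated spike but not two bad samples in a row, which is consistent with the remark that it is probably not enough on its own.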