Member

@mangod9 mangod9 commented Jul 16, 2025

Port change for better free-list management. This is a backport of #109431 and also applies a fix that was missing from the original .NET 8 port.

Customer Impact

  • Customer reported
  • Found internally

Memory utilization regression introduced as part of Regions enablement, reported by a customer in #103582.
The fix improves distribute_free_regions so that aged regions are added to the decommit list and ultimately freed.
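
As a rough illustration of the idea (not the actual gc.cpp code), the sketch below ages free regions once per GC and moves those that have been idle long enough onto a decommit list. The names FreeRegion, kAgeToDecommit, and age_free_regions, and the threshold value, are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a free region; the real GC tracks far more state.
struct FreeRegion {
    std::size_t size_bytes;
    int age_in_free;  // number of GCs the region has sat unused on the free list
};

// Illustrative threshold only; not taken from the PR.
constexpr int kAgeToDecommit = 20;

// Ages every region on free_list by one GC and moves regions that reach the
// threshold onto decommit_list. Returns the number of bytes moved.
std::size_t age_free_regions(std::vector<FreeRegion>& free_list,
                             std::vector<FreeRegion>& decommit_list) {
    const std::size_t total_before = free_list.size() + decommit_list.size();
    std::size_t moved_bytes = 0;
    std::vector<FreeRegion> kept;
    kept.reserve(free_list.size());
    for (FreeRegion region : free_list) {
        region.age_in_free += 1;
        if (region.age_in_free >= kAgeToDecommit) {
            moved_bytes += region.size_bytes;
            decommit_list.push_back(region);
        } else {
            kept.push_back(region);
        }
    }
    free_list.swap(kept);
    // No region is lost or duplicated by the move.
    assert(free_list.size() + decommit_list.size() == total_before);
    return moved_bytes;
}
```

Regions that are reused before reaching the threshold never reach the decommit list, so only memory that has stayed idle across several GCs is returned to the OS.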

Regression

  • Yes
  • No

Yes, in memory utilization. Some customers running in dense containers would occasionally observe an OOM.

Testing

Verified with internal performance testing. We provided a private build to the customer, who confirmed that their memory utilization improved with the fix.

Risk

Medium. This backport previously caused a regression that required a revert in 8.19. We have since done more validation to ensure the fix is working well.

@mangod9 mangod9 requested review from Maoni0 and mrsharm July 16, 2025 22:53
@Copilot Copilot AI review requested due to automatic review settings July 16, 2025 22:53
Contributor

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR reapplies the "distribute free regions" functionality to the garbage collection system, along with a small missing change from the original backport. The changes primarily involve refactoring and restructuring the region distribution logic in the CoreCLR garbage collector.

Key changes include:

  • Restructuring the free region distribution algorithm with improved organization and separation of concerns
  • Adding new region aging thresholds for different region types (basic, large, huge)
  • Introducing new helper methods for region management and memory load calculations
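
A minimal sketch of what per-kind aging thresholds could look like. The enum, function names, and numeric values here are placeholders for illustration, not the constants defined in gcpriv.h.

```cpp
// Hypothetical region kinds mirroring the basic/large/huge split named above.
enum class RegionKind { Basic, Large, Huge };

// Placeholder thresholds: the number of GCs a free region of each kind may
// sit unused before it becomes a decommit candidate.
constexpr int age_threshold(RegionKind kind) {
    switch (kind) {
        case RegionKind::Basic: return 20;
        case RegionKind::Large: return 5;
        case RegionKind::Huge:  return 2;
    }
    return 0;  // unreachable for valid enum values
}

// True when a free region of the given kind has aged past its threshold.
constexpr bool should_decommit(RegionKind kind, int age_in_free) {
    return age_in_free >= age_threshold(kind);
}
```

Giving larger region kinds a lower threshold is one plausible design choice, since each idle large or huge region holds more committed memory than a basic one.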

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/gc/gcpriv.h | Updates enum definitions, adds new method declarations, and defines region aging constants |
| src/coreclr/gc/gc.cpp | Major refactoring of distribute_free_regions logic with new helper methods and improved organization |
Comments suppressed due to low confidence (1)

// just to reduce the number of #ifdefs in the code below
const int i = 0;
#endif //MULTIPLE_HEAPS
ptrdiff_t budget_gen = max (hp->estimate_gen_growth (gen), (ptrdiff_t)0);

Copilot AI Jul 16, 2025

[nitpick] The explicit cast to (ptrdiff_t)0 is unnecessary since max() will handle the type promotion automatically. Consider using just 0 for better readability.

Suggested change
- ptrdiff_t budget_gen = max (hp->estimate_gen_growth (gen), (ptrdiff_t)0);
+ ptrdiff_t budget_gen = max (hp->estimate_gen_growth (gen), 0);
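
Whether the cast can be dropped depends on how max is defined, which this conversation does not show. With a macro, the usual arithmetic conversions apply; with a function template that takes both arguments as the same type T (as std::max does), mixing ptrdiff_t and int fails template argument deduction on platforms where the two types differ. The sketch below uses std::max and a hypothetical estimate_growth helper standing in for hp->estimate_gen_growth (gen).

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for hp->estimate_gen_growth (gen); may be negative.
constexpr std::ptrdiff_t estimate_growth(std::ptrdiff_t projected,
                                         std::ptrdiff_t available) {
    return projected - available;
}

// Clamps the estimated growth at zero.
constexpr std::ptrdiff_t budget(std::ptrdiff_t projected,
                                std::ptrdiff_t available) {
    // std::max(estimate_growth(...), 0) would be ill-formed where ptrdiff_t
    // is not int, because T would deduce to two different types. The cast
    // makes both arguments ptrdiff_t.
    return std::max(estimate_growth(projected, available),
                    static_cast<std::ptrdiff_t>(0));
}
```

Under that assumption the cast in the original line is load-bearing rather than redundant.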

{
gc_last_ephemeral_decommit_time = dd_time_clock (dd0);
- size_t decommit_step_milliseconds = min (ephemeral_elapsed, (10*1000));
+ size_t decommit_step_milliseconds = min (ephemeral_elapsed, (size_t)(10 * 1000));

Copilot AI Jul 16, 2025

The magic number '10 * 1000' should be defined as a named constant to improve code maintainability and clarity about what this timeout represents.

Suggested change
- size_t decommit_step_milliseconds = min (ephemeral_elapsed, (size_t)(10 * 1000));
+ size_t decommit_step_milliseconds = min (ephemeral_elapsed, (size_t)(MAX_DECOMMIT_TIME_MILLISECONDS));
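
One way the suggestion could be realized is sketched below. MAX_DECOMMIT_TIME_MILLISECONDS is the name proposed in the review comment; its definition here, and the decommit_step_ms helper, are assumptions for illustration. Declaring the constant as size_t also removes the need for a cast at the call site.

```cpp
#include <algorithm>
#include <cstddef>

// Name taken from the review suggestion; this definition is an assumption.
// It caps a single decommit step at 10 seconds, expressed in milliseconds.
constexpr std::size_t MAX_DECOMMIT_TIME_MILLISECONDS = 10 * 1000;

// Hypothetical helper: limits the elapsed time used for one decommit step.
constexpr std::size_t decommit_step_ms(std::size_t ephemeral_elapsed_ms) {
    // Both arguments are size_t, so std::min deduces a single type.
    return std::min(ephemeral_elapsed_ms, MAX_DECOMMIT_TIME_MILLISECONDS);
}
```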

Member

@jeffschwMSFT jeffschwMSFT left a comment

lgtm. please get a code review. we will take for consideration in 8.0.x

@jeffschwMSFT jeffschwMSFT added the Servicing-consider (Issue for next servicing release review) label Aug 11, 2025
@jeffschwMSFT jeffschwMSFT added this to the 8.0.x milestone Aug 11, 2025
Member

@mrsharm mrsharm left a comment

LGTM based on the testing and caveats discussed.

@jeffschwMSFT jeffschwMSFT changed the title from "Reapply distribute free regions" to "[release/8.0-staging] Reapply distribute free regions" Aug 12, 2025
@leecow leecow added the Servicing-approved (Approved for servicing release) label and removed the Servicing-consider (Issue for next servicing release review) label Aug 12, 2025
@leecow leecow modified the milestones: 8.0.x, 8.0.21 Aug 12, 2025
@jeffschwMSFT jeffschwMSFT merged commit 5a23850 into release/8.0-staging Sep 5, 2025
120 of 124 checks passed
@jkotas jkotas deleted the reapply_distribute_free_regions branch September 8, 2025 17:34
@github-actions github-actions bot locked and limited conversation to collaborators Oct 9, 2025

Labels

area-GC-coreclr, Servicing-approved (Approved for servicing release)

4 participants