[release/8.0-staging] Guard against -1 Returned from sysconf for the Cache Sizes Causing Large Gen0 Sizes and Budgets for Certain Linux Distributions. #100575
Tagging subscribers to this area: @dotnet/gc
LGTM, thank you!
FWIW, @Maoni0 noticed that the error handling was not right during code review two years ago: #71029 (comment). Unfortunately, this feedback was not followed up on.
Approved. We will take this for consideration in 8.0.x.
The build failures are known issues: #100035. This PR can be merged, as the errors aren't related.
/ba-g all failures were classified as Known but the check did not flip to green.
Backport of #100502 to release/8.0-staging
/cc @mrsharm
Customer Impact
All runtimes using Server GC on Linux distributions where sysconf returns -1 for LEVEL4_CACHE_SIZE are affected by the issue this PR addresses. On those distributions, such as Debian 12 and Ubuntu 22.04, the gen0 budget is computed as a function of total physical memory rather than the L3 cache size, as it was before. Because physical memory is much larger than the L3 cache, this produces extremely large gen0 budgets, which in turn cause larger heap sizes and higher memory utilization.
With the fix, an internal customer reduced their overall memory utilization from 1 GB to 300 MB and was satisfied with the result.
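The failure mode and the kind of guard this PR adds can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the runtime's actual code: the helper names `sanitize_cache_size`, `largest_cache_size`, and `query_largest_cache_size` are hypothetical, and the `_SC_LEVEL*_CACHE_SIZE` constants are glibc extensions, so each query is wrapped in `#ifdef`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: sysconf returns a long, and -1 means the value is
 * unavailable. Casting -1 straight to size_t would yield SIZE_MAX, so any
 * non-positive result is treated as "no cache of this level". */
size_t sanitize_cache_size(long raw)
{
    return raw > 0 ? (size_t)raw : 0;
}

/* Hypothetical helper: largest valid cache size among raw sysconf results. */
size_t largest_cache_size(const long *raw, size_t count)
{
    assert(raw != NULL || count == 0);

    size_t largest = 0;
    for (size_t i = 0; i < count; i++) {
        size_t size = sanitize_cache_size(raw[i]);
        if (size > largest)
            largest = size;
    }
    return largest;
}

/* Queries the cache levels that the C library exposes. Returns 0 when no
 * level reports a usable size, so the caller can fall back to a default. */
size_t query_largest_cache_size(void)
{
    long raw[4] = {0, 0, 0, 0};
#ifdef _SC_LEVEL1_DCACHE_SIZE
    raw[0] = sysconf(_SC_LEVEL1_DCACHE_SIZE);
#endif
#ifdef _SC_LEVEL2_CACHE_SIZE
    raw[1] = sysconf(_SC_LEVEL2_CACHE_SIZE);
#endif
#ifdef _SC_LEVEL3_CACHE_SIZE
    raw[2] = sysconf(_SC_LEVEL3_CACHE_SIZE);
#endif
#ifdef _SC_LEVEL4_CACHE_SIZE
    raw[3] = sysconf(_SC_LEVEL4_CACHE_SIZE);
#endif
    return largest_cache_size(raw, 4);
}
```

Without the check in `sanitize_cache_size`, a -1 for the L4 cache would convert to `SIZE_MAX` and win the maximum over the real L3 size, which is the kind of bogus cache size that a gen0 budget calculation should never see.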
Regression
This regression results from a change in what sysconf returns on certain Linux distributions. The value we now guard against is -1. The change that introduced this behavior, by casting the output of sysconf (a long) to a size_t without validating it, was #71029. That was in the .NET 7 timeframe, but the problem probably did not reach our user base at the time: it is newer distributions that report -1 for the L4 cache size when that cache does not exist, whereas earlier ones such as Debian 11 reported 0. This was confirmed by running
getconf -a | grep CACHE
on multiple Linux distributions; those that exhibit this behavior, e.g., Debian 12, output -1, indicating a change from the earlier behavior.
Testing
Risk
The risk of the fix is high, as the code path is called for all Server GC (SVR) instances on Linux. However, this is a fix for an already existing issue and has been tested and verified.
Full details can be found in #100502.