[release/8.0] Fix a possible infinite wait for GC completion at process shutdown. #107844
Merged
…own on Windows (dotnet#103877)"
Fixes:107800
* Use RtlDllShutdownInProgress to detect process shutdown on Windows
Switching to cooperative mode is not safe during process shutdown on Windows. Process shutdown can terminate a thread in the middle of the GC. The shutdown thread deadlocks if it tries to switch to cooperative mode and wait for the GC to finish in this situation.
Use RtlDllShutdownInProgress Windows API to detect process shutdown to avoid waiting for GC completion when that may lead to deadlocks.
Tagging subscribers to this area: @mangod9
mangod9 reviewed Sep 16, 2024
mangod9 approved these changes Sep 16, 2024
Co-authored-by: Manish Godse <61718172+mangod9@users.noreply.github.com>
jeffschwMSFT approved these changes Sep 19, 2024
lgtm. we will take for consideration in 8.0.x
rbhanda added the Servicing-approved label (Approved for servicing release) and removed the Servicing-consider label (Issue for next servicing release review) Sep 19, 2024
jeffschwMSFT merged commit 1f0e1bd into dotnet:release/8.0-staging Sep 19, 2024
122 of 129 checks passed
Thanks!!
Fixes: #107800
This is a partial/minimal port of #103877.
Cooperative process cleanup is fragile, and #103877 addresses many potential issues. However, that change is not small and in parts builds on 9.0 changes.
This PR ports a small part of that change to address a specific scenario that is known to affect end users.
Customer Impact
The bug was reported by internal partners. In some relatively infrequent cases a worker process may get stuck while exiting.
Such "stuck" processes can become a nuisance, especially when the memory footprint of the workers is very large.
Regression
Appears to have been introduced in .NET 6: the repro scenario passes with 5.0 but deadlocks in 6.0, 8.0, and early 9.0 previews.
Testing
Added a targeted unit test.
Risk
Small.
The code already tries to detect whether the process is shutting down. We just use a more reliable mechanism: a new Windows API introduced in Win10 (RtlDllShutdownInProgress).
The main concern is that there could be other similar issues.
The 9.0 fix addresses several more patterns similar to the one involved here. Those patterns may or may not result in actual failures, and proactively fixing other areas carries some added risk of breaking something, so we decided not to do that in a servicing fix.
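
For readers unfamiliar with the API, the shutdown check described under Risk can be sketched roughly as below. This is an illustrative sketch only, not the code in this PR: `IsProcessShuttingDown`, `ShutdownAction`, and `ChooseShutdownAction` are hypothetical names, and the sketch assumes `RtlDllShutdownInProgress` is resolved from ntdll.dll at run time. On non-Windows builds the helper simply reports `false`.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not the runtime's actual code): returns true when the
// OS reports that process shutdown is in progress.
static bool IsProcessShuttingDown()
{
#ifdef _WIN32
    // RtlDllShutdownInProgress is exported by ntdll.dll. It is resolved
    // dynamically here so the sketch does not need an import library.
    typedef BOOLEAN (NTAPI *ShutdownInProgressFn)(void);
    static ShutdownInProgressFn fn = []() -> ShutdownInProgressFn {
        HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
        if (ntdll == nullptr)
            return nullptr;
        return reinterpret_cast<ShutdownInProgressFn>(
            GetProcAddress(ntdll, "RtlDllShutdownInProgress"));
    }();
    return fn != nullptr && fn() != FALSE;
#else
    // Assumption for this sketch: no equivalent check on other platforms.
    return false;
#endif
}

// Hypothetical model of the decision the shutdown thread has to make.
enum class ShutdownAction { WaitForGc, SkipWait };

static ShutdownAction ChooseShutdownAction(bool shuttingDown)
{
    // During process shutdown a thread may have been terminated in the
    // middle of a GC, so waiting for that GC to finish could never return.
    return shuttingDown ? ShutdownAction::SkipWait : ShutdownAction::WaitForGc;
}
```

A caller would evaluate `ChooseShutdownAction(IsProcessShuttingDown())` before attempting to switch to cooperative mode, and skip the wait when the result is `SkipWait`.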