Use RtlDllShutdownInProgress to detect process shutdown on Windows #103877
This is not a 100% reliable fix, but it matches what we currently do in similar situations during shutdown.
Tagging subscribers to this area: @mangod9
Unfortunately, this fix does not work well. It replaces an intermittent hang with an intermittent crash during shutdown.
What if we do the alloc context fixing during the destruction of the RuntimeThreadLocals object? Might require us to move the PreemptiveGCDisabled bool over first though.
I do not see how it would help. This is the sequence that leads to the hang:
The problem with the current fix is that it starts skipping the cleanup too early, while there are still multiple threads running in the process. It means that we can end up running a GC with an alloc context that has not been cleaned up, which the GC gets very unhappy about. I have been looking for a more precise way to detect the moment after which it is fine to start skipping the cleanup.
eb524c2 to 6b18f2a
Ok, I think I found something that works. I have also fixed other instances of thread cleanup switching to cooperative mode that I found during testing and review.
Switching to cooperative mode is not safe during process shutdown on Windows. Process shutdown can terminate a thread in the middle of the GC. The shutdown thread deadlocks if it tries to switch to cooperative mode and wait for the GC to finish in this situation.
Use RtlDllShutdownInProgress Windows API to detect process shutdown to skip cleanup that has to be done in cooperative mode.
The existing g_fProcessDetach flag is set too late - using this flag to skip cooperative mode switch would lead to shutdown deadlocks, and the existing g_fEEShutDown flag is set too early - using this flag to skip cooperative mode switch would lead to shutdown crashes.
Fixes dotnet#103624
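For readers following along, here is a minimal sketch of the pattern the commit message describes: ask the OS whether process shutdown has started, and skip the cleanup that would need cooperative mode if so. It is not the code from this PR. `IsProcessShutdownInProgressSketch`, `CleanupThreadSketch`, `ThreadSketch` and `g_processDetachSketch` are hypothetical names made up for illustration; only `RtlDllShutdownInProgress` (exported by ntdll.dll), `GetModuleHandleW` and `GetProcAddress` are real Windows APIs. The sketch also consults the fallback flag on every platform so that it builds and behaves the same off Windows, which is an assumption of the example, not something the PR states.

```cpp
#include <atomic>
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical stand-in for a runtime flag such as g_fProcessDetach.
static std::atomic<bool> g_processDetachSketch{false};

// Returns true once process shutdown has started.
inline bool IsProcessShutdownInProgressSketch()
{
#ifdef _WIN32
    // Resolve RtlDllShutdownInProgress from ntdll.dll at run time so the
    // sketch does not depend on a particular import library.
    using ShutdownFn = BOOLEAN(NTAPI*)(void);
    static const ShutdownFn fn = reinterpret_cast<ShutdownFn>(
        GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "RtlDllShutdownInProgress"));
    if (fn != nullptr && fn())
        return true;
#endif
    // Assumption of this sketch: fall back to the detach flag everywhere.
    return g_processDetachSketch.load();
}

// Hypothetical per-thread state; a real runtime tracks far more than this.
struct ThreadSketch
{
    bool allocContextFixed = false;
};

// Hypothetical thread cleanup. Returns true if the cooperative-mode cleanup ran.
inline bool CleanupThreadSketch(ThreadSketch* thread)
{
    assert(thread != nullptr);

    if (IsProcessShutdownInProgressSketch())
    {
        // Other threads may have been terminated in the middle of a GC, so
        // waiting for the GC here could deadlock. Skip the cleanup instead.
        return false;
    }

    // A real runtime would switch to cooperative mode here and fix up the
    // allocation context; the sketch only records that the step happened.
    thread->allocContextFixed = true;
    return true;
}
```

The point of the sketch is the ordering of the check: the shutdown query happens before any attempt to enter cooperative mode, so a thread running cleanup during process exit never blocks on a GC that can no longer finish.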
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
So, this is race condition city, and I think that there are still races that exist. Notably, I don't understand when the GC thread can be eliminated without running its cleanup logic, so I don't know how to analyze this for complete correctness, but based on the description you provided this should strictly reduce the set of deadlocks that are likely to occur.
Any shutdown with ongoing background activity is prone to this problem. What happens is:
Yes, I agree. The code has a lot of checks that should be either unnecessary or that have race conditions.
return g_fProcessDetach;
#else
// RtlDllShutdownInProgress provides more accurace information about whether the process is shutting down.
accurace -> accurate?
Thanks - fix included in #104318
shutdown Port dotnet#103877 to Native AOT. This is fixing intermittent shutdown hangs that can be observed by running the tests attached to dotnet#103877.
Switching to cooperative mode is not safe during process shutdown on Windows. Process shutdown can terminate a thread in the middle of the GC. The shutdown thread deadlocks if it tries to switch to cooperative mode and wait for the GC to finish in this situation.
Use RtlDllShutdownInProgress Windows API to detect process shutdown to skip cleanup that has to be done in cooperative mode.
The existing g_fProcessDetach flag is set too late - using this flag to skip cooperative mode switch would lead to shutdown deadlocks, and the existing g_fEEShutDown flag is set too early - using this flag to skip cooperative mode switch would lead to shutdown crashes.
Fixes #103624