-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-124878: Fix race conditions during interpreter finalization #130649
Conversation
The PyThreadState field gains a reference count field to avoid issues with PyThreadState being a dangling pointer to freed memory. The refcount starts with a value of two: one reference is owned by the interpreter's linked list of thread states and one reference is owned by the OS thread. The reference count is decremented when the thread state is removed from the interpreter's linked list and before the OS thread calls `PyThread_hang_thread()`. The thread that decrements it to zero frees the `PyThreadState` memory. The `holds_gil` field is moved out of the `_status` bit field, to avoid a data race where on thread calls `PyThreadState_Clear()`, modifying the `_status` bit field while the OS thread reads `holds_gil` when attempting to acquire the GIL. The `PyThreadState.state` field now has `_Py_THREAD_SHUTTING_DOWN` as a possible value. This corresponds to the `_PyThreadState_MustExit()` check. This avoids race conditions in the free threading build when checking `_PyThreadState_MustExit()`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Avoiding dangling pointers close a whole category of bugs. Using a reference count is a nice solution for that. I also like the fact that Py_Finalize() now sets threads state to "shutting down": it's more explicit like that.
By the way, this change may also fix the old crash #110052 (test_4_daemon_threads). |
Yeah, I think it should fix
And also a small patch to ceval_gil.c to make the crash more likely to occur on my machine. On main (with the patch to ceval_gil.c), I see a crash pretty quickly after ~100 iterations on both the default and free threaded builds. With this PR, I haven't seen any failures in 25,000 iterations (~10 minutes). Patch to ceval_gil.cdiff --git a/Python/ceval_gil.c b/Python/ceval_gil.c
index 2c1cc17b2ff..e14f1a8afa2 100644
--- a/Python/ceval_gil.c
+++ b/Python/ceval_gil.c
@@ -306,6 +306,8 @@ take_gil(PyThreadState *tstate)
_PyThreadState_HangThread(tstate);
}
+ usleep(10);
+
assert(_PyThreadState_CheckConsistency(tstate));
PyInterpreterState *interp = tstate->interp;
struct _gil_runtime_state *gil = interp->ceval.gil; |
|
The PyThreadState field gains a reference count field to avoid issues with PyThreadState being a dangling pointer to freed memory. The refcount starts with a value of two: one reference is owned by the interpreter's linked list of thread states and one reference is owned by the OS thread. The reference count is decremented when the thread state is removed from the interpreter's linked list and before the OS thread calls
PyThread_hang_thread()
. The thread that decrements it to zero frees thePyThreadState
memory.The
holds_gil
field is moved out of the_status
bit field, to avoid a data race where on thread callsPyThreadState_Clear()
, modifying the_status
bit field while the OS thread readsholds_gil
when attempting to acquire the GIL.The
PyThreadState.state
field now has_Py_THREAD_SHUTTING_DOWN
as a possible value. This corresponds to the_PyThreadState_MustExit()
check. This avoids race conditions in the free threading build when checking_PyThreadState_MustExit()
.