-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-96387: take_gil() resets drop request before exit #96869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
At Python exit, sometimes a thread holding the GIL can wait forever for a thread (usually a daemon thread) which requested to drop the GIL, whereas the thread already exited. To fix the race condition, the thread which requested the GIL drop now resets its request before exiting. take_gil() now calls RESET_GIL_DROP_REQUEST() before PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a race condition with drop_gil(). Issue discovered and analyzed by Mingliang ZHAO.
@gpshead: Would you mind to review this change? |
// block forever waiting for the thread which exited. Drop | ||
// requests made by other threads are also reset: these threads | ||
// may have to request again a drop request (iterate one more | ||
// time). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bugfix can slowdown the Python shutdown by a few "switch interval" seconds (5 ms by default). I expect that the worst case is when all daemon threads need one more iterations, so interval x number of threads: like 500 ms for 100 threads. But with luck, it's way shorter because the thread holding the GIL can get the switch request (and so let other threads exit) before the request is reset ;-)
My expectation is that... nobody will notice, since daemon threads are rare, and this race condition is unlikely ;-)
If tomorrow it becomes a problem, maybe the "GIL drop request" should store the tstate of the thread requesting it, and the request should only be reset if it matchs the current tstate. Not sure if it will speed up the shutdown. Again, I expect that it will not really affect performance at the end ;-)
@pitrou: If you're around, it would be nice if you could have a look, since you designed the new Pyhon GIL ;-) |
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11. |
Sorry @vstinner, I had trouble checking out the |
Sorry, @vstinner, I could not cleanly backport this to |
GH-96941 is a backport of this pull request to the 3.11 branch. |
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11. |
Sorry @vstinner, I had trouble checking out the |
At Python exit, sometimes a thread holding the GIL can wait forever for a thread (usually a daemon thread) which requested to drop the GIL, whereas the thread already exited. To fix the race condition, the thread which requested the GIL drop now resets its request before exiting. take_gil() now calls RESET_GIL_DROP_REQUEST() before PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a race condition with drop_gil(). Issue discovered and analyzed by Mingliang ZHAO. (cherry picked from commit 04f4977)
…96869) (pythonGH-96941) At Python exit, sometimes a thread holding the GIL can wait forever for a thread (usually a daemon thread) which requested to drop the GIL, whereas the thread already exited. To fix the race condition, the thread which requested the GIL drop now resets its request before exiting. take_gil() now calls RESET_GIL_DROP_REQUEST() before PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a race condition with drop_gil(). Issue discovered and analyzed by Mingliang ZHAO. (cherry picked from commit 04f4977) (cherry picked from commit 6ff5471) Co-authored-by: Victor Stinner <vstinner@python.org>
…6941) At Python exit, sometimes a thread holding the GIL can wait forever for a thread (usually a daemon thread) which requested to drop the GIL, whereas the thread already exited. To fix the race condition, the thread which requested the GIL drop now resets its request before exiting. take_gil() now calls RESET_GIL_DROP_REQUEST() before PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a race condition with drop_gil(). Issue discovered and analyzed by Mingliang ZHAO. (cherry picked from commit 04f4977) (cherry picked from commit 6ff5471) Co-authored-by: Victor Stinner <vstinner@python.org>
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11. |
Sorry, @vstinner, I could not cleanly backport this to |
The 3.11 backport is #96941 |
Hi, I wanted to check if this needs to be backported to 3.9 as well? Many thanks |
3.9 is in security maintenance mode, and this doesn't look like a security-related patch, so probably no. |
At Python exit, sometimes a thread holding the GIL can wait forever for a thread (usually a daemon thread) which requested to drop the GIL, whereas the thread already exited. To fix the race condition, the thread which requested the GIL drop now resets its request before exiting.
take_gil() now calls RESET_GIL_DROP_REQUEST() before PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a race condition with drop_gil().
Issue discovered and analyzed by Mingliang ZHAO.