race condition in threading when interpreter finalized while daemon thread runs (thread sanitizer identified) #124878
Comments
Quoting a comment from @mpage in a #105805 comment, as this is somewhat related to tstate use after it has been freed: """ A reference count on the tstate is interesting, presuming that is not something that'd need to change frequently? Mostly I see it as a "if the thread still exists, the PyThreadState should never be freed" marker. Do we have a single clear marker of who owns deallocation of thread states? For daemon threads that we're abandoning during finalization, we clearly should not be freeing their PyThreadStates today, if that is what the above race actually shows as happening.
This bug probably exists on older versions as well; I didn't try testing further back. In general: Friends don't let friends spawn daemon threads. For the health of the process and all code maintainers.
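For anyone wondering what avoiding daemon threads can look like in practice, here is a minimal sketch, not taken from CPython or from #105805. It assumes a worker that can check a stop flag between units of work; the names `run_worker`, `main` and `shutdown-demo-worker` are made up for illustration. The thread is a regular (non-daemon) thread that is told to stop and then joined, so it is gone before interpreter finalization begins.

```python
import threading
import time


def run_worker(stop: threading.Event, results: list) -> None:
    # Hypothetical worker: do one small unit of work, then check the stop flag.
    while not stop.is_set():
        results.append(len(results))
        # wait() doubles as an interruptible sleep: it returns early once stop is set.
        stop.wait(0.01)


def main() -> list:
    stop = threading.Event()
    results: list = []
    # daemon defaults to False here, so the interpreter will not abandon this thread.
    worker = threading.Thread(
        target=run_worker, args=(stop, results), name="shutdown-demo-worker"
    )
    worker.start()
    try:
        time.sleep(0.05)  # stand-in for the main thread's real work
    finally:
        stop.set()      # ask the worker to finish
        worker.join()   # wait until it has actually exited
    return results


if __name__ == "__main__":
    print(len(main()), "units of work done before clean shutdown")
```

Because the worker is joined before `main()` returns, no thread is left to wake up and touch interpreter state during finalization.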
Bug report
Bug description:
Using the code in #105805 with the newly added test.test_threading.ThreadTests.test_finalize_daemon_thread_hang test enabled, you can reproduce this thread sanitizer crash as follows (I used clang 18):
This also happens if I just take the new test and the corresponding Modules/_testcapimodule.c change and patch them on top of main; it's a pre-existing bug not related to my PR adding the new test. (Filing now, before I check the test in decorated to be skipped under sanitizers, so I can reference the issue number in a comment.)
Examining the code in question where the race occurs... it's this block: https://github.com/python/cpython/blob/v3.13.0rc3/Python/ceval_gil.c#L258
Looping in @ericsnowcurrently for #104341 context.
The int final_release value in that call stack is 0, so the next bit tries to load the eval breaker bit, but the thread was woken up by Python code executing during finalization of the main thread, per the test.
How'd thread T1 ever obtain the GIL upon waking up in the first place, given that finalization had started?
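To make the shape of the scenario concrete, here is a simplified illustration. It is not the test from #105805 and does not use Modules/_testcapimodule.c; the helper name `run_child` and the child script are assumptions made for this sketch. A child interpreter starts a daemon thread that keeps waking up and re-acquiring the GIL, then the main thread falls off the end of the script, so finalization starts while the daemon thread is still alive.

```python
import subprocess
import sys
import textwrap

# Script run in a separate interpreter so that its finalization is what we observe.
CHILD = textwrap.dedent("""
    import threading
    import time

    def spin():
        # Each sleep releases the GIL and each wake-up needs it again, so this
        # thread is likely to be mid-loop when the main thread starts finalizing.
        while True:
            time.sleep(0.001)

    threading.Thread(target=spin, daemon=True).start()
    time.sleep(0.05)
    # Falling off the end of the script begins interpreter finalization
    # while the daemon thread is still running.
""")


def run_child() -> int:
    # Hypothetical helper: run the child script and report its exit status.
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, timeout=60
    )
    return proc.returncode


if __name__ == "__main__":
    print("child exit status:", run_child())
```

On a regular build the child is expected to exit normally. Whether a thread sanitizer build flags anything for a script like this depends on the build and on timing, so treat it as a picture of the ordering involved rather than a reproducer.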
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux