-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
If threading is not imported from the main thread it sees the wrong thread as the main thread. #81597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'm attaching a snippet which shows the issue (i.e.: threading.main_thread() and threading.current_thread() should be the same and they aren't). What I'd see as a possible solution is that the initial thread ident would be stored when the interpreter is initialized and then when threading is imported the first time it would get that indent to initialize the main thread instead of calling I'm not sure if this is possible if CPython is embedded in some other C++ program, but it seems to be the correct approach when Python is called from the command line. As a note, I found this when doing an attach to pid for the |
The second import actually doesn't happen. You need to reload it. This can be tested by putting print('loading threading') in threading.py. Your first method-thread will still think it's main thread. So no idea if this is a bug or wrong use. 'import threading' should be one of the first lines in your main code/thread? import _thread
import time
import importlib
barrier = 0
def method():
import threading # Will make threading the wrong thread.
global barrier
print(threading.main_thread())
print(threading.current_thread())
barrier = 1
_thread.start_new_thread(method, ())
while barrier != 1:
time.sleep(.1)
import threading
importlib.reload(threading)
print(threading.main_thread())
print(threading.current_thread()) |
This is a bit tricky to explain... There's no easy way to achieve this effect "normally". It manifests due to the way some Python debuggers (specifically, pydevd and ptvsd - as used by PyCharm, PyDev, and VSCode) implement non-cooperative attaching to a running Python process by PID. A TL;DR take is that those debuggers have to inject a new thread into a running process from the outside, and then run some Python code on that thread. There are OS APIs for such thread injection - e.g. CreateRemoteThread on Windows. There are various tricks that they then have to use to safely acquire GIL and invoke PyEval_InitThreads, but ultimately it comes down to running Python code. That is the point where this can manifest. Basically, as soon as that injected code (i.e. the actual debugger) imports threading, things break. And advanced debuggers do need background threads for some functionality... Here are the technical details - i.e. how thread injection happens exactly, and what kind of code it might run - if you're interested. I think that a similar problem can also occur in an embedded Python scenario with multithreading. Consider what happens if the hosted interpreter is initialized from the main thread of the host app - but some Python code is then run from the background thread, and that code happens to be the first in the process to import threading. Then that background thread becomes the "main thread" for threading, with the same results as described above. The high-level problem, I think, is that there's an inconsistency between what Python itself considers "main thread" (i.e. main_thread in ceval.c, as set by PyEval_InitThreads), and what threading module considers "main thread" (i.e. _main_thread in threading.py). Logically, these should be in sync. If PyEval_InitThreads is the source of truth, then the existing thread injection technique will "just work" as implemented already in all the aforementioned debuggers. It makes sure to invoke PyEval_InitThreads via Py_AddPendingCall, rather than directly from the background thread, precisely so that the interpreter doesn't get confused. Furthermore, on 3.7+, PyEval_InitThreads is already automatically invoked by Py_Initialize, and hence when used by python.exe, will mark the actual first thread in the process as the main thread. So, using it a the source of truth would guarantee that attach by thread injection works correctly in all non-embedded Python scenarios. Apps hosting Python would still need to ensure that they always call Py_Initialize on what they want to be the main thread, as they already have to do; but they wouldn't need to worry about "import threading" anymore. |
It's also possible to hit if using some native code that starts a background thread without going via threading, and runs Python code on that background thread. In that case, if that Python code then does "import threading", and threading hasn't been imported yet, then you have this same problem. Here's a pure Python repro using ctypes on Win32: #-------------------------- import sys, time
from ctypes import *
ThreadProc = WINFUNCTYPE(c_uint32, c_void_p)
@ThreadProc
def thread_proc(_):
import threading
print(threading.current_thread() is threading.main_thread())
return 0 assert "threading" not in sys.modules windll.kernel32.CreateThread(None, c_size_t(0), thread_proc, None, c_uint32(0), None) assert "threading" in sys.modules Here's the output: >py -3 main.py
True
False
Exception ignored in: <module 'threading' from 'C:\\Python\\3.7-64\\lib\\threading.py'>
Traceback (most recent call last):
File "C:\Python\3.7-64\lib\threading.py", line 1276, in _shutdown
assert tlock.locked()
AssertionError: |
Understood. Thank you for the extra info. I'll read up on all the suggestions. |
Note that in Python 3.7 we consolidated a bunch of the global runtime state. The "main_thread" static in ceval.c moved to the internal struct _PyRuntimestate and in 3.7 it is accessible as _Runtime.ceval.pending.main_thread (but only if you build with Py_BUILD_CORE, which extensions shouldn't). In 3.8 it was moved to the top of the struct as _PyRuntimeState.main_thread. All that said, that thread ID is still not exposed directly in Python. So perhaps it would make sense to add something like sys.get_main_thread_id, which Lib/threading.py could then use (rather than assuming the current (during import) thread is the main thread). Unfortunately, this wouldn't help with anything earlier than 3.9. Currently 3.8 is already in beta1, where we apply a feature freeze, so it's barely too late for 3.8. I suppose you could ask for a special exception from the release manager, given the change would be relatively small, with real benefits, but don't expect it. :) |
FWIW, this might also have implications for thread shutdown during runtime/interpreter finalization. |
Debuggers will have to work around this in past Python versions that they support (which still includes Python 2 for pretty much all of them), so this is solely about resolving the inconsistency for the future. No point rushing it for 3.8 specifically. (The most likely immediate workaround will be that, instead of invoking PyEval_InitThreads on the main thread via Py_AddPendingCall, we will simply use the same facility to exec "import threading" on the main thread.) |
This is a duplicate of bpo-31517. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: