Correct C API Usage Logic for NO_GIL Multi-threading #133
Hi @DanielLee343 - It would help if you explained further what you are trying to do and your high-level motivations. For example, you say you want to mimic "gc_get_objects_impl" - why aren't you using [...]? Are you trying to do this in the nogil fork or in the CPython main branch (3.13 development)? As Terry wrote, nogil support in the CPython main branch is still in development and not ready for testing.
What do you mean by "apparently it's holding the GIL"? As Terry wrote, there is no support for running without the GIL in the CPython main branch. It's still under development. In the nogil forks, it does not really hold the GIL, but the calls are still necessary. That's the whole bit about attaching and detaching. Any place in the docs that says a thread must hold the GIL, you should read as "thread must be attached", but the way you do it is the same: [...]
@colesbury Thanks for clarifying. My high-level goal is to do some statistical analysis of PyObjects in Python applications at runtime, and to use some of their semantics for my research. The primary goal is therefore to obtain all PyObjects in some manner. Previously I was using the [...]

Since the GC list already holds all container objects, inserted during initialization, I can loop through the GC list and, for each tracked PyObject, trace recursively until I reach a PyObject that is not iterable. I cannot directly use the C implementation of [...]

    for (i = 0; i < PyTuple_GET_SIZE(args); i++)
    {
        ...
        if (!_PyObject_IS_GC(obj))
            continue;
        traverse = Py_TYPE(obj)->tp_traverse;
        if (!traverse)
            continue;
        ...
    }

For example, if a Python application defines:

    >>> matrix_size = 5
    >>> matrix_A = [[random.randint(1, 10) for _ in range(matrix_size)] for _ in range(matrix_size)]
    >>> print(matrix_A)
    [[9, 5, 9, 7, 7], [10, 9, 2, 5, 8], [8, 4, 2, 3, 10], [8, 3, 5, 4, 9], [8, 1, 7, 3, 10]]

I want references to all PyObjects, including both container objects and non-container objects: [...]
Previously, with the normal with-GIL build, I held the GIL while performing the recursion, and that worked without problems. However, the time spent holding the GIL adds too much overhead to the Python application, so I am now looking at NO_GIL. When I don't hold the GIL in the NO_GIL build, some objects are deallocated by the Python main thread without my separate thread being aware of it, which causes segfaults when my thread dereferences invalid addresses. My current logic is added within Modules/gcmodule.c, and it's called from [...]
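For comparison, the traversal described above (start from containers, then follow references until non-container objects are reached) can be sketched at the Python level with the standard gc module. This is only a sketch under assumptions: the helper name collect_reachable is hypothetical, it runs on the calling thread with no protection against other threads mutating objects, and it follows references through gc.get_referents() rather than through iteration, so it is not the C-level code discussed in this thread.

```python
import gc


def collect_reachable(roots):
    """Return {id: object} for every object reachable from `roots`.

    Hypothetical helper for illustration only. It follows
    gc.get_referents(), which uses each object's tp_traverse, so
    non-container leaves such as ints are reported as well.
    """
    seen = {}
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen[id(obj)] = obj
        # Leaves such as ints and strs simply yield no referents.
        stack.extend(gc.get_referents(obj))
    return seen


matrix_A = [[9, 5, 9], [10, 9, 2]]
found = collect_reachable([matrix_A])
print(len(found), "objects reachable from matrix_A")
```

Passing gc.get_objects() as the roots would start from every GC-tracked container instead of a single list, at the cost of a much larger walk.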
@DanielLee343 - you can't traverse all objects while other threads are running. The GC in nogil Python pauses other threads while it is running. If possible, you may be better off intercepting allocations and frees like some memory profilers do. Otherwise, if you want to do this sort of analysis in nogil Python you need to:
Again, to be clear, in nogil Python you need to pause other threads (via the stop-the-world APIs), so that they do not deallocate or mutate objects that you are trying to find.
If you need to modify the runtime for your research that's fine, but the more non-standard things you want to do, the more likely you will run into issues.
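As one illustration of the "intercept allocations and frees" approach mentioned above, the standard library's tracemalloc module hooks CPython's memory allocators and records where live blocks were allocated. A minimal sketch, assuming a Python-level view is sufficient; the workload is made up for the example, and this is not what a custom runtime patch would look like.

```python
import random
import tracemalloc

tracemalloc.start()  # install hooks on CPython's memory allocators

# Made-up workload, similar to the matrix example earlier in the thread.
matrix_size = 50
matrix_A = [[random.randint(1, 10) for _ in range(matrix_size)]
            for _ in range(matrix_size)]

snapshot = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Group the still-live allocations by the source line that made them.
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:3]:
    print(stat)
```

This reports memory blocks by allocation site rather than handing back PyObject pointers, so it answers a somewhat different question than walking the heap.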
@colesbury It seems I need to block other threads (either by [...]
I mimicked what [...]
What do you mean by [...]? I also tried to instrument [...]
@DanielLee343 - sorry, I forgot that [...] You probably want to instead use [...] That will get you most objects, but if you have multiple threads, and some of them exit, it may miss some objects. You'll also need to visit the abandoned segments. When a thread finishes without freeing all of the memory it allocated, it pushes the in-use segments (data structures containing memory blocks) to a global abandoned segment list to be claimed later by another thread. Memory there isn't "owned" by any thread and is not part of any mi_heap, but still contains live objects. You'll need to basically combine the logic of [...]
You can end up with partially destroyed objects. For example, a thread may be in the process of calling an object's [...]
Hi @colesbury, I followed your guide and mimicked what mi_heap_visit_blocks does with

    visit_blocks(...)
    {
        [...]
        allocated_blocks += 1;
        // PyObject *op = (PyObject *)block;
        // Py_ssize_t cur_refcnt = Py_REFCNT(op); // works fine
        uint32_t hotness = op->hotness; // works fine
        op->hotness = 0; // seg faults
        [...]
    }

It shows roughly the same number of objects as in my previous tests, but much faster (which I'm very happy about). This [...]

    PyGILState_STATE gstate = PyGILState_Ensure();
    _PyMutex_lock(&_PyRuntime.stoptheworld_mutex);
    _PyRuntimeState_StopTheWorld(&_PyRuntime); // needs gil held
    _Py_GetAllocatedBlocks_dup(mainState, table);
    PyGILState_Release(gstate);
    _PyRuntimeState_StartTheWorld(&_PyRuntime);
    _PyMutex_unlock(&_PyRuntime.stoptheworld_mutex);

And the [...]
Edit: It seems that purely reading a field of a PyObject works fine, but when I write to it, the main thread segfaults.
caused by [...]
Hi Sam, I wonder what the correct C API calling logic is for implementing a multi-threaded feature in this no_gil CPython. I'm doing some hacking within Modules/gcmodule.c: I want to mimic gc_get_objects_impl(), but for each GC-tracked container PyObject, I further call PyObject_GetIter() to obtain references to all the inner objects it holds. I face no problem when executing this logic between tstate = PyGILState_Ensure() and PyGILState_Release(tstate). But apparently it's holding the GIL.

If I don't hold the GIL, PyObject_GetIter() internally calls _GC_Malloc() and segfaults in return mi_heap_calloc(tstate->heaps[mi_heap_tag_gc], nelem, elsize); since the heap structure is messed up.

Then I noticed the thread states described in PEP 703. In this no_gil CPython 3.9 version, I guess it would mean calling _PyThreadState_Swap() to set the thread state to ATTACHED, like this: [...]

This inspect_module_objs() is called by PyThread_start_new_thread(inspect_module_objs, args);

However, it segfaults at _PyThreadState_Swap() since tstate == NULL if you don't call PyGILState_Ensure(). If I hold the GIL before calling _PyThreadState_Swap(), it then leads to Py_FatalError("non-NULL old thread state") somehow.

FYI, I originally asked on the Python forum here, before they told me that no_gil in mainline 3.13 is not complete, so I would like to ask here. Thank you.