Program aborts when Python's garbage collector gets called from another thread and attempts to traverse an unsendable pyclass instance. #3688
Comments
Hmm. This is unfortunate, but not entirely a surprise. At least we crash safely.
I disagree that this deduction is incorrect. From PyO3's perspective this is true; the data is being read on another thread, which violates the One option could be to make
I was afraid you'd say that! Unfortunate indeed.
I suppose that's fair. However the issue only arises in the fairly niche case where a GC call from another thread happens to occur while there are unsendable GC-integrated objects in a reference cycle waiting to be collected, so I'm not sure whether it would be a worthy motivation for removing functionality that works fine in most cases. But maybe it is? I did just think of another possible solution; see this github.dev link. Apparently the
One other option I see is, instead of an error, to make unsendable pyclasses "invisible" to the GC when it is running on a different thread, i.e. turn
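The difference between the two behaviours discussed here can be modelled in plain Python. This is only a rough sketch, not PyO3's implementation: the class `ThreadBound`, its two `traverse_*` methods, and the helper `run_on_other_thread` are hypothetical names, and the `RuntimeError` merely stands in for the hard abort described in the issue.

```python
import threading


class ThreadBound:
    """Hypothetical model of an unsendable object tied to its creating thread."""

    def __init__(self, children):
        self._owner = threading.get_ident()  # thread that created the object
        self._children = list(children)

    def _on_owner_thread(self):
        return threading.get_ident() == self._owner

    def traverse_strict(self, visit):
        # Strict model: traversal from a foreign thread is treated as an error.
        # (Assumption: the exception stands in for an uncatchable abort.)
        if not self._on_owner_thread():
            raise RuntimeError("traversed on a thread other than the owner")
        for child in self._children:
            visit(child)

    def traverse_lenient(self, visit):
        # Lenient model: from a foreign thread, report no children at all,
        # so the object looks opaque ("invisible") to the collector.
        if not self._on_owner_thread():
            return
        for child in self._children:
            visit(child)


def run_on_other_thread(func):
    """Run func on a fresh thread and return (result, exception)."""
    outcome = {}

    def target():
        try:
            outcome["result"] = func()
        except Exception as exc:
            outcome["error"] = exc

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return outcome.get("result"), outcome.get("error")
```

In this model the lenient variant visits children only on the owning thread, so a cycle that runs through the object would simply not be found by a collection started elsewhere.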
I think making them opaque to other threads is quite a reasonable option, and we can also document this caveat as part of the offering of That said, I think it's possible that these things might still get collected by another thread running a GC collection? E.g. if the unsendable class itself does not directly contain the cycle but is referenced from an object that does participate in a cycle. Then when the cycle gets collected, the unsendable class gets dropped by the wrong thread. IIRC we leak and warn in this situation already, as per #3176, so I think this edge case is ok but unfortunate. (The only solution I can see to mitigate that would be to have a per-thread queue so that unsendable classes could post themselves to their owning thread instead of leaking, but I'm not sure that it's worth the complexity.)
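For illustration, the per-thread queue idea could look roughly like the following in plain Python. Everything here is an assumption made for the sketch: `OwnedResource`, `drop`, `drain_pending` and the module-level registry are hypothetical names, and nothing like this exists in PyO3 as far as this thread states. Note that the registry is itself global state.

```python
import queue
import threading

# Hypothetical registry: one queue of pending drops per owning thread id.
_drop_queues = {}
_registry_lock = threading.Lock()


def _queue_for(thread_id):
    with _registry_lock:
        return _drop_queues.setdefault(thread_id, queue.Queue())


class OwnedResource:
    """Hypothetical stand-in for an unsendable value."""

    def __init__(self, name):
        self.name = name
        self.owner = threading.get_ident()  # thread allowed to release it
        self.released_on = None

    def release(self):
        self.released_on = threading.get_ident()


def drop(resource):
    """Release on the owning thread, otherwise post to the owner's queue."""
    if threading.get_ident() == resource.owner:
        resource.release()
    else:
        _queue_for(resource.owner).put(resource)


def drain_pending():
    """Called by a thread to release everything that was posted to it."""
    pending = _queue_for(threading.get_ident())
    released = []
    while True:
        try:
            resource = pending.get_nowait()
        except queue.Empty:
            return released
        resource.release()
        released.append(resource.name)
```

The owning thread has to call `drain_pending()` at some point, and a thread that exits without draining still leaks its pending items, which is part of the complexity mentioned above.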
Will prepare a PR to turn
I think we should definitely try to reduce global state in PyO3; we already have quite too much and I would like to avoid adding more. If something like this is desired, I would prefer to have that in downstream code which actually knows how threading is used.
Agreed very much so on that point 👍
Wow that was fast, thank you for your effort! I like that solution and implementation, great work guys
So this means you tested your PoC using the proposed change and it worked as expected?
I have created a repository providing a full breakdown and minimal reproducible example of the error
at https://github.com/JRRudy1/pyo3_gc_error. I will provide a summary below, but please check out
the repository instead as I put a lot of effort into clearly presenting and investigating the issue.
In summary, I have discovered an error, or perhaps an undocumented limitation, in the way
PyO3 handles thread-checking for "unsendable" pyclass instances as they are being traversed
by Python's garbage collector (GC). In particular, this occurs when garbage collection is triggered
from a separate thread, and the pyclasses integrate with the GC by implementing the
__traverse__ magic method. The error (or limitation) results in a hard abort, and is particularly problematic
since it cannot be caught from Python using a try/except block.
The conditions and sequence of events leading to the error can be summarized as:
… __traverse__/__clear__ to break it
… gc.collect from Python or GcCollect from C)
… not the original thread and incorrectly deduces that the object was sent between threads
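The threading side of this sequence can be approximated in pure Python. The sketch below is not the PyO3 reproduction from the linked repository and involves no pyclass; it only shows, assuming CPython's cyclic collector, that a cycle created on one thread is finalized on whichever thread calls gc.collect. The names `Node` and `collect_cycle_from_other_thread` are hypothetical.

```python
import gc
import threading


class Node:
    """Hypothetical GC-tracked object that records where it is finalized."""

    finalized_on = []  # (creating thread id, finalizing thread id) pairs

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.other = None

    def __del__(self):
        Node.finalized_on.append((self.owner, threading.get_ident()))


def collect_cycle_from_other_thread():
    gc.disable()  # keep the automatic collector from running first
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # reference cycle: only the GC can free it
        del a, b
        # Trigger the collection from a different thread than the creator.
        worker = threading.Thread(target=gc.collect)
        worker.start()
        worker.join()
    finally:
        gc.enable()
    return list(Node.finalized_on)


if __name__ == "__main__":
    for owner, finalizer in collect_cycle_from_other_thread():
        print("created on", owner, "finalized on", finalizer)
```

An object that checks its owning thread during traversal or finalization would see the worker thread here, even though it was never explicitly handed to that thread.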
I have gotten reasonably familiar with PyO3's internals and may be interested in working on this,
but I would need some guidance from an "expert" with a more nuanced understanding of the
possible implications. It is possible that the limitation cannot be safely fixed, and the only solution
is to improve the error message and add a warning to the documentation.
As mentioned above, please visit https://github.com/JRRudy1/pyo3_gc_error for more information.