-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-106176: Fix reference leak in importlib/_bootstrap.py #106207
Conversation
There's no tests for case which solved by #94504, so I'm somewhat unsure if this is the right solution However, refleaks in |
@exarkun any opinions on this fix? |
I think we are almost here! sys.modules["_frozen_importlib"]._blocking_on� is the root of the leak. |
As for the fix, we usually clean up things like these within the test case and leave the cache as is. |
🤖 New build scheduled with the buildbot fleet by @sunmy2019 for commit 5328e32 🤖 If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again. |
Yeah, you're right. The refleaks is happen in |
Confirmed fixed on Windows |
How about this? diff --git a/Lib/test/libregrtest/utils.py b/Lib/test/libregrtest/utils.py
index fd46819fd90..c5f301c2d18 100644
--- a/Lib/test/libregrtest/utils.py
+++ b/Lib/test/libregrtest/utils.py
@@ -125,6 +125,15 @@ def clear_caches():
if stream is not None:
stream.flush()
+ try:
+ _frozen_importlib = sys.modules['_frozen_importlib']
+ except KeyError:
+ pass
+ else:
+ _frozen_importlib._blocking_on = {
+ k: v for k, v in _frozen_importlib._blocking_on.items() if v
+ }
+
try:
re = sys.modules['re']
except KeyError: |
I would prefer to not leak so much implementation details in libregrtest. If importlib cannot clean that automatically and it must be cleared, maybe add a private function for that. But i don't see why all tests must not clear that dict if only a test is affected. For me it would make sense to clear that dict in test_import and test_importlib, in a tearDownModule() function or something like that. |
This leak also present in |
I don't understand how a pickle tests leaks a deep internal importlib "blocking" object? Why is this "blocking" object created? Why is not deleted automatically? The Refleaks test checks that each test iteration either creates no new objects or deletes all objects that it created. |
Let me try to explain. Previously, structure of After changes, it took this form: Why test_pickle leaks? In
At the current moment of research, I think solution which presented in this PR is the only true. |
I guess that by "dying", you just mean that the thread exits cleanly. Can this "per-thread" dictionary becomes a thread local storage, like Or can you register a function with A practical problem is that the threading.py module is not imported yet when _bootstrap is run :-( For me, this memory leak is a real issue. If an application spawns many threads, each thread creates an item in Prototype using thread local: import threading
import _thread
import tracemalloc
NTHREAD = 10
def big_alloc():
return bytearray(1024 * 50)
if 1:
def mylock():
thread_locals = _thread._local()
thread_locals.blocks_on = big_alloc()
else:
_blocks_on = {}
def mylock():
_blocks_on[_thread.get_ident()] = big_alloc()
def test():
threads = [threading.Thread(target=mylock, args=()) for _ in range(NTHREAD)]
for thread in threads:
thread.start()
for thread in threads:
thread.join()
threads = None
# warmup to have a more accurate memory usage measurement
test()
tracemalloc.start()
before = tracemalloc.get_traced_memory()[0]
test()
after = tracemalloc.get_traced_memory()[0]
usage = after - before
print(f"usage: {usage} bytes") It shows less than 1 kB of memory usage: no leak. If you replace |
I don't have a specific opinion on this, but do note you do have to make sure
It should mean the thread exited while in the middle of an import, else there's a logic error causing keys to get left behind in the dict even when all imports resolve. |
|
That SGTM! |
It's just what this PR has done. |
@@ -85,6 +85,8 @@ def __enter__(self): | |||
def __exit__(self, *args, **kwargs): | |||
"""Remove self.lock from this thread's _blocking_on list.""" | |||
self.blocked_on.remove(self.lock) | |||
if not self.blocked_on: | |||
del _blocking_on[self.thread_id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just have one question, is this always true?
assert self.blocked_on is _blocking_on[self.thread_id]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, if it were sometimes is not true, self.blocked_on.remove(...)
will fail, no?
Though, currently test suite didn't failed with new assert check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be True under certain circumstances.
Actually, I think |
ISTM, that this problem has been tried before in a similar way, but then got reverted: |
Oh. That problem which looks simple seems quite complicated to be fixed. |
Unfortunately, seems no. |
As Guido says, |
No. Either deleting the list inside
|
The key point is that def remove(wr, selfref=ref(self), _atomic_removal=_remove_dead_weakref):
self = selfref()
if self is not None:
if self._iterating:
self._pending_removals.append(wr.key)
else:
# Atomic removal is necessary since this function
# can be called asynchronously by the GC
_atomic_removal(self.data, wr.key) static PyObject *
_weakref__remove_dead_weakref_impl(PyObject *module, PyObject *dct,
PyObject *key)
/*[clinic end generated code: output=d9ff53061fcb875c input=19fc91f257f96a1d]*/
{
if (_PyDict_DelItemIf(dct, key, is_dead_weakref) < 0) {
if (PyErr_ExceptionMatches(PyExc_KeyError))
/* This function is meant to allow safe weak-value dicts
with GC in another thread (see issue #28427), so it's
ok if the key doesn't exist anymore.
*/
PyErr_Clear();
else
return NULL;
}
Py_RETURN_NONE;
} We can provide similar things inside some C module. |
importlib._bootstrap can use |
You have to be sure to do it thread-safely and re-entrant-safely or the original bug is re-introduced. |
I was able to implement a |
If you're asking if you can use it at all, then the answer is "yes, you can use it" since |
Another option is if people can come up with a different solution to the import deadlock/race condition problem. It's obviously rather tricky (thanks, threads), but maybe someone else has another way to approach the problem? |
We can also revert the fix if this continues to block 3.12.0rc1 and rethink how to tackle the threaded import issue. |
The bug fixed caused real Python programs to crash in multiple environments in mysterious ways due to unpredictable interactions between the import system and third-party modules. The problem introduced by the fix seems to be a leak of one dictionary entry per thread that performs an import and then exits. Some Python programs might create hundreds of thousands or millions of unique threads over their lifetime and might see noticable memory usage as a result - but at least they won't crash when one of those threads gets unlucky and enters the importlib machinery at just the wrong time. |
I'm aware it fixed actual issues (I did the code review and merge of your PR), but the issue isn't something a large portion of people run into. And considering it wasn't (potentially) fixed until Python 3.12, I would argue it wasn't crippling either.
And the release manager for Python 3.12 is blocking the release due to this memory leak, which I consider the more critical issue. As I said, I would love if someone can come up with a fix for the refleak and keep the fix, but if it's a choice between keeping the fix or stopping the refleak, the RM has chosen the latter as the more critical thing at the moment. |
This is news to me. Who is the release manager and where can I read about their concerns? |
It's in the issue attached to this PR (you actually commented on the issue about how you didn't have time, so maybe you unsubscribed?). |
Is it possible to temporarily disable gc when deleting this empty list? |
How does that help with the memory staying around? The key issue is making sure appropriately cleanup occurs regardless of what eventually happens to the thread as there end up being dangling objects. |
Eeh, I would like to clarify that I have not taken a stance on whether the original bug or the leak is more important. What does matter is that test_import currently leaks references, and that is a blocker. The leak in test_import can be fixed in different ways. I also don't think accumulating a new, effectively immortal object per thread is a good thing to do, but that in itself isn't blocking 3.12rc1. |
#108497 has been merged, so let's close this. |
test_import.test_concurrency
leaks ref on Windows #106176