Tiny memory leak when using numpy in embedded sub interpreters #5857
Comments
Hi Numpy team, Are there any plans to work on this issue anytime soon? |
Only if you've made some :-). In general there are many places where numpy isn't subinterpreter-safe, and supporting sub-interpreters is not a priority for most people, but if you come up with a fix then we'll certainly take a look. |
Am I right in thinking that every static PyObject is a culprit here? |
If I remember correctly, the main issue was the use of some Python API
(PyGILState?) that's not subinterpreter-safe. Of course, static
pyobjects also matter.
|
I'm not sure if static Pyobjects can leak memory continuously --- once
they're set, they won't be reallocated?
|
A while back I spent some time trying to figure out what it would take to get numpy working with sub-interpreters. From what I can remember, the best candidate for memory leaks is static variables that are set during initialization. I thought I had found examples of this in the past, but looking through the code today, the closest I can get is this. This one isn't a memory leak; it just leaks a reference to the interned string every time it is called, but it shows the type of code that I suspect is causing the leak. The code assumes that initialization happens only once, so when another sub-interpreter initializes the module it blindly overwrites the static object that was saved off the first time. In this case it's the same object, so there isn't much harm, but I am guessing there are other pieces of code making the same assumption with more interesting objects and leaking memory.

Most static variables aren't leaking because they are initialized only by the first interpreter that needs them and then never initialized again. While this avoids leaks, it causes problems when the interpreter that created the object is destroyed, because the interpreter cleans up all loaded modules, which leads to some weird state. Usually the result is a "'NoneType' object is not callable" error. That type of error is discussed more in #3961 and it seems to be a separate issue from this one.

The issues with PyGILState usually cause deadlock but no memory problems. I think this is discussed in more detail in #5856. This seems to only be a problem when there are multiple sub-interpreters on the same thread, so it can be avoided by using separate threads for each interpreter. As mentioned in the Python documentation, this API is incompatible with sub-interpreters, so numpy would have to completely stop using it to fully support sub-interpreters. |
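The overwrite pattern described in the comment above can be illustrated with a small toy model. This is only a sketch: it is not numpy code and does not use the CPython C API. The names `CountedObject`, `intern_string`, `init_module_leaky`, `init_module_safe`, and `run` are all hypothetical, and the explicit `refcount` field merely stands in for a real object's reference count. It contrasts an init routine that overwrites a cached "static" on every call with one that only sets it the first time.

```python
class CountedObject:
    """Toy stand-in for an object with an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0


# Toy intern table: one shared object per distinct name.
_interned = {}


def intern_string(name):
    """Return the shared object for `name` and hand out one new reference."""
    if name not in _interned:
        _interned[name] = CountedObject(name)
    obj = _interned[name]
    obj.refcount += 1  # the caller now owns this reference
    return obj


# Module-level "statics", shared by every (simulated) interpreter.
_cached_leaky = None
_cached_safe = None


def init_module_leaky():
    """Assumes it runs once: overwrites the static without releasing
    the reference stored by the previous initialization."""
    global _cached_leaky
    _cached_leaky = intern_string("leaky_name")


def init_module_safe():
    """Only takes a reference the first time the static is set."""
    global _cached_safe
    if _cached_safe is None:
        _cached_safe = intern_string("safe_name")


def run(n):
    """Simulate `n` sub-interpreters each initializing the module."""
    for _ in range(n):
        init_module_leaky()
        init_module_safe()
    return _cached_leaky.refcount, _cached_safe.refcount
```

On a fresh import, `run(5)` returns `(5, 1)` in this model: the overwritten static has accumulated one unreleased reference per initialization, while the guarded one holds a single reference. The guarded form corresponds to the "initialized only by the first interpreter" case, which avoids the leak but still shares one object across interpreters.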
Closing as overcome by events. Jep has come up with a workaround and an alternative to using sub-interpreters with numpy. |
There appears to be a tiny memory leak somewhere in numpy that shows up when using numpy in embedded sub interpreters. I've seen this on RHEL 5 and RHEL 6 in long-running server processes, and it appears to be there in all versions of numpy (though it's significantly smaller/slower in newer numpy releases). It's rather challenging to spot, but I've written a test case that illustrates the memory climbing continuously. See https://github.com/mrj0/jep/blob/v3.3.0rc/src/jep/test/numpy/TestNumpyMemoryLeak.java
The workaround is to eventually restart the process. If you have trouble running the test case, you may need to configure your environment variables PATH, LD_LIBRARY_PATH, or LD_PRELOAD depending on your system.