Closing a Jep instance/sub-interpreter breaks some numpy methods #28
Comments
I'm glad to see the stacktraces are working well. :D This is probably a scipy issue and not a Jep issue. The project I work on uses scipy, though not extensively (it uses numpy much more). Come to think of it, we use thread pools and never close our Jep instances until JVM shutdown, so my work project probably dodges this. This sounds quite similar to a problem I found with numpy: numpy/numpy#3961

Jep can't really do much about this kind of thing. If the CPython code is messing up internal function pointers, either at Python sub-interpreter shutdown or at the start of another sub-interpreter, the fault is in the specific extension's code, not in Jep. I documented the one I found in numpy with a bug report. Most open source projects are open to contributed fixes, but there's not a lot of experience with embedded sub-interpreters, and most of the time a fix would take significant code changes. (I documented it so it's a known issue, but I'm not confident enough in numpy to contribute a fix.) I should update the Jep docs to mention these kinds of things and how to work around them.

Numpy made a label of "embedded", and this appears to be more a numpy problem than a scipy problem, so my recommendation is that you try to figure out what wiped out that method (it appears to be um.logical_or.reduce, https://github.com/numpy/numpy/blob/master/numpy/core/_methods.py#L38) and write a ticket for numpy. My hope is that Jep grows in popularity enough that Python communities start giving issues with embedded sub-interpreters more attention and that more developers build embedded Python expertise. |
Stacktraces are awesome :) Thanks for the hard work. IMHO going with one interpreter per thread, likely using some form of ThreadLocal, is the way to go. Updating the C magic in Python libraries feels like quite an undertaking. I think this restriction warrants a mention in a visible place in the Jep docs. |
Sharing an experience I had in a heavily multi-threaded environment using jep in a spark job: A little background: These are the steps I took to get around issues I observed:
The problem was definitely a static variable declaration in ECOS C code. I've reported the issue to both cvxpy and ECOS and I think they are going to solve it in next releases. embotech/ecos#127 |
Thanks for sharing your insight. That's great that ecos and cvxpy are looking into solving the issues! I'll make sure to include these workarounds in the docs. |
I've traced it through the CPython and numpy code and I think I understand the problem. When the Jep interpreter is closed we call Py_EndInterpreter, which calls PyImport_Cleanup, which calls _PyModule_Clear, which finally ends up setting all the module items to None. The root of the problem is in the numpy C code, where it imports the _methods module into C and saves methods to static variables. Since the variable um is part of the module, it is set to None as part of _PyModule_Clear. The 'any' function itself is still held in the static variable, but when the function is called it looks up the value of the variable 'um', which is now None and has no logical_or attribute, causing the exception.

This is not something we can fix in Jep since the problem is in a static variable created by numpy. A naive solution would be to modify _methods.py and move the import of the umath module into the any method so that it does not need to reference any variables defined in the module. However, in newer versions of numpy they did the exact opposite, moving logical_or.reduce into a variable. It seems this was a performance enhancement, so if we undid that and moved the import within the method it would slow things down slightly. There is probably some other way to access or cache that method, but I'm not familiar enough with the numpy source to recommend a solution, and any further discussion should probably head over to the numpy page to get input from the devs there.

The minimum code required to replicate the issue from Java is:
Jep jep = new Jep();
jep.eval("import numpy");
jep.eval("numpy.ndarray([1]).any()");
jep.close();
jep = new Jep();
jep.eval("import numpy");
jep.eval("numpy.ndarray([1]).any()");
jep.close();

To demonstrate that this issue is not specific to Jep, but happens any time numpy is used from sub-interpreters, the following code also shows the problem.
#include <Python.h>

int main(int argc, const char* argv[]) {
    PyThreadState *mainState, *subState;
    Py_Initialize();
    PyEval_InitThreads();
    /* Release the GIL and detach the main thread state. */
    mainState = PyThreadState_Get();
    PyEval_ReleaseThread(mainState);
    /* First sub-interpreter: import numpy and call any(). */
    subState = Py_NewInterpreter();
    PyRun_SimpleString("import numpy");
    PyRun_SimpleString("numpy.ndarray([1]).any()");
    Py_EndInterpreter(subState);
    /* Second sub-interpreter: the same calls reproduce the problem. */
    subState = Py_NewInterpreter();
    PyRun_SimpleString("import numpy");
    PyRun_SimpleString("numpy.ndarray([1]).any()");
    Py_EndInterpreter(subState);
    /* Reattach the main thread state before finalizing. */
    PyEval_AcquireThread(mainState);
    Py_Finalize();
} |
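The mechanism described in this comment can be imitated in plain Python without numpy or Jep. The sketch below is only an analogy under stated assumptions: `fake_methods`, `_any`, `_any_local_import`, and `clear_module` are hypothetical names, the standard `operator` module stands in for numpy's umath, and `clear_module` is a rough imitation of the teardown rather than what CPython actually runs. A function reference that outlives its module still resolves globals at call time, so it breaks once those globals are set to None, while the variant that imports inside the function keeps working.

```python
import types

# Hypothetical stand-in for numpy/core/_methods.py. `operator` plays the role
# of numpy's umath module; none of these names come from numpy itself.
FAKE_METHODS_SOURCE = '''
import operator as um

def _any(values):
    # `um` is a module global, looked up each time the function runs.
    return um.truth(values[0])

def _any_local_import(values):
    # The "naive" variant: import inside the function, no module global needed.
    import operator as local_um
    return local_um.truth(values[0])
'''


def load_fake_module():
    """Create the stand-in module from the source string above."""
    module = types.ModuleType("fake_methods")
    exec(FAKE_METHODS_SOURCE, module.__dict__)
    return module


def clear_module(module):
    """Rough imitation of the teardown: overwrite module globals with None.

    Dunder entries are left alone here for simplicity.
    """
    for name in list(module.__dict__):
        if not name.startswith("__"):
            module.__dict__[name] = None


def run_demo():
    module = load_fake_module()
    # Like the static variables in the C code: references that outlive the module.
    cached_any = module._any
    cached_local = module._any_local_import

    before = cached_any([1])  # works while the module globals are intact
    clear_module(module)

    try:
        cached_any([1])
        error = None
    except AttributeError as exc:
        # `um` is now None, so the attribute lookup on it fails.
        error = str(exc)

    after_local = cached_local([1])  # the local import does not depend on `um`
    return before, error, after_local


if __name__ == "__main__":
    print(run_demo())
```

The local-import variant survives because it never reads a module-level name, which is the trade-off mentioned above: it avoids the stale global at the cost of an import lookup on every call.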
On the master branch I have added a new test with Ben's code to more easily illustrate the problem. https://github.com/mrj0/jep/blob/master/src/jep/test/numpy/TestNumpyAny.java So it's not scipy or cvxpy, it's just numpy and how the references to variables are retained. I'm going to close #31 as a duplicate of this ticket, and then we'll have to work with the numpy developers to determine if there is an optimal way to overcome this without adversely affecting numpy. |
Hi. I also hit this problem; stack trace below. This occurred when shutting down Jep instances with the close command after multiple threads had finished their work....
jep.JepException: <type 'exceptions.TypeError'>: 'NoneType' object is not callable |
This is alleviated by the new shared modules feature in Jep 3.6, which has a release candidate and will be released in the near future. It needs to be tested in more environments than the ones where Ben and I have tested it. See the 3.6 release notes: https://github.com/mrj0/jep/blob/dev_3.6/release_notes/3.6-notes.rst#python-shared-modules-beta I will update the wiki after the official release, and hopefully we will have some good feedback. |
Now that 3.6 has been released with shared modules, I'm closing this ticket. |
I'm using Jep 3.4.1 on a Mac in a Scala environment, with scipy. If I try to initialize multiple Jep instances and import scipy, I get an error the second time. I'm using Scala, but the Java code should be very similar:
The Python stack trace is:
As a workaround, I'm just going to reuse a singleton jep instance.