numpy.set_string_function is unsafe when using multiple embedded sub interpreters #3961
Comments
Numpy cannot in practice be used in multiple interpreters; leads to |
We're using Jep for multiple embedded interpreters, not mod_wsgi. It's worked fine for 5 years except for this problem with array2string. |
I don't think we have any objection to patches that improve handling of the
|
IIRC, there are a couple of spots where numpy can change array internals on the fly, generally on module load. That's probably a bad idea, but would require significant work to clean up. |
OK, I've finally got the code merged and can provide test code that illustrates the problem. It requires Jep and Java: https://github.com/mrj0/jep/blob/v3.3.0rc/src/jep/test/numpy/TestNumpyArrayToString.java If you get fatal Python errors when running it, you may need to set the PATH, LD_LIBRARY_PATH, and/or LD_PRELOAD environment variables, depending on your system. |
I have isolated the problem a little and created a pure C test case that replicates the issue in a single thread using multiple subinterpreters:
#include <Python.h>

int main(int argc, const char *argv[]) {
    PyThreadState *mainState, *subState1, *subState2;

    /* Start the main interpreter and release its thread state. */
    Py_Initialize();
    PyEval_InitThreads();
    mainState = PyThreadState_Get();
    PyEval_ReleaseThread(mainState);

    /* First subinterpreter: import numpy, then set it aside. */
    subState1 = Py_NewInterpreter();
    PyRun_SimpleString("import numpy");
    PyEval_ReleaseThread(subState1);

    /* Second subinterpreter: import numpy again, then shut it down. */
    subState2 = Py_NewInterpreter();
    PyRun_SimpleString("import numpy");
    Py_EndInterpreter(subState2);

    /* Back in the first subinterpreter, str() on an array now fails. */
    PyEval_AcquireThread(subState1);
    PyRun_SimpleString("str(numpy.ndarray([1]))");
    Py_EndInterpreter(subState1);

    PyEval_AcquireThread(mainState);
    Py_Finalize();
    return 0;
}
|
Ben traced deep into a similar problem with the |
Thanks for the update. FWIW my best advice would be to stop using subinterpreters. They've never been supported well by the python interpreter, and even today there still isn't any clear guidance for authors of python extensions about what needs to be done to handle them correctly. I doubt it is even possible to make numpy fully correct in the subinterpreter case. Assuming you aren't going to follow that advice :-), I think the comments above about accepting low impact patches still stand. |
Closing as overcome by events. Jep has come up with a workaround and an alternative to using sub-interpreters with numpy. |
Our application uses multiple embedded python interpreters. When starting or stopping new interpreters, everything with numpy seems to work fine except for the printing of arrays. After starting a second interpreter that uses numpy and then printing an array in the first interpreter, we get an error:
I've traced it down somewhat. Separate embedded interpreters do not share sys.modules, but any static variables used in CPython extensions are shared across interpreters. It seems that when the second interpreter imports numpy, it imports numeric.py, which has these two lines:
set_string_function(array_str, 0)
set_string_function(array_repr, 1)
That changes the static variables PyArray_StrFunction and PyArray_ReprFunction in arrayobject.c, which are shared across interpreters. The first interpreter somehow still references the original PyArray_StrFunction and therefore fails on a print. Note that if the first interpreter directly uses numpy.core.arrayprint.array2string or numpy.core.numeric.array_str, those still work fine; only str(array) seems to be broken.
Tested with Python 2.7.1 and with numpy 1.7.1 and 1.5.0 on CentOS 5.