SIGSEGV using jep from specs2 in sbt - embedded both Python (jep) and R (rpy2) #126
Comments
I suspect you never called Jep.close(). |
Called under finally; see second Does something like that make sense? |
I'm not sure what you mean. What is the result of your printlns? Is close() getting called on the same thread as creation? Do you see any of the warnings that are here or here? If you can include the Java frames and native frames from the hs_err_pid file, that might be helpful. |
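To make the "same thread" question concrete, here is a minimal sketch of confining creation, use, and close() to one worker thread. `ThreadBoundResource` is a hypothetical stand-in written for this illustration, not a Jep class; it only mimics the idea of an object that must be used and closed by the thread that created it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadConfinementSketch {

    /** Hypothetical stand-in for a thread-bound interpreter; not part of Jep. */
    static final class ThreadBoundResource implements AutoCloseable {
        private final Thread owner = Thread.currentThread();

        String eval(String expr) {
            checkOwner();
            return "evaluated: " + expr;
        }

        @Override
        public void close() {
            checkOwner();
        }

        // Reject any call that does not come from the creating thread.
        private void checkOwner() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("accessed from "
                        + Thread.currentThread().getName()
                        + " but created on " + owner.getName());
            }
        }
    }

    /** Creates, uses, and closes the resource on a single worker thread. */
    static String runConfined(String expr) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<String> result = worker.submit(() -> {
                // try-with-resources closes on the same thread that created it.
                try (ThreadBoundResource r = new ThreadBoundResource()) {
                    return r.eval(expr);
                }
            });
            return result.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConfined("1 + 1"));
    }
}
```

With a real interpreter the same shape would apply: whatever thread constructs the instance is the one that should reach the finally block that closes it.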
Yes, same thread. First run in sbt
second run
I put another From the PID log file. (see more info after)
Interestingly all is well when specs are run from Intellij (not sbt). Now, if I use
I get this even in Intellij: note
Note the "Error:Trying to release object ID" above ! In the PID log I get this
So I'd say something is wrong with how the Many thanks in advance for reading this far. One more: to shed light on internal memory management (which seems to be causing this) if I initialize embedded R explicitly in Python like this :
Then from Intellij I get
So the second run does not happen because ... the second This all seems to point to improper state management and cleanup in |
Alternative experiment to troubleshoot using insights from #28 (comment) If I use a python script like this
then same thing happens in sbt: crashes at second run
If I use a combination of the above then I get this from inside Intellij: "mpd_setminalloc: ignoring request "
Same JVM and settings were used in all cases. |
Did some research into potentially related issues
In my case, likely
https://github.com/ninia/jep/wiki/How-Jep-Works#sandboxed-interpreters
Is this in the right direction? |
Introduced use of shared modules and this has improved stability
Now, a combination of the above involving In sbt interactive session still happens on the second run.
|
When you start mixing in numpy and rpy2, it is not surprising that you can create conditions that crash. Jep is built on Python sub-interpreters, which are a rarely used feature of CPython, so extension writers often write code that is not compatible with sub-interpreters. There is not much we can do about this from Jep. We have open issues with numpy, but they are fairly complex to fix and there may be performance side effects, so there is no activity to fix them. Our current workaround is the introduction of shared modules, and in the next release of Jep (3.8) I have added an option to use Jep without sub-interpreters, which should also help alleviate this problem, but if you get the configuration wrong with native Python extensions it can crash.

What was most interesting in the original post was that you managed to crash without using much Python at all. I don't know how sbt works, but I think you were on the right track when you mentioned sbt keeping the Python binaries around. I suspect that sbt is using the same process for both runs, but it attempted to unload the Jep class and reload it, which causes it to reinitialize the Python interpreter. So the first question is whether you can verify this. Can you print out the process ID or something to verify that both runs are in the same process?

In theory Jep attempts to support this use case: when the first Jep class is unloaded, the JVM should call JNI_OnUnload, which calls pyembed_shutdown, which should completely reset the Python state. However, I have never managed to get the JVM to actually unload in my testing. I suspect that the JVM cannot unload the Jep class because the TopInterpreter thread is still running. Looking at your hs_err_pid, I suspect that

I suspect this problem may be resolved if you use forked JVMs in your sbt configuration. Reading some documentation, it looks like the Jep background thread is exactly the type of use case that needs forking.
It would still be good if we could fix this within Jep so please post any other information you find, but it would not be trivial if we really do need to handle jep class unloading so forking the JVM may be the most practical option. |
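One way to answer the process ID question is a small helper called at the start of each run. This is only a sketch: the class name `ProcessProbe` and the labels are made up for illustration, and `ProcessHandle` requires Java 9 or later.

```java
final class ProcessProbe {
    private ProcessProbe() {}

    /** Returns the PID of the JVM process this code runs in (Java 9+). */
    static long pid() {
        return ProcessHandle.current().pid();
    }

    /** Formats the PID and current thread so separate runs can be compared. */
    static String describe(String label) {
        Thread t = Thread.currentThread();
        return String.format("[%s] pid=%d thread=%s", label, pid(), t.getName());
    }

    public static void main(String[] args) {
        // Call once where the interpreter is created and once where it is closed.
        System.out.println(describe("before create"));
        System.out.println(describe("before close"));
    }
}
```

If two consecutive runs print the same pid, they share a JVM process and the native Python state is being reused; different pids indicate that the runs were forked into separate JVMs.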
" managed to crash without using much python " - I believe it was from Until I started using It would be great of @lgautier had a look at this since he is master of running R embedded (so he might help with running Python embedded). side note: At some point I had a script using
|
What is the ETA please? |
Wip investigating with forking but no joy yet. |
The code to run without subinterpreters is already in the dev_3.8 branch if you want to try it out, see here. The official release of 3.8 will probably happen sometime around June or July. |
Forking of tests in sbt project works and Run
Run
You can see that the PID differs between runs but is the same within each run: the Specs2 code for all 3 tests ran in the same process. Side note: this is important because I need them to share the same My end goal is to use them in a So, again, I do not mind a JVM being forked for each spec, but I wanted to make sure all tests in a spec share the same JVM/process (state). Interesting how thread Thoughts? |
Found another way to break it
Inspired by something Nate did. UPDATE: SIGSEGV happens even if I comment out the second import, at the next line, same "problematic frame". This may hint that the import from the previous jep instance is still around. Is this one of the known Have been able to share a jep instance from the same thread between Python scripts calling R through
|
My understanding is that C-extensions are shared across the sub-interpreters, and because of this, developing with sub-interpreters in mind is a non-trivial affair (an improvement is the subject of a PEP under discussion: https://www.python.org/dev/peps/pep-0554/). |
|
If I have multiple Python scripts being executed at different times in the (same) |
If the C-extension is shared as (I understand) the doc tells it, calling You can check it easily, as rpy2.rinterface._rinterface.is_initialized() However, calling |
If you close an interpreter using numpy and you are not using shared modules then there are many ways to crash or generate errors. This is caused by numpy storing static references to python objects, there is nothing we can do in jep so you don't need to report it here. |
The complexity depends on what you are trying to do, but I agree that for many things it may be non-trivial. Most of my experience has been digging in numpy, which has problems because it keeps static references in C to Python objects. When a Python interpreter is closed these objects become invalid and fail if numpy is used again. PEP-3121 was intended to provide extension writers a way around this particular problem, but it isn't widely used. PEP-554 is interesting, so I'll be curious to see how that changes things in the future. |
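The static-reference failure mode can be illustrated with a plain Java analogy. Everything here (`Interp`, `Handle`, `Extension`) is hypothetical and written only for this sketch; it is not how Jep, CPython, or numpy are implemented, and in a native extension the stale access would typically be a crash rather than a catchable exception.

```java
final class StaleStaticReferenceSketch {

    /** Hypothetical stand-in for an interpreter that owns objects. */
    static final class Interp implements AutoCloseable {
        private boolean open = true;

        Handle newObject(String name) {
            return new Handle(this, name);
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** An object that is only valid while its owning interpreter is open. */
    static final class Handle {
        private final Interp owner;
        private final String name;

        Handle(Interp owner, String name) {
            this.owner = owner;
            this.name = name;
        }

        String use() {
            if (!owner.open) {
                throw new IllegalStateException(name + " belongs to a closed interpreter");
            }
            return name;
        }
    }

    /** Hypothetical "extension" that caches an object in static state on first use. */
    static final class Extension {
        private static Handle cached; // outlives every interpreter

        static String call(Interp current) {
            if (cached == null) {
                cached = current.newObject("cached-type");
            }
            return cached.use();
        }

        static void reset() {
            cached = null;
        }
    }

    /** Returns true if the second interpreter trips over the stale reference. */
    static boolean secondRunFails() {
        Extension.reset();
        try (Interp first = new Interp()) {
            Extension.call(first); // caches an object owned by `first`
        }
        try (Interp second = new Interp()) {
            Extension.call(second); // reuses the object owned by the closed `first`
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second run fails: " + secondRunFails());
    }
}
```

The first run succeeds and the second fails, which mirrors the "works once, crashes on the second run" pattern reported in this thread. Per-interpreter module state, which is what PEP-3121 aims at, corresponds to moving `cached` off the static field and onto the interpreter.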
Understood and am using shared modules. At this time this code does not cause SIGSEGV.
Will test more closing |
Just for the record: I experienced the same issue with numpy (numpy/numpy#13051 (comment)). I got a response, but it appears to be a low priority. |
I am closing this old issue because it is specific to a very particular environment and there have been many changes to jep, Python, and numpy since we last looked at this. I do not see any changes we need to make in jep in this discussion. If I am mistaken and we need jep changes, please open a new issue. |
Getting this from application test code.
Using specs2 to run jep, which calls a simple Python script.
From sbt, the first time it runs OK, but the second time it crashes with a segmentation violation.
Any thoughts please @ndjensen ?
It feels like sbt jvm holds the spec class loaded and this, in turn, holds the python binaries ... hanging around.
I use something like
and also set LD_LIBRARY_PATH.
Thought it might be related to this #22
Reused code from https://github.com/A-OK/Snakes-and-Ladders/blob/master/scala_hosting_python/jepp/a_la_spiewak/Scalathon.scala