Segfault due to pydrake Simulator.reset_context #12811
Apologies for the delay in responding. The on-call issue triage last week fell through. It is definitely a bug when pydrake segfaults, even if the APIs were used incorrectly. However, I am unable to reproduce your problem. I tried using https://github.com/RobotLocomotion/drake/releases/tag/v0.15.0 on Ubuntu 18.04 with the following cobbled-together code:
import numpy as np
from pydrake.common import FindResourceOrThrow
from pydrake.examples.manipulation_station import ManipulationStation
from pydrake.math import RigidTransform
from pydrake.multibody.plant import MultibodyPlant
from pydrake.multibody.parsing import Parser
from pydrake.systems.analysis import Simulator
station = ManipulationStation()
parser = Parser(station.get_mutable_multibody_plant(),
                station.get_mutable_scene_graph())
plant = station.get_mutable_multibody_plant()
# Add models for iiwa and wsg
iiwa_model_file = FindResourceOrThrow(
    "drake/manipulation/models/iiwa_description/iiwa7/"
    "iiwa7_no_collision.sdf")
iiwa = parser.AddModelFromFile(iiwa_model_file, "iiwa")
X_WI = RigidTransform.Identity()
plant.WeldFrames(plant.world_frame(),
                 plant.GetFrameByName("iiwa_link_0", iiwa),
                 X_WI)
wsg_model_file = FindResourceOrThrow(
    "drake/manipulation/models/wsg_50_description/sdf/"
    "schunk_wsg_50.sdf")
wsg = parser.AddModelFromFile(wsg_model_file, "gripper")
X_7G = RigidTransform.Identity()
plant.WeldFrames(
    plant.GetFrameByName("iiwa_link_7", iiwa),
    plant.GetFrameByName("body", wsg),
    X_7G)
# Register models for the controller.
station.RegisterIiwaControllerModel(
    iiwa_model_file, iiwa, plant.world_frame(),
    plant.GetFrameByName("iiwa_link_0", iiwa), X_WI)
station.RegisterWsgControllerModel(
    wsg_model_file, wsg,
    plant.GetFrameByName("iiwa_link_7", iiwa),
    plant.GetFrameByName("body", wsg), X_7G)
# Finalize
station.Finalize()
assert station.num_iiwa_joints() == 7
# Simulate.
zero = [0.0] * 7
simulator = Simulator(station)
context = simulator.get_mutable_context()
station.GetInputPort("iiwa_position").FixValue(context, zero)
station.GetInputPort("iiwa_feedforward_torque").FixValue(context, zero)
station.GetInputPort("wsg_position").FixValue(context, 0.)
station.GetInputPort("wsg_force_limit").FixValue(context, 40.)
simulator.AdvanceTo(0.1)
cloned_context = simulator.get_mutable_context().Clone()
simulator.AdvanceTo(0.2)
simulator.reset_context(cloned_context)
jwnimmer@cons:~/tmp$ PYTHONPATH=$HOME/Downloads/drake-20200212-bionic/drake/lib/python3.6/site-packages python3 test.py; echo $?
[2020-03-08 20:19:51.615] [console] [warning] Currently MultibodyPlant does not handle joint limits for continuous models. However some joints do specify limits. Consider setting a non-zero time step in the MultibodyPlant constructor; this will put MultibodyPlant in discrete-time mode, which does support joint limits.
[2020-03-08 20:19:51.615] [console] [warning] Joints that specify limits are: `left_finger_sliding_joint`, `right_finger_sliding_joint`.
[2020-03-08 20:19:51.618] [console] [warning] Currently MultibodyPlant does not handle joint limits for continuous models. However some joints do specify limits. Consider setting a non-zero time step in the MultibodyPlant constructor; this will put MultibodyPlant in discrete-time mode, which does support joint limits.
[2020-03-08 20:19:51.618] [console] [warning] Joints that specify limits are: `iiwa_joint_1`, `iiwa_joint_2`, `iiwa_joint_3`, `iiwa_joint_4`, `iiwa_joint_5`, `iiwa_joint_6`, `iiwa_joint_7`.
0
Are you able to post steps that I can run to reproduce the problem? See https://www.chiark.greenend.org.uk/~sgtatham/bugs.html for good tips on reporting an actionable bug.
Thanks for the timely reply! Sorry for the confusion. As you can see in iiwa_drake.py, AdvanceTo runs OK, but moving the arm causes a segmentation fault.
The code extends the examples from the MIT 6.881 Intelligent Robotic Manipulation class. The core piece uses IK to plan a motion and move the arm. I wrote a few wrappers to make the code a little cleaner, but that is about the only change I made.
Thanks! The |
Sorry for the error; I have pushed the fix. Alternatively, you can simply remove the deepPHA.drakesim.manip_station_sim. prefix before robot_plans, because it is just the relative path from the root of the package.
Ok, the example needs a newer scipy than what's on Ubuntu 18.04, but with some fiddling I was able to obtain an error message:
$ python3 -m venv env
$ source env/bin/activate
$ pip install scipy matplotlib umsgpack zmq ipython pyyaml
$ env PYTHONPATH=/home/jwnimmer/Downloads/drake-20200212-bionic/drake/lib/python3.6/site-packages python3 iiwa_drake.py
...
[2020-03-11 20:43:46.148] [console] [warning] Joints that specify limits are: `iiwa_joint_1`, `iiwa_joint_2`, `iiwa_joint_3`, `iiwa_joint_4`, `iiwa_joint_5`, `iiwa_joint_6`, `iiwa_joint_7`.
works fine before resetting the context
now reset the context
works fine for simple AdvanceTo with no commands
segmentation fault if arm is commanded to move again
Failure at systems/framework/cache.cc:12 in GetPathDescription(): condition 'owning_subcontext_!= nullptr' failed.
That's with the 0.15.0 release. With the 0.16.0 release, I do see a segfault:
$ env PYTHONPATH=/home/jwnimmer/Downloads/drake-20200311-bionic/drake/lib/python3.6/site-packages python3 iiwa_drake.py
...
[2020-03-11 20:49:09.926] [console] [warning] Joints that specify limits are: `iiwa_joint_1`, `iiwa_joint_2`, `iiwa_joint_3`, `iiwa_joint_4`, `iiwa_joint_5`, `iiwa_joint_6`, `iiwa_joint_7`.
works fine before resetting the context
now reset the context
works fine for simple AdvanceTo with no commands
segmentation fault if arm is commanded to move again
Segmentation fault (core dumped)
I'll see what I can learn about the problem.
Using https://drake.mit.edu/python_bindings.html#debugging-with-the-python-bindings instructions and adding in trace, we get:
That points us in a good direction, I think. I suspect that part of the problem is that in
simulator = Simulator(self.diagram)
self.simulator = simulator
context = self.diagram.GetMutableSubsystemContext(
    self.station, simulator.get_mutable_context())
self.station_context = context
self.plant_context = self.diagram.GetMutableSubsystemContext(
    self.plant, self.simulator.get_mutable_context())
However, later when we The important point is that Here's one suggestion for a fix. The method Context.SetTimeStateAndParametersFrom is able to copy data from one Context to another. So instead of Alternatively, you could make sure that all references to sub-contexts are always pointing into the simulator's context. So every call to There is still a bug in pydrake somewhere, though -- accessing the original context's subcontext values in
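The stale sub-context hazard described in the comment above can be sketched without Drake. The toy classes below are purely illustrative: ToyContext, ToySimulator, and all of their methods are hypothetical names, not pydrake APIs. The sketch contrasts swapping the simulator's root context, which leaves a cached sub-context reference pointing into the old tree, with copying values into the existing context in place, which is the role the comment suggests for Context.SetTimeStateAndParametersFrom. In pure Python the stale reference merely shows outdated data; in the C++-backed case discussed in this thread, the old context's memory may already have been freed.

```python
class ToyContext:
    """Toy stand-in for a root context that owns one named sub-context."""

    def __init__(self, time=0.0, q=None):
        self.time = time
        # The sub-context is a mutable dict owned by this root context.
        self.subcontexts = {"plant": {"q": list(q or [0.0])}}

    def clone(self):
        # Deep-enough copy: a new root with a new sub-context dict.
        return ToyContext(self.time, self.subcontexts["plant"]["q"])

    def set_time_and_state_from(self, other):
        # Copy values in place; existing sub-context objects keep their identity.
        self.time = other.time
        for name, sub in other.subcontexts.items():
            self.subcontexts[name]["q"] = list(sub["q"])


class ToySimulator:
    """Toy stand-in for a simulator that owns its root context."""

    def __init__(self):
        self.context = ToyContext()

    def advance_to(self, t):
        self.context.time = t
        self.context.subcontexts["plant"]["q"] = [t]

    def reset_context(self, new_context):
        # Replaces the whole context tree, including every sub-context.
        self.context = new_context


def cached_ref_after_swap():
    sim = ToySimulator()
    plant_context = sim.context.subcontexts["plant"]  # cached once, up front
    sim.advance_to(0.1)
    snapshot = sim.context.clone()
    sim.advance_to(0.2)
    sim.reset_context(snapshot)
    still_current = plant_context is sim.context.subcontexts["plant"]
    return still_current, plant_context["q"]


def cached_ref_after_in_place_copy():
    sim = ToySimulator()
    plant_context = sim.context.subcontexts["plant"]  # cached once, up front
    sim.advance_to(0.1)
    snapshot = sim.context.clone()
    sim.advance_to(0.2)
    sim.context.set_time_and_state_from(snapshot)
    still_current = plant_context is sim.context.subcontexts["plant"]
    return still_current, plant_context["q"]


print(cached_ref_after_swap())           # (False, [0.2]): cached reference is stale
print(cached_ref_after_in_place_copy())  # (True, [0.1]): cached reference tracks the restore
```

The second alternative from the comment, re-fetching every sub-context from the simulator's current context after each reset, would correspond here to reading sim.context.subcontexts["plant"] again instead of reusing plant_context.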
Oh, of course. A call to We even have a scary unit test about this: drake/bindings/pydrake/systems/test/general_test.py Lines 456 to 464 in b2293bc
Given that, I don't even know why we have a binding for @EricCousineau-TRI what do you think about removing that binding? It seems like methods accepting a unique_ptr should generally not be bound using ownership-transfer semantics, since it is super easy to violate the reference counting. |
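As a rough analogy for the ownership-transfer concern raised above, the sketch below uses only Python's standard weakref module. Payload and ExclusiveOwner are hypothetical toy classes and do not model pybind11 or pydrake behavior. The owner holds the only strong reference, loosely mimicking a unique_ptr member, so replacing the payload invalidates any view borrowed earlier. This relies on CPython's immediate reference counting. Python reports the stale access as a ReferenceError; a raw C++ pointer in the same situation gives no such signal.

```python
import weakref


class Payload:
    """Toy stand-in for an object held through an exclusive owner."""

    def __init__(self, value):
        self.value = value


class ExclusiveOwner:
    """Holds the only strong reference to its payload."""

    def __init__(self, payload):
        self._payload = payload

    def borrow(self):
        # A non-owning view; it does not keep the payload alive.
        return weakref.proxy(self._payload)

    def reset(self, payload):
        # Dropping the old payload here invalidates previously borrowed views.
        self._payload = payload


def use_after_reset():
    owner = ExclusiveOwner(Payload(1))
    view = owner.borrow()
    before = view.value
    owner.reset(Payload(2))
    try:
        view.value
    except ReferenceError:
        return before, "stale"
    return before, "alive"


print(use_after_reset())  # (1, 'stale') under CPython
```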
Oh, we added
Sounds familiar! At minimum, if we keep this method around, we need to amend its pydrake doc to explain why it is scary and explain how to use it safely. (Or really, change the binding to use shared_ptr, not unique_ptr, if we must have it.) But if |
WIP of supporting A good next step might be to add a unit test for the |
Just to check, how are the |
If I run the pydrake unit tests in my branch, I see errors that I'm guessing are because I haven't articulated https://pybind11.readthedocs.io/en/stable/advanced/smart_ptrs.html#std-shared-ptr quite correctly, or possibly missing |
Ah, yeah, if those are runtime errors, then yes, it is most likely b/c you need to change the py cls declaration to |
Basically, if any class wants to be used via |
Yup, trying to get shared_ptr working for Context in pydrake is too difficult. My new opinion is that we should keep the |
Do we want to take time to revisit whether (Granted, the overhaul to change the API is probably super high cost at the moment, and less than episodically doing the raw-pointer setup.)
I don't see how runtime penalty plays any part in these trade-offs. The rationale for the current C++ APIs is not that. I think the way to stomp out these issues globally is to disallow binding |
I don't think every one is wrong. It's only wrong when you're able to delete the object in C++ explicitly (e.g. via
Can I ask what the rationale was? Was it purely following GSG, or do we like the exclusive-ownership w/o having to do some sort of registration setup? (If it's the exclusive ownership, it feels like |
I do recall the runtime cost of shared_ptr as being one of the factors in our banning them from common use in Drake. But I think the free-for-all ownership was a bigger factor. |
I am going to make a broader-scoped issue, since that is where this is headed. |
Unassigning myself for now (it's easier to tell responsibility when there's only one owner) |
I am having trouble resetting the context of my simulator object to a different one (potentially from another time step).
Following the pydrake documentation, it seems the right way is the following
When I continue to run after this, I get a segmentation fault with no exception.
My platform:
latest pydrake binary on Mac (downloaded from drake.mit.edu, not local builds)
MacOS Mojave 10.14.6
My particular problem instance deals with the ManipulationStation class, with a set of custom manipulands and initial condition, all loaded via the AddManipulandFromFile function.