Cannot cleanly restart a Child task without restarting autopilot #168
Comments
success?? I think that adding along with the aforementioned It seems perhaps there are some dangling sockets left open, and this is causing problems on subsequent session starts. I'll continue testing and if this works I would probably suggest adding
Awesome! yes i have been super sloppy with closing stuff, agree this is super annoying, will revisit after this current release
Figured out a reliable way to kill the IOLoop objects that keep the networking objects open. It's sort of a careful dance of trying to hold threads and processes open with a blocking self.loop.add_callback(lambda:IOLoop.instance().stop()) you can reliably kill them. You're right, also need to close the sockets. I'll work on cleaning this up in v0.5.0, which i'm preparing now.
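For readers unfamiliar with the pattern, here is a minimal sketch of stopping an event loop that runs in another thread by scheduling the stop on the loop itself. It uses the standard library's asyncio rather than Tornado's IOLoop, so it is only an analogy to the add_callback approach mentioned above and not the code that went into autopilot; the names run_loop and worker are hypothetical.

```python
import asyncio
import threading


def run_loop(loop):
    """Run the event loop in this thread until something asks it to stop."""
    asyncio.set_event_loop(loop)
    try:
        loop.run_forever()
    finally:
        # Closing the loop releases its internal resources once it has stopped.
        loop.close()


loop = asyncio.new_event_loop()
worker = threading.Thread(target=run_loop, args=(loop,), daemon=True)
worker.start()

# Calling loop.stop() directly from this thread is not thread-safe.
# Instead, hand the stop request to the loop so it runs in the loop's own thread.
loop.call_soon_threadsafe(loop.stop)

# Block until the loop thread has actually exited.
worker.join(timeout=5)
stopped_cleanly = not worker.is_alive()
```

The point of the design is that the shutdown request is executed by the loop's own thread, and the caller then blocks on join() so that nothing is torn down while the loop is still running.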
a few other things I discovered that seem to be important for cleanly shutting down, at least over here. I haven't extensively tested them yet, but I'm just mentioning it now, in case it rings a bell. Now I can cleanly shut down the terminal, and I never have to restart the autopilot process on the Pi between sessions, which wasn't the case for me before in our setup. I can test these out and make PRs after 0.5.0 too.
yes i just arrived at the same changes ;). working my way through some more of the eternal technical debt, thanks for the pointers as well, very helpful.
For a while I've been using a Parent Pi connected to multiple Child Pi (as discussed here: #101). This works, but with the annoying problem that the autopilot process on each Child has to be manually quit and restarted between every session, or otherwise we get a ZMQError. I think the Child is not stopping correctly.
Full code for a minimal example (but see also relevant excerpts below if that is easier):
The Parent task: https://github.com/Rodgers-PAC-Lab/autopilot/blob/paft2022/autopilot/tasks/paft_parent_child.py
The Child task: https://github.com/Rodgers-PAC-Lab/autopilot/blob/b2105803b7bd4e36e39ac1a6f662ec93f0c85c96/autopilot/tasks/children.py#L35
Here is an excerpt of the code in __init__ for the Parent task. The first Net_Node tells the Child to start, and the second Net_Node is used for bidirectional communication with the Child.
Here is the code in __init__ for the Child task. This Net_Node is used to communicate with the Net_Node of the same name in the Parent.
All of the above actually works! However, if we stop the session and start a new one, then the Parent generates an exception when we try to create self.node2 again:
zmq.error.ZMQError: Address already in use
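This error is the usual symptom of a bound socket that was never closed. As a rough illustration only, the sketch below reproduces the same failure mode with plain standard-library TCP sockets rather than ZeroMQ or Net_Node: a second bind to a port that is still held fails, and it succeeds once the first socket is closed. The helper bind_listener is hypothetical and not part of autopilot.

```python
import socket


def bind_listener(port):
    """Bind a listening TCP socket on localhost; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        # Do not leak the half-initialised socket if the bind fails.
        sock.close()
        raise
    return sock


# The first "session" binds a port and, like the un-released node, keeps it open.
first = bind_listener(0)
port = first.getsockname()[1]

# The second "session" tries to bind the same port while the first still holds it.
try:
    bind_listener(port)
    rebind_error = None
except OSError as exc:
    rebind_error = exc  # typically "Address already in use"

# Explicitly closing the first socket frees the port for the next session.
first.close()
second = bind_listener(port)
rebound_port = second.getsockname()[1]
second.close()
```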
I've come up with two semi-fixes, neither of which really fixes the whole problem. First, to fix the ZMQError, I have this line in the end() function of the Parent task:
self.node2.router.close()
After this fix, the ZMQError stops happening. Note that it's not sufficient to have self.node2.release(), so perhaps closing the router needs to be added as a step in releasing a Net_Node.
Second, I explicitly tell the Child class to end, by adding this line in the end() function of the Parent task:
self.node.send(to=prefs.get('NAME'), key='CHILD', value={'KEY': 'STOP'})
Does this look right? I think this is probably a good thing to do, though I don't see it in the example GoNoGo/Wheel_Child code.
Nonetheless, even after these fixes, I haven't yet succeeded in starting a Child task again without rebooting the autopilot process. Any tips would be much appreciated!!
PS - I think we could also test/debug this in the GoNoGo/Wheel_Child standard task, but that one is not running for me, not sure if I am missing some hardware or what. It says something about a KeyError 'F' in initting some gpio pins.