-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race with launch context changes when loading composable nodes #166
Conversation
This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a regression test?
@jacobperron Does this fix #114? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix and +1 to having a regression test.
Will do.
I had completely forgotten about that issue. I'll check to see if it fixes it. |
So, it appears this PR fixes the example mentioned in #114 (comment), which involves setting the namespace, but not the original issue that involves setting parameters on the node. I'll look into if a similar fix to the one proposed in this PR can be used to fix #114. |
That's weird, because |
Is there a difference between "global" params (aka launch config |
hmm, but it's not the parameters that are being dropped, rather a user-defined launch configuration "bar_arg". I can do some more debugging. |
Signed-off-by: Jacob Perron <jacob@openrobotics.org>
I've added some regression tests in 719ced8. There's still a bug related to parameters (and some other corner-cases); I'll open up separate PRs for those. |
test_launch_ros/test/test_launch_ros/actions/test_load_composable_nodes.py
Outdated
Show resolved
Hide resolved
test_launch_ros/test/test_launch_ros/actions/test_load_composable_nodes.py
Outdated
Show resolved
Hide resolved
assert get_node_name_count(context, f'/{TEST_CONTAINER_NAME}') == 1 | ||
|
||
# Check the remaps in load service request | ||
assert len(load_composable_nodes_action._LoadComposableNodes__load_node_requests) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the sent request could be a property, so we don't have to check a private member variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed a similar pattern that is used in test_node.py.
I think a user should not have access to that information; it seems like an implementation detail subject to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed a similar pattern that is used in test_node.py.
Yeah, it's not a great pattern, but acceptable I guess.
I think a user should have access to that information; it seems like an implementation detail subject to change.
That sounds contradictory, I guess you meant "should not".
I prefer not testing implementation details, but I don't feel strongly in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry, I meant "should not" (updated my comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the ideal way to write tests in this situation would be to write a mock component container and assert things about it (e.g. if the correct service request was received). It adds a bit more to the test complexity, but could be better long-term. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think?
Sounds like a good proposal to me, though I don't mind approving this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and refactored the tests to use a mock object. I think it's much nicer. PTAL 5d9dcf3
Signed-off-by: Jacob Perron <jacob@openrobotics.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Jacob Perron <jacob@openrobotics.org>
This was only needed for the tests before the refactor. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
assert request.parameters[0].value.type == ParameterType.PARAMETER_STRING | ||
assert request.parameters[0].value.string_value == 'test_value_param1' | ||
assert request.parameters[1].name == 'test.param2' | ||
# TODO(jacobperron): I would expect this to be a double value, but we can only pass strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take a look at resolving this in a follow-up.
ugh, there's an issue with the tests where they hang sometimes... I can reproduce locally. I think there's an issue shutting down the launch context before LoadComposableNodes has had a chance to make a service request. |
I've discovered a race in launch (reported in #169), with a proposed fix in ros2/launch#449. With the proposed fix, the tests in the PR pass, otherwise we can expect them to hang forever. |
There's no need to run asynchronously and manually send a shutdown signal. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
782776c
to
d57bc8d
Compare
I understand the race, but what is causing it here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me for getting the right values when the action is executed.
Sadly, it still seems like there's a race somewhere causing the tests to hang somteimes 😞 |
Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
Note, I don't expect the PR CI job to pass until ros2/launch#449 is released. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I didn't realize that shutdown wasn't done properly
test_launch_ros/test/test_launch_ros/actions/test_load_composable_nodes.py
Outdated
Show resolved
Hide resolved
@ros-pull-request-builder retest this please |
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org>
…odes (#166) (#206) * Fix race with launch context changes when loading composable nodes (#166) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Fix docblock in LoadComposableNodes (#207) Signed-off-by: Jacob Perron <jacob@openrobotics.org>
) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Backporting note: As far as I can tell, the *Composable* launch_ros classes in Dashing are not setup to be able to be tested. We would need to add a bunch of additional dependencies to write tests here. I didn't think that was worth it for Dashing, so I just removed the tests. Signed-off-by: Jacob Perron <jacob@openrobotics.org> Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
) (#238) * Fix race with launch context changes when loading composable nodes This bug was discovered when trying load composable nodes from a GroupAction. The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished. But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes. Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Add regression tests for LoadComposableNode Signed-off-by: Jacob Perron <jacob@openrobotics.org> * Properly shutdown mock conatiner node Also added some debug logs to the load node action for posterity. Backporting note: As far as I can tell, the *Composable* launch_ros classes in Dashing are not setup to be able to be tested. We would need to add a bunch of additional dependencies to write tests here. I didn't think that was worth it for Dashing, so I just removed the tests. Signed-off-by: Jacob Perron <jacob@openrobotics.org> Signed-off-by: Chris Lalancette <clalancette@openrobotics.org> Co-authored-by: Jacob Perron <jacob@openrobotics.org>
This bug was discovered when trying load composable nodes from a GroupAction.
The ROS namespace (and presumably other remaps) pushed onto the context stack was popped after the LoadComposableNodes execute() function finished.
But because the loading happens asynchronously, we need to make sure we get the necessary information from the context before execute() finishes.