Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to quit ros2 launch via SIGINT #495

Closed
felixdivo opened this issue Mar 27, 2021 · 8 comments
Closed

Failure to quit ros2 launch via SIGINT #495

felixdivo opened this issue Mar 27, 2021 · 8 comments
Labels
help wanted Extra attention is needed

Comments

@felixdivo
Copy link

felixdivo commented Mar 27, 2021

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • binary: ros-foxy-desktop
  • Version or commit hash:
    • foxy, most recent
  • DDS implementation:
    • default
  • Client library (if applicable):
    • Python code, but no client library is used directly

Steps to reproduce issue

  1. Create simple launch file (any seems to do, but this one will work right away if the examples are installed):
#!/usr/bin/env python3

# Node launching
from launch import LaunchDescription
from launch_ros.actions import Node


def generate_launch_description() -> LaunchDescription:
    return LaunchDescription(
        [
            Node(
                package="examples_rclpy_minimal_publisher",
                executable="publisher_local_function",
                name="minimal_node",
            ),
        ]
    )
  1. Create a simple script to start the launch file problematically. I need this for automation of a test suite.
from signal import SIGINT
from subprocess import Popen
from time import sleep

# Do not set `stdout=PIPE, stderr=STDOUT`, instead inherit from parent
launch_file = "example_launch_common.py"
process = Popen(["ros2", "launch", launch_file, "--debug"], text=True)
# Give the launch process time to start up
sleep(3.0)

# Signal the child launch process to finish; like Ctrl+C
process.send_signal(SIGINT)
# Might raise a TimeoutExpired if it takes too long
return_code = process.wait(timeout=10)
print(f"return_code: {return_code}")

Expected behavior

The node is gracefully shut down (in probably way less than a second after issuing the SIGINT) and the exit code is zero.

Actual behavior

The node fails to exit and must be killed, leaving an orphan process and taking quite some time:

[ERROR] [publisher_local_function-1]: process[publisher_local_function-1] failed to terminate '5' seconds after receiving 'SIGINT', escalating to 'SIGTERM'

Also, why is the exit code zero in this case? I could not find any documentation on the exit codes.

Additional information

Using the launch Python API causes the same problem as the Popen approach above.

When calling the launch file via ros2 launch example_launch_common.py -d from the bash console and exiting via Ctrl+C, everything works fine. This means that in such a case, the node reacts to the shutdown signal, it shuts down immediately and the exit code of the launch command is zero.

@hidmic
Copy link
Contributor

hidmic commented Mar 30, 2021

This is equivalent to #473. It was addressed in Rolling by allowing non-interactive ros2 launch execution (see #475 and ros2/launch_ros#210). It has not been backported to Foxy though (as usual for new features). I'll check with Foxy ROS Boss.

In the meantime, sending SIGINT to the process group should do.

@hidmic hidmic added the help wanted Extra attention is needed label Mar 30, 2021
@felixdivo
Copy link
Author

This problem was already fixed by the PRs mentioned above. A backport to foxy was briefly discussed in #475 is now proposed in #500 and in ros2/launch_ros#225.

@felixdivo
Copy link
Author

Thanks @hidmic!

@felixdivo
Copy link
Author

Sadly, this does not seem to be fixed by the backport. I have now installed version 0.11.2-1focal.20210423.012808 of ros-foxy-launch-ros.

Adding "--noninteractive" to above Popen() call still does not change resolve the problem. The execution now results in:

[DEBUG] [launch.launch_context]: emitting event synchronously: 'launch.events.IncludeLaunchDescription'
[INFO] [launch]: All log files can be found below [...]
[INFO] [launch]: Default logging verbosity is set to DEBUG
[DEBUG] [launch]: processing event: '<launch.events.include_launch_description.IncludeLaunchDescription object at 0x7f5bc5d899d0>'
[DEBUG] [launch]: processing event: '<launch.events.include_launch_description.IncludeLaunchDescription object at 0x7f5bc5d899d0>' ✓ '<launch.event_handlers.on_include_launch_description.OnIncludeLaunchDescription object at 0x7f5bc5d65100>'
[WARNING] [launch_ros.actions.node]: Parameter file path is not a file: parameters.yaml
Executing <Task finished name='Task-2' coro=<LaunchService._process_one_event() done, defined at /opt/ros/foxy/lib/python3.8/site-packages/launch/launch_service.py:274> result=None created at /opt/ros/foxy/lib/python3.8/site-packages/launch/launch_service.py:365> took 0.151 seconds
[INFO] [turtlesim_node-1]: process started with pid [30076]
[DEBUG] [launch.launch_context]: emitting event: 'launch.events.process.ProcessStarted'
[DEBUG] [launch]: processing event: '<launch.events.process.process_started.ProcessStarted object at 0x7f5bc5103bb0>'
[DEBUG] [launch.launch_context]: emitting event synchronously: 'launch.events.process.ProcessStderr'
[DEBUG] [launch]: processing event: '<launch.events.process.process_stderr.ProcessStderr object at 0x7f5bc5d89ca0>'
[DEBUG] [launch]: processing event: '<launch.events.process.process_stderr.ProcessStderr object at 0x7f5bc5d89ca0>' ✓ '<launch.event_handlers.on_process_io.OnProcessIO object at 0x7f5bc5103a00>'
[turtlesim_node-1] [INFO] [1620142058.595883133] [turtlesim_node]: Starting turtlesim with node name /turtlesim_node
[DEBUG] [launch.launch_context]: emitting event synchronously: 'launch.events.process.ProcessStderr'
[DEBUG] [launch]: processing event: '<launch.events.process.process_stderr.ProcessStderr object at 0x7f5bc5d65310>'
[DEBUG] [launch]: processing event: '<launch.events.process.process_stderr.ProcessStderr object at 0x7f5bc5d65310>' ✓ '<launch.event_handlers.on_process_io.OnProcessIO object at 0x7f5bc5103a00>'
[turtlesim_node-1] [INFO] [1620142058.613033429] [turtlesim_node]: Spawning turtle [turtle1] at x=[5,544445], y=[5,544445], theta=[0,000000]
[WARNING] [launch]: user interrupted with ctrl-c (SIGINT)
[DEBUG] [launch.launch_context]: emitting event synchronously: 'launch.events.Shutdown'
Traceback (most recent call last):
  File "scratch_github.py", line 14, in <module>
    return_code = process.wait(timeout=10)
  File "/usr/lib/python3.8/subprocess.py", line 1079, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.8/subprocess.py", line 1796, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['ros2', 'launch', '[...]/example_launch.py', '--noninteractive', '--debug']' timed out after 10 seconds

In words, SIGINT still does not quit the launch process.

However, specifically turtlesim opens a GUI window, and I can close it afterwards. Then, the following gets printed to the console (since the process' STDOUT is still attached to the terminal):

[DEBUG] [launch]: processing event: '<launch.events.shutdown.Shutdown object at 0x7f5bc518f280>'
[DEBUG] [launch]: processing event: '<launch.events.shutdown.Shutdown object at 0x7f5bc518f280>' ✓ '<launch.event_handlers.on_shutdown.OnShutdown object at 0x7f5bc5103a60>'
[DEBUG] [launch.launch_context]: emitting event synchronously: 'launch.events.process.SignalProcess'
[DEBUG] [launch]: processing event: '<launch.events.shutdown.Shutdown object at 0x7f5bc518f280>' ✓ '<launch.event_handlers.on_shutdown.OnShutdown object at 0x7f5bc5d6cf70>'
[DEBUG] [launch]: processing event: '<launch.events.process.signal_process.SignalProcess object at 0x7f5bc518f880>'
[DEBUG] [launch]: processing event: '<launch.events.process.signal_process.SignalProcess object at 0x7f5bc518f880>' ✓ '<launch.event_handler.EventHandler object at 0x7f5bc51039a0>'
[INFO] [turtlesim_node-1]: sending signal 'SIGINT' to process[turtlesim_node-1]
[INFO] [turtlesim_node-1]: process has finished cleanly [pid 30076]
[DEBUG] [launch.launch_context]: emitting event: 'launch.events.process.ProcessExited'
[DEBUG] [launch]: processing event: '<launch.events.process.process_exited.ProcessExited object at 0x7f5bc5103e20>'
[DEBUG] [launch]: processing event: '<launch.events.process.process_exited.ProcessExited object at 0x7f5bc5103e20>' ✓ '<launch.event_handlers.on_process_exit.OnProcessExit object at 0x7f5bc5103be0>'
[DEBUG] [launch]: processing event: '<launch.events.process.process_exited.ProcessExited object at 0x7f5bc5103e20>' ✓ '<launch.event_handlers.on_process_exit.OnProcessExit object at 0x7f5bc5103a90>'

Any ideas on what's happening here?

@felixdivo felixdivo reopened this May 4, 2021
@hidmic
Copy link
Contributor

hidmic commented May 4, 2021

It appears to be hanging when emitting an event from within the signal handler. Might be related to #476.

@felixdivo
Copy link
Author

This would be fixed by #511

@hidmic
Copy link
Contributor

hidmic commented Oct 27, 2021

I believe this has been fixed since. Closing.

@hidmic hidmic closed this as completed Oct 27, 2021
@felixdivo
Copy link
Author

@hidmic Indeed: It works locally on my setup and also in some different containers in with some other code, so it seems to be fixed once and for all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants