Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buggy signal handler impl in CommandManager and SubprocessManager #2038

Open
filipcacky opened this issue Sep 17, 2024 · 0 comments
Open

Buggy signal handler impl in CommandManager and SubprocessManager #2038

filipcacky opened this issue Sep 17, 2024 · 0 comments

Comments

@filipcacky
Copy link
Contributor

filipcacky commented Sep 17, 2024

When a CommandManager is created, it sets the interrupt signal handler to self._handle_sigint here, which replaces any previously set handler (see docs).

When running a command synchronously, self._handle_sigint is implemented via asyncio even outside of async context. Sending a kill -INT pid results in

/metaflow/metaflow/runner/subprocess_manager.py:288: RuntimeWarning: coroutine 'CommandManager.kill' was never awaited
  self.cleanup()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

The async function never gets called (except on program exit i think). Also due to blocking impl here, the SubprocessManager.commands doesn't get populated. Due to both of these issues individually, the subprocess doesn't receive any signals. When sending a ctrl-c, the terminal dispatches the signal to all subprocesses (tested on iterm2 with zsh and foot with zsh), this still results in the RuntimeWarning.

In an async context, calling asyncio.create_task without awaiting on it (which is impossible due to a synchronous signal handler), doesn't actually run the task until exit (at least with my testing code). According to asyncio docs, only a signal handler registered via loop.add_signal_handler can interact with the event loop itself. I assume calling asyncio.create_task which internally uses the current event loop is considered an interaction.

Even if that worked, when running multiple commands with async_run_command, each call would replace the previous signal handler, leaving only one active (likely for the last call made).

E: According to posix spec, ctrl-c sends a special INTR character, which Generates a SIGINT signal which is sent to all processes in the foreground process group.

testing scripts:

sleep.py

import time
import signal

def handler(signum, frame):
    print("Signal handler called with signal", signum)
    exit(0)
signal.signal(signal.SIGINT, handler)
signal.signal(signal.SIGTERM, handler)

time.sleep(10)

main.py

import time
import asyncio
from subprocess_manager import SubprocessManager

async def async_main():
    async with SubprocessManager() as spm:
        pid1 = await spm.async_run_command(["python", "./sleep.py"])
        pid2 = await spm.async_run_command(["python", "./sleep.py"])
        pid3 = await spm.async_run_command(["python", "./sleep.py"])

        command_obj1 = spm.get(pid1)
        command_obj2 = spm.get(pid2)
        command_obj3 = spm.get(pid3)

        await command_obj3.wait()
        print(f"time='{time.time()}' p3_pid='{command_obj3.process.pid}'")
        async for pos, line in command_obj3.stream_log("stdout"):
            print(f"p3: {line}")

        await command_obj2.wait()
        print(f"time='{time.time()}' p2_pid='{command_obj2.process.pid}'")
        async for pos, line in command_obj2.stream_log("stdout"):
            print(f"p2: {line}")

        await command_obj1.wait()
        print(f"time='{time.time()}' p1_pid='{command_obj1.process.pid}'")
        async for pos, line in command_obj1.stream_log("stdout"):
            print(f"p1: {line}")

def main():
    spm = SubprocessManager()
    spm.run_command(["python", "./sleep.py"], show_output=True)

if __name__ == "__main__":
    print(f"Starting at {time.time()}")
    asyncio.run(async_main())
    # main()
    print(f"Finished at {time.time()}")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant