"Stop Debugging" variant for terminating entire process tree #230850
Comments
one challenge could be "how do you kill a process tree" / "is it sufficient to just kill the parent". on POSIX systems (macOS, Linux, etc) I think this is just:
on Win32 systems emulating POSIX semantics via Cygwin/MSYS2, killing process trees gently (i.e. emulating POSIX's SIGTERM semantics) is a bit different… I took some notes on this a while ago (.pptx, .pdf):
and found these to be useful references regarding "how do you emulate SIGTERM on Windows": https://stackoverflow.com/questions/48199794/winpty-and-git-bash
but maybe I'm overthinking this, or maybe a simple solution can be developed for POSIX systems before confronting "what to do on Windows". you also have access to the terminal, which probably emulates these semantics already; you could initiate a termination from the terminal, then detach the debugger from all processes.
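for the POSIX case, here is a minimal sketch of one common approach. it rests on an assumption that is not stated anywhere in this thread: that the debuggee is launched as the leader of its own session (and therefore its own process group), so one signal can reach the whole tree. `spawn_in_own_group` and `terminate_tree` are hypothetical helper names for illustration, not VS Code or debugpy APIs. subprocesses that move themselves into a different process group would not be reached.

```python
import os
import signal
import subprocess


def spawn_in_own_group(argv):
    """Start a child as the leader of a new session and process group.

    Assumption: the launcher controls how the debuggee is started.
    """
    # start_new_session=True makes the child call setsid(), so its
    # process group id equals its pid and its descendants inherit it.
    return subprocess.Popen(argv, start_new_session=True)


def terminate_tree(proc, sig=signal.SIGTERM):
    """Send `sig` to every process in the child's process group."""
    try:
        # The child is a group leader, so its pid doubles as the group id.
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        # Every process in the group has already exited.
        pass
```

usage would be along the lines of `p = spawn_in_own_group(["torchrun", ...])` followed later by `terminate_tree(p)`. this only covers POSIX; `os.killpg` is not available on Windows.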
This feature request is now a candidate for our backlog. The community has 60 days to upvote the issue. If it receives 20 upvotes we will move it to our backlog. If not, we will close it. To learn more about how we handle feature requests, please see our documentation. Happy Coding!
The Python debugger can also set |
thanks for that clue; looks like there was previously an effort to implement that, but it seems issue 1320 didn't get resolved.
I'd really like for "Stop Debugging" (or a variant thereof) to terminate the entire process tree.
Debugging multi-process applications is common in Python / pytorch, but to "Stop" my application I need to press Stop per subprocess.
I have to press stop over 10 times to kill a realistic program — because each time I press Stop, it then takes me to the next subprocess, and we have subprocesses for dataloader workers, per-GPU model workers, model compilation workers… and it moves my focus every time. I get further and further from the line of code or problem that made me want to Stop the program, and by the end of the carousel my focus is left inside the torchrun wrapper script, which is never where I'm doing development.
the consequence right now is that I avoid as far as possible running my application in a realistic way. I reduce processes until it's down to just 3, so that I only have to press stop 3 times to terminate the program.
pressing "Restart" basically doesn't work, because a new process cannot be started until all processes from the current run have been killed. so for my most common task "run until I hit a problem, hit restart": I have to hit Stop 3 times then Run, when what I really want is to just hit restart once.
possible solutions:
I think I'd want gentle termination of the subprocesses, but to disconnect the debugger so I don't have to watch the shutdown procedure and all the exceptions that get raised along the way.
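to make "gentle termination" concrete, here is a rough sketch of the usual escalation pattern: ask the whole process group to exit with SIGTERM, wait for a grace period, and only then fall back to SIGKILL. it assumes (my assumption, not something VS Code does today) that the program was started with `start_new_session=True` so that its pid is also its process group id. `stop_group_gently` and `grace_seconds` are hypothetical names, and the debugger-disconnect half of the request is not modelled here.

```python
import os
import signal
import subprocess
import time


def stop_group_gently(proc, grace_seconds=5.0):
    """SIGTERM the child's process group, then SIGKILL any survivors.

    Returns True if escalation to SIGKILL was needed, False otherwise.
    Assumes `proc` was started with start_new_session=True (POSIX only).
    """
    pgid = proc.pid  # a session leader's pid is also its group id
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return False  # nothing left to stop

    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        proc.poll()  # reap the leader if it has exited
        try:
            os.killpg(pgid, 0)  # signal 0 only probes for existence
        except ProcessLookupError:
            return False  # the whole group exited within the grace period
        time.sleep(0.05)

    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        return False  # the group emptied just before escalation
    proc.wait()  # collect the leader's exit status
    return True
```

the grace period gives dataloader and model workers a chance to run their own shutdown handlers, while the SIGKILL fallback keeps a single Stop from hanging on a worker that ignores SIGTERM.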
thanks for any consideration you can give this!