AWS Batch job control #308
e3888cd to abf7768
Repushed with some minor refinements re: the clarity of code around |
I really like the new --cancel flag, that will be really useful for stopping trial builds 🌟
I'm a little unsure about the expected use of the --detach-on-interrupt option in the context of builds managed by GitHub Actions. This is definitely due to my lack of understanding of signals and how they propagate through GitHub Actions. (Still processing this GitHub community forum and this demo of GitHub Actions signal handling.)
Can definitely test this out myself later, but wanted to leave these questions:
- Would --detach-on-interrupt be used to chain the jobs to work around the 6hr job limit?
- Does canceling a GitHub Action workflow also cancel the attached AWS Batch job?
Yep. See the relevant bit of the WIP (though I've set that down for now given other priorities).
Yes, without In the WIP, I handle workflow run cancellation explicitly and indeed make sure it cancels the AWS Batch job.
\o/
The signal.signal() repetition grated, particularly as I found myself back in this code to add more such calls.
…t exceptions This leads to a cleaner main loop and makes it much easier to add an option for detaching on interrupt instead of stopping. Diff best viewed with whitespace ignored and/or moved lines coloring, as much of the diff is shifting code around.
… TTY If stdin is not a TTY, then it's unlikely (though not impossible) to receive SIGINT due to actual keyboard input (e.g. Ctrl-C) and more likely to be programmatically signaled. Confirmation is an unusual hindrance in that case.
It's sometimes useful to treat Ctrl-C/SIGINT as non-terminating for the remote job, for example in automation contexts (e.g. GitHub Actions) where SIGINT may be sent as part of the automation service's job control system, or even merely as a molly-guard to prevent accidental cancellation when attaching to a build only to observe it.
SIGHUP happens, for example, if the user closes a terminal (or the terminal dies on its own) while our process is still running. It can happen in other cases too. The remote job stays running whether we handle SIGHUP or not, but it's nice to detach cleanly and be explicit about our expectations.
This is friendlier to programmatic usage than attaching in the background and sending a SIGINT. It's also bound to be handy for direct usage by folks.
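For readers following the commit messages above, here is a minimal Python sketch of the ideas they describe: registering one handler for several signals instead of repeating signal.signal() calls, turning signals into an exception for the main loop, and choosing between detaching, cancelling, and asking for confirmation. It is not the code from this PR. The names Interrupted, raise_interrupted, install_handlers, and action_for are hypothetical, and the policy in action_for is an assumption pieced together from the messages above. It assumes a POSIX platform, since SIGHUP is not available everywhere.

```python
import signal


class Interrupted(Exception):
    """Hypothetical exception carrying the signal that interrupted the main loop."""

    def __init__(self, signum):
        super().__init__(signal.Signals(signum).name)
        self.signum = signum


def raise_interrupted(signum, frame):
    """Signal handler: turn the signal into an exception for the main loop."""
    raise Interrupted(signum)


def install_handlers(signums=(signal.SIGINT, signal.SIGHUP)):
    """Register one handler for several signals (POSIX; main thread only).

    Returns the previous handlers so a caller can restore them later.
    """
    return {s: signal.signal(s, raise_interrupted) for s in signums}


def action_for(signum, detach_on_interrupt, stdin_is_tty):
    """Decide what to do with the remote job after a signal (assumed policy)."""
    if signum == signal.SIGHUP:
        # The terminal went away: leave the remote job running, detach cleanly.
        return "detach"
    if signum == signal.SIGINT:
        if detach_on_interrupt:
            # Treat Ctrl-C/SIGINT as non-terminating for the remote job.
            return "detach"
        # Only ask for confirmation when a person is plausibly at the keyboard.
        return "confirm-then-cancel" if stdin_is_tty else "cancel"
    raise ValueError(f"unhandled signal: {signum}")
```

A main loop built this way might call install_handlers() once, wrap its polling in try/except Interrupted, and pass sys.stdin.isatty() as stdin_is_tty when deciding what to do.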
abf7768 to d2375d1
Repushed to add suggested clarification to option help and resolve conflicts in the changelog by rebasing onto latest |
Yeah, it's kind of a mess. Signal use/propagation in GitHub Actions is lightly documented but not comprehensively. We can do our own propagation if necessary, e.g. to signal the whole process group instead of just the leader. I did this in previous WIP prototypes.
Realizing this "yes" might still be modulated by signal propagation.
Though to be fair to GitHub Actions, signalling cancellation in any system can be messy, esp. when there's support for graceful cancellation, and hard to simplify.
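To illustrate the "signal the whole process group instead of just the leader" idea mentioned above, here is a rough Python sketch. It is not taken from the WIP prototypes. The helper names signal_process_group and demo are hypothetical, the child command is a stand-in, and it assumes a POSIX system where the child was started in its own session (and so leads its own process group).

```python
import os
import signal
import subprocess
import sys


def signal_process_group(proc, signum=signal.SIGINT):
    """Send signum to every process in proc's process group (POSIX only)."""
    os.killpg(os.getpgid(proc.pid), signum)


def demo():
    """Start a stand-in child in its own session, then signal its whole group."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=True,  # child becomes leader of a new process group
    )
    # SIGTERM is used here only to keep the demo's outcome simple to check.
    signal_process_group(child, signal.SIGTERM)
    return child.wait(timeout=10)
```

Starting the child in a new session keeps the group separate from the wrapper's own, so signalling the group does not also signal the wrapper.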
See commit messages for details, and also the new changelog entries.
This work is motivated by deeper integration for GitHub Actions workflows that launch builds as AWS Batch jobs.