-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to abort VMM when parent process dies #4172
base: main
Are you sure you want to change the base?
Conversation
Hi @jseba, Thanks for the contribution. I have some questions to understand a bit more the context: You send |
I used SIGUSR2 as the signal that firecracker gets from the OS on its parent's death in case there is anything that needs to be done differently on this path (e.g. logging that it's exiting due to the parent exiting) compared to the SIGINT handler |
Thanks for the explanation @jseba.
Just to clarify, the signal does not arrive from the OS, IOW it is not the kernel that sends the signal, it is some monitoring process (I guess), right?
I don't think that reserving a signal number is worth it for the sole purpose of logging the reason why Firecracker was shutdown. Is there any other cleanup/actions you have in mind? |
No, the signal comes from the kernel directly, no other management process is involved. That's why I selected a different signal to differentiate it since otherwise it is indistinguishable from the parent process interrupting it.
The log message was the only thing I could think of off hand behavior wise, but my thought process was that if the parent process is still alive and firecracker gets SIGTERM, even from another process, the parent process will still be running to get the result from |
If Firecracker is being monitored by a parent process that unexpectedly terminates, it will be abandoned up the process tree, likely to a process that doesn't know what do with it (such as init). This becomes even trickier if the process was running in a mount namespace that was controlled by the parent process, as the API socket is now inaccessible. If the parent process was also keeping handles on other resources used by the Firecracker VMM, these could be re-used by new processes and cause conflicts with the now orphaned Firecracker. This adds a flag to set the parent death signal (SIGUSR2 in this instance) that the process will receive when its parent process exits before the VMM does. Receipt of this signal will cause the VMM to abruptly abort, much like the SIGILL signal. While a graceful shutdown would be preferable, since the parent process may have been controlling outside resources for Firecracker (disks, networking, etc.), it's indeterminate whether or not it is safe to continue running the VM. Signed-off-by: Josh Seba <jseba@cloudflare.com>
c1164ce
to
bf00b57
Compare
I fixed the merge conflict and rebased back onto the tip of main |
So, you are saying that when a process dies, Linux sends by itself SIGUSR2 to all its children processes? I was not aware of this mechanism. Could you share a link that documents that?
Like what? What is the worst thing that can happen? And why can this happen without Firecracker's involvement? |
Linux by default sends nothing. The PR_SET_PDEATHSIG option tells it to send the designated signal to the calling process when the calling process's parent dies.
The most obvious causes would be either a crash or an OOM kill from the kernel. Either way, the process terminates immediately with no chance for graceful cleanup. In our case, one of things being managed is the network namespace, device and address used for the Firecracker VM. If the parent process dies, the networking configuration used for the VM will be released to be used by another VM while the now-abandoned one continues running. This means another VM can be started on the host with the network configuration that the VM was using, meaning the abandoned VM will no longer have network connectivity. It is not safe to assume that simply terminating the VM gracefully will work in a timely manner, as services running in the VM could attempt to flush data and hang waiting for responses. |
Thanks, I didn't know about this mechanism :)
The graceful clean-up in this case just logging about the fact we are shutting down due to the parent process dying, other than that it's completely equivalent to just sending a SIGTERM to Firecracker.
That is true, but my point is that I don't think we need to change Firecracker itself to achieve this. In fact, I don't think we should change Firecracker at all since there's nothing that the Firecracker process needs to do other than shutting down. What if, for example, you called |
Changes
This adds a flag to set the parent death signal (SIGUSR2 in this instance) that the process will receive when its parent process exits before the VMM does. Receipt of this signal will cause the VMM to abruptly abort, much like the SIGILL signal. While a graceful shutdown would be preferable, since the parent process may have been controlling outside resources for Firecracker (disks, networking, etc.), it's indeterminate whether or not it is safe to continue running the VM.
Reason
If Firecracker is being monitored by a parent process that unexpectedly terminates, it will be abandoned up the process tree, likely to a process that doesn't know what do with it (such as init). This becomes even trickier if the process was running in a mount namespace that was controlled by the parent process, as the API socket is now inaccessible.
If the parent process was also keeping handles on other resources used by the Firecracker VMM, these could be re-used by new processes and cause conflicts with the now orphaned Firecracker.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
CHANGELOG.md
.TODO
s link to an issue.rust-vmm
.