shim cleanup on exit #1316
Sad day. Can logrus even write a message after this point? I assume anything that happens in the shim after this call now has no way of being logged anywhere? Maybe that's OK, since it's such a small window now?
It should be able to, since it's normally set up with ETW or a logging binary (via a pipe). Local testing works fine too. I figured errors from closing could safely be ignored, but I can log them if you think they're worthwhile.
Fixed a bug introduced by an earlier PR where graceful shim shutdown prevents containerd from deleting the sandbox bundle: the shim is still using the bundle, so containerd errors because another process is using that directory and the files within it.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Nah, this seems fine.
// alive.
// Change the directory to the parent and close stderr (panic.log) to
// allow bundle deletion to succeed.
if err := os.Chdir(".."); err != nil {
Something in me is screaming that we shouldn't do this, but I don't know why.
8101d00 to b9508da
Rather than changing the directory, have the ttrpc wait until cleanup is done
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Fixed a bug introduced by PR #1289 where graceful shim shutdown prevents containerd from deleting the sandbox bundle: the shim is still using the bundle, so containerd errors because another process is using that directory and the files within it.

containerd waits for the shim to close its ttrpc server before marking it as closed and cleaning up resources, such as the sandbox bundle. However, the shim now shuts down the ttrpc server and then exits gracefully, rather than calling `os.Exit(0)`. This causes a timing error: the shim is still alive, with the sandbox bundle as its working directory and the `panic.log` file within the bundle as its `stderr`, while containerd is attempting to delete the sandbox bundle directory that another process (the shim) is using.

The PR changes the working directory and closes the `stderr` of the shim during exit, before the ttrpc server is closed, so the bundle can be deleted while the shim finishes shutting down.

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>