store: serialize container deletion #1722
EnsureRemoveAll already implements the trivial rm -rf attempt first, so there is no need to try it before calling EnsureRemoveAll.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Serialize the call to containersStore.Delete(id), since it attempts to recursively remove the graphroot directory.
In doing so, it undermines the other goroutine, which attempts to remove the graphroot safely with EnsureRemoveAll. Depending on which goroutine is faster, there can be a flake like:
2023-09-26T17:49:02.6708666Z stderr: Error: cleaning up storage: removing container 6ebff2c6c6f5fe78c158956a88467ef7af6f6a7c3d40334d248c7b7409341230 root filesystem: 1 error occurred: * unlinkat /var/tmp/podman_test3482607530/root/overlay-containers/6ebff2c6c6f5fe78c158956a88467ef7af6f6a7c3d40334d248c7b7409341230/userdata/shm: device or resource busy
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@edsantiago this could be the reason for the ".*/userdata/shm: device or resource busy" flake we see occasionally
@vrothberg PTAL
Code LGTM
Could you open a test-PR against Podman?
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: giuseppe, vrothberg
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Sure, opened here: containers/podman#20163
I have a feeling you will make @edsantiago very very very happy with this fix. The flakes haunted us for so long.
LGTM
OMG thank you so much @giuseppe!
Serialize the call to containersStore.Delete(id), since it attempts to recursively remove the graphroot directory.
In doing so, it undermines the other goroutine, which attempts to remove the graphroot safely with EnsureRemoveAll. Depending on which goroutine is faster, there can be a flake like:
2023-09-26T17:49:02.6708666Z stderr: Error: cleaning up storage: removing container 6ebff2c6c6f5fe78c158956a88467ef7af6f6a7c3d40334d248c7b7409341230 root filesystem: 1 error occurred: * unlinkat /var/tmp/podman_test3482607530/root/overlay-containers/6ebff2c6c6f5fe78c158956a88467ef7af6f6a7c3d40334d248c7b7409341230/userdata/shm: device or resource busy
It is a difficult-to-trigger condition, but I am hitting it constantly in containers/crun#1312
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>