This repository has been archived by the owner on May 12, 2021. It is now read-only.

runtime: Fix /var/lib/vc/sbs/${sid} dir residual #2922

Merged
merged 1 commit into from
Oct 17, 2020

Conversation

keloyang
Contributor

@keloyang keloyang commented Sep 1, 2020

Fixes: #2921

The runtime calls fetchSandbox --> loadSandboxConfigFromOldStore --> store.NewVCSandboxStore --> ... f.initialize --> os.MkdirAll(f.path, DirMode),
which ultimately creates /var/lib/vc/sbs/${sid}, but that directory is never removed when the sandbox is deleted.

Signed-off-by: Shukui Yang <keloyangsk@gmail.com>

@cmaf

cmaf commented Sep 8, 2020

/test-ubuntu

Member

@fidencio fidencio left a comment

@keloyang, I'm not sure if I understand this fix.
Would you mind elaborating on why it's needed? Please also add the details to the commit message.

@codecov

codecov bot commented Sep 8, 2020

Codecov Report

Merging #2922 into master will decrease coverage by 1.43%.
The diff coverage is 28.57%.

@@            Coverage Diff             @@
##           master    #2922      +/-   ##
==========================================
- Coverage   51.44%   50.00%   -1.44%     
==========================================
  Files         118      118              
  Lines       17428    15559    -1869     
==========================================
- Hits         8966     7781    -1185     
+ Misses       7379     6717     -662     
+ Partials     1083     1061      -22     

@keloyang
Contributor Author

keloyang commented Sep 9, 2020

@fidencio I have updated the issue #2921, PTAL, thanks.

@jodh-intel jodh-intel added the port-to-2.0 PRs that need to be ported to kata 2.0-dev branch label Sep 9, 2020
@jodh-intel
Contributor

@keloyang - Thanks for raising! A couple of things:

@bergwolf, @sameo - could you take a look at this change please?

@fidencio
Member

fidencio commented Sep 9, 2020

Let me give it a try on my environment here, thanks for the update @keloyang.

@amshinde
Member

amshinde commented Sep 9, 2020

@keloyang Thanks for the fix! LGTM. Could you please add a unit/integration test verifying this behaviour?

@amshinde amshinde added the needs-forward-port Changes need to be applied to a newer branch / repository label Sep 9, 2020
@keloyang
Contributor Author

@amshinde @jodh-intel unit test case added, ptal, thanks.

Member

@fidencio fidencio left a comment

I'd like to ask @amshinde, @jodh-intel, to not merge this one yet!

I want to try to reproduce this one and will most likely find some time this evening to do so.
There are a few things puzzling me here, which I'm commenting on inline.

@@ -837,7 +837,10 @@ func (s *Sandbox) Delete() error {
}

s.agent.cleanup(s)

vcStore, err := store.NewVCSandboxStore(s.ctx, s.id)
Member

Do we really have to create a new vcStore or should we just use s.store?


vcStore, err := store.NewVCSandboxStore(s.ctx, s.id)
if err == nil {
vcStore.Delete()
Member

We definitely should check for errors here, at least to log them.

Contributor

@jodh-intel jodh-intel left a comment

Thanks @keloyang - just one comment.

assert.NoError(err)

// expect runtimSidPath not exist, if exist, it means this case failed.
if _, err := os.Stat(runtimSidPath); err == nil || os.IsExist(err) {
Contributor

I think it would be clearer to write this as separate tests:

_, err := os.Stat(runtimSidPath)
assert.Error(err)
assert.True(os.IsNotExist(err))

@jodh-intel jodh-intel added the do-not-merge PR has problems or depends on another label Sep 10, 2020
@jodh-intel
Contributor

@fidencio - agreed - added dnm for now.

@keloyang
Contributor Author

keloyang commented Sep 11, 2020

Do we really have to create a new vcStore or should we just use s.store?

Thanks for your review.

The runtime sets s.store only when useOldStore(ctx) is true (see https://github.com/kata-containers/runtime/blob/master/virtcontainers/sandbox.go#L573). That requires loadSandboxConfigFromOldStore to return a nil error, but it returns an error when the old store doesn't exist. So every time fetchSandbox is called, a vcStore is created while s.store stays nil, which means we can't simply use s.store.
loadSandboxConfigFromOldStore exists for backward compatibility, and the old store has been deprecated, so it's better to call loadSandboxConfig before loadSandboxConfigFromOldStore. I have pushed a new change for this, PTAL. @fidencio @jodh-intel

Member

@fidencio fidencio left a comment

@keloyang,

I've made some comments inline, but I must admit I'm failing to reproduce the issues.

Exactly which version of the runtime are you using to reproduce it?

@@ -728,14 +728,14 @@ func fetchSandbox(ctx context.Context, sandboxID string) (sandbox *Sandbox, err

var config SandboxConfig

// Try to load sandbox config from old store at first.
c, ctx, err := loadSandboxConfigFromOldStore(ctx, sandboxID)
// Try to load sandbox config from new store at first.
Member

@keloyang, what was the reason for inverting the logic here?
I think we should still try to load from the old store first.

Contributor Author

The root cause of the residue is that loadSandboxConfigFromOldStore is called even though the old store file does not exist. If the sandbox is loaded from the new store successfully, there is no need to call loadSandboxConfigFromOldStore.

Member

@amshinde, @egernst,

Let me ask some more experienced people for some help.

I think we still should try to load the old store first.

In that very same code path we call createSandbox(), which calls newSandbox(), and there, depending on whether we use the old store or not, we'd call store.NewVCSandboxStore().

My question here is, why do we only call that in one case?

Member

@jodh-intel, maybe you can help me with this one? ^^

Member

@amshinde, @jodh-intel, @egernst,
ping about this question.

Member

@fidencio This patch makes sense to me. We have kept the old store code around for backward compatibility. But by checking for the old store first, trying to initialize one, and only then handling the failure, we were inadvertently creating the directory structure under /var/lib/vc/sbs/{sandbox_id} and leaving it around after the sandbox is deleted.
It makes sense to me to check for the new store first instead. I tested this patch and it works as expected.

Member

ack then!

@@ -837,7 +837,11 @@ func (s *Sandbox) Delete() error {
}

s.agent.cleanup(s)

if useOldStore(s.ctx) && s.store != nil {
Member

@keloyang what's the reason for only doing this for an old store?

Contributor Author

useOldStore returns true only when loadSandboxConfigFromOldStore has returned successfully, so I added the useOldStore check here, but it's not necessary.

@keloyang
Contributor Author

@fidencio please show me how you tried to reproduce it. I use the master branch, and it's easy to reproduce.

[root@centos1 ~]# ls  /var/lib/vc/sbs/                                               
[root@centos1 ~]# docker run --rm -ti --runtime untrusted-runtime 018c9d7b792b echo; ls  /var/lib/vc/sbs/

15760f7e07c98e4f4ed9d7aba4076de14475dd71a61be9919b24037c810b37aa
[root@centos1 ~]# docker run --rm -ti --runtime untrusted-runtime 018c9d7b792b echo; ls  /var/lib/vc/sbs/

15760f7e07c98e4f4ed9d7aba4076de14475dd71a61be9919b24037c810b37aa  cd86011f6c0b779a2e3f40b49fa06463757ac1413ba70007e14462b7d0782e24
[root@centos1 ~]# docker run --rm -ti --runtime untrusted-runtime 018c9d7b792b echo; ls  /var/lib/vc/sbs/

15760f7e07c98e4f4ed9d7aba4076de14475dd71a61be9919b24037c810b37aa  93fb2abfbe151c8f12e3b2b7f24321479b2425af80acb93f2f4c2263fb679621  cd86011f6c0b779a2e3f40b49fa06463757ac1413ba70007e14462b7d0782e24

@keloyang
Contributor Author

ping @jodh-intel @amshinde @fidencio

@jodh-intel
Contributor

@keloyang - please check the CI's at the end of your PR. Travis is failing on this one too:

virtcontainers/sandbox.go:840: File is not `gofmt`-ed with `-s` (gofmt)

@fidencio
Member

@fidencio please show me how you tried to reproduce it. I use the master branch, and it's easy to reproduce.

[root@centos1 ~]# ls  /var/lib/vc/sbs/                                               
[root@centos1 ~]# docker run --rm -ti --runtime untrusted-runtime 018c9d7b792b echo; ls  /var/lib/vc/sbs/

15760f7e07c98e4f4ed9d7aba4076de14475dd71a61be9919b24037c810b37aa
[root@centos1 ~]# docker run --rm -ti --runtime untrusted-runtime 018c9d7b792b echo; ls  /var/lib/vc/sbs/

15760f7e07c98e4f4ed9d7aba4076de14475dd71a61be9919b24037c810b37aa  cd86011f6c0b779a2e3f40b49fa06463757ac1413ba70007e14462b7d0782e24
[root@centos1 ~]# docker run --rm -ti --runtime untrusted-runtime 018c9d7b792b echo; ls  /var/lib/vc/sbs/

15760f7e07c98e4f4ed9d7aba4076de14475dd71a61be9919b24037c810b37aa  93fb2abfbe151c8f12e3b2b7f24321479b2425af80acb93f2f4c2263fb679621  cd86011f6c0b779a2e3f40b49fa06463757ac1413ba70007e14462b7d0782e24

I was able to reproduce the issue, and I raised a comment on a part that I don't clearly understand. Let's wait until the other developers weigh in on that.

Every time a Kata container is created and deleted, a new directory
named after the ${sandbox-id} is left behind under /var/lib/vc/sbs/,
e.g.
d3e0482b22b9e25cd3268608b12ab8c1eb666960c4fa9a6a72a3e4d0b1606551

Fixes: #2921

Signed-off-by: Shukui Yang <keloyangsk@gmail.com>
@keloyang
Contributor Author

virtcontainers/sandbox.go

Updated, and sorry for not running gofmt. Thanks @jodh-intel

Contributor

@jodh-intel jodh-intel left a comment

Thanks @keloyang.

lgtm

@jodh-intel
Contributor

/test

@jodh-intel jodh-intel removed the do-not-merge PR has problems or depends on another label Sep 22, 2020
@jodh-intel
Contributor

/test-ubuntu-qemu-metrics

@jodh-intel
Contributor

@keloyang - I've restarted the failing metrics CI to see if this PR now passes. Please can you port this to 2.0 (https://github.com/kata-containers/kata-containers), or add a link to the 2.0 PR here if already done?

@amshinde
Member

amshinde commented Oct 2, 2020

I tested a previous version of this PR and would like to take another look before this is merged.

@amshinde
Member

amshinde commented Oct 7, 2020

lgtm @fidencio. I think this one is ready to be merged.

@keloyang
Contributor Author

I don't think this needs to be ported to 2.0-dev, because fetchSandbox there does not call loadSandboxConfigFromOldStore; see https://github.com/kata-containers/kata-containers/blob/2.0-dev/src/runtime/virtcontainers/sandbox.go#L650.
ping @jodh-intel @amshinde @fidencio

@fidencio
Member

/test-centos

@fidencio
Member

/test-rhel

@amshinde amshinde merged commit 87d215e into kata-containers:master Oct 17, 2020
@jodh-intel
Contributor

@keloyang - Regarding #2922 (comment), I think you are correct, so I'm removing the needs-forward-port label...

@jodh-intel jodh-intel added needs-backport Changes need to be applied to an older branch / repository no-forward-port-needed Changed do not need to be applied to a newer branch / repository and removed needs-forward-port Changes need to be applied to a newer branch / repository port-to-2.0 PRs that need to be ported to kata 2.0-dev branch labels Nov 4, 2020
Labels
needs-backport Changes need to be applied to an older branch / repository no-forward-port-needed Changed do not need to be applied to a newer branch / repository
Development

Successfully merging this pull request may close these issues.

/var/lib/vc/sbs/${sid} dir residual
5 participants