This repository has been archived by the owner on May 12, 2021. It is now read-only.

k8s: fix wrong number cpus after killing a container #2251

Merged: 2 commits merged into kata-containers:master from topic/k8s/fixWrongNumberCPUs on Nov 26, 2019

Conversation

@devimc commented Nov 22, 2019

Don't hot add the resources of non-running containers again, to avoid
having extra, useless resources

fixes #2186

NOTE: this PR won't break backward compatibility
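
To make the problem concrete, here is a minimal, self-contained Go sketch of the idea (simplified stand-in types and functions, not the actual kata-runtime code): a killed-but-not-deleted container stays in the sandbox configuration, so a naive resource sum counts it alongside its replacement unless non-running containers are skipped.

```go
// Simplified, hypothetical illustration (not the kata-runtime code): a
// stopped-but-not-deleted container still sits in the sandbox config, so a
// naive resource sum counts it alongside its replacement.
package main

import "fmt"

type containerState string

const (
	stateRunning containerState = "running"
	stateStopped containerState = "stopped"
)

type container struct {
	id    string
	vCPUs uint32
	state containerState
}

// sandboxCPUs sums the vCPUs of the sandbox's containers. With skipStopped
// set, stopped containers are ignored, mirroring the guard added by this PR.
func sandboxCPUs(containers []container, skipStopped bool) uint32 {
	var total uint32
	for _, c := range containers {
		if skipStopped && c.state == stateStopped {
			continue // do not hot add again non-running containers' resources
		}
		total += c.vCPUs
	}
	return total
}

func main() {
	// A killed container stays in the sandbox until delete is called, while
	// k8s creates a replacement container with the same CPU request.
	containers := []container{
		{id: "killed", vCPUs: 2, state: stateStopped},
		{id: "replacement", vCPUs: 2, state: stateRunning},
	}
	fmt.Println("without the guard:", sandboxCPUs(containers, false)) // 4 vCPUs (wrong)
	fmt.Println("with the guard:   ", sandboxCPUs(containers, true))  // 2 vCPUs (expected)
}
```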

@devimc (Author) commented Nov 22, 2019

/test

codecov bot commented Nov 22, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@dd15db3).
The diff coverage is 50%.

@@            Coverage Diff            @@
##             master    #2251   +/-   ##
=========================================
  Coverage          ?   48.85%           
=========================================
  Files             ?      111           
  Lines             ?    16135           
  Branches          ?        0           
=========================================
  Hits              ?     7882           
  Misses            ?     7274           
  Partials          ?      979

devimc pushed a commit to devimc/kata-tests that referenced this pull request Nov 22, 2019
Run the test that checks the number of cpus after killing a
container

Depends-on: github.com/kata-containers/runtime#2251

fixes kata-containers#2116

Signed-off-by: Julio Montes <julio.montes@intel.com>

@jodh-intel (Contributor) left a comment


Thanks @devimc - is there any way you can add a unit test for this?

@@ -1951,6 +1955,12 @@ func (s *Sandbox) updateResources() error {
func (s *Sandbox) calculateSandboxMemory() int64 {
	memorySandbox := int64(0)
	for _, c := range s.config.Containers {
		// Do not hot add again non-running containers resources
		if cont, ok := s.containers[c.ID]; ok && cont.state.State == types.StateStopped {

A Member left a comment:

Then you need to recalculate resources in s.StartContainer(), right?

@devimc (Author) replied:

Good point! Let me do it and run the tests.
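
A minimal sketch of that follow-up, in the same spirit as the simplified model above (again, not the actual kata-runtime code; the types and method names are stand-ins): starting a previously stopped container triggers a recalculation so its resources are counted again.

```go
// Self-contained sketch (simplified model, not kata-runtime code) of the point
// raised above: starting a stopped container must trigger a resource
// recalculation, otherwise its vCPUs stay excluded.
package main

import "fmt"

type container struct {
	vCPUs   uint32
	stopped bool
}

type sandbox struct {
	containers map[string]*container
	vCPUs      uint32 // vCPUs currently hot plugged into the VM
}

// updateResources recomputes the sandbox vCPU count, skipping stopped
// containers (the guard added by this PR).
func (s *sandbox) updateResources() {
	var total uint32
	for _, c := range s.containers {
		if c.stopped {
			continue
		}
		total += c.vCPUs
	}
	s.vCPUs = total
}

// startContainer marks a container running again and recalculates resources,
// which is the counterpart of the guard suggested in this review comment.
func (s *sandbox) startContainer(id string) {
	if c, ok := s.containers[id]; ok {
		c.stopped = false
		s.updateResources()
	}
}

func main() {
	s := &sandbox{containers: map[string]*container{
		"killed":      {vCPUs: 2, stopped: true},
		"replacement": {vCPUs: 2},
	}}
	s.updateResources()
	fmt.Println("before restart:", s.vCPUs, "vCPUs") // 2
	s.startContainer("killed")
	fmt.Println("after restart: ", s.vCPUs, "vCPUs") // 4
}
```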

@lifupan (Member) commented Nov 25, 2019

Hi @devimc, I have a question about the issue this PR fixes: once the container process is killed, containerd/cri gets a task exit event and tries to delete the container from https://github.com/containerd/cri/blob/master/pkg/server/events.go#L320, which triggers kata shimv2 to delete the container from the sandbox. But from your fix, it seems the killed container still exists in the sandbox. I'm puzzled about why the stopped container wasn't deleted from the sandbox, or am I missing something?

@devimc (Author) commented Nov 25, 2019

@jodh-intel there is already an integration test - https://github.com/kata-containers/tests/blob/master/integration/kubernetes/k8s-number-cpus.bats

@lifupan yes, you are right: the container status is stopped but delete is never called; instead k8s creates a new container. Something weird: if this new container is killed, its status changes to stopped and delete is called. I guess because k8s doesn't want to delete the original containers? 🤔

@devimc added the do-not-merge (PR has problems or depends on another) label Nov 25, 2019
Julio Montes added 2 commits November 25, 2019 18:42
The status of the container should be known prior to calculating the
number of CPUs and the amount of memory

Signed-off-by: Julio Montes <julio.montes@intel.com>
Don't hot add the resources of non-running containers again, to avoid
having extra, useless resources

fixes kata-containers#2186

Signed-off-by: Julio Montes <julio.montes@intel.com>
@devimc force-pushed the topic/k8s/fixWrongNumberCPUs branch from dbfae48 to b7731e9 on November 25, 2019 18:42
@devimc (Author) commented Nov 25, 2019

@bergwolf change applied, thanks

/test

@devimc removed the do-not-merge (PR has problems or depends on another) label Nov 25, 2019
@lifupan (Member) commented Nov 26, 2019

> @lifupan yes, you are right: the container status is stopped but delete is never called; instead k8s creates a new container. Something weird: if this new container is killed, its status changes to stopped and delete is called. I guess because k8s doesn't want to delete the original containers? 🤔

Actually, k8s isn't involved in keeping or deleting the stopped container in the runtime; that's containerd/cri's responsibility. As you said, it's weird; maybe there is another bug embedded somewhere.

@devimc (Author) commented Nov 26, 2019

can we merge this?

@amshinde (Member) commented:

Did you test this with crio or containerd? Do you see the same behaviour with both? It would be good to track down any cri error.

@amshinde merged commit d054556 into kata-containers:master Nov 26, 2019
@devimc (Author) commented Nov 26, 2019

@amshinde no, I just tested with crio

@amshinde (Member) commented Dec 2, 2019

@devimc Looks like a candidate for backporting. Can you open a PR to do so?

@devimc (Author) commented Dec 3, 2019

@amshinde done #2313
