k8s: fix wrong number cpus after killing a container #2251
Conversation
/test
Codecov Report
@@ Coverage Diff @@
## master #2251 +/- ##
=========================================
Coverage ? 48.85%
=========================================
Files ? 111
Lines ? 16135
Branches ? 0
=========================================
Hits ? 7882
Misses ? 7274
Partials ? 979
Run the test that checks the number of cpus after killing a container

Depends-on: github.com/kata-containers/runtime#2251

fixes kata-containers#2116

Signed-off-by: Julio Montes <julio.montes@intel.com>
Thanks @devimc - is there any way you can add a unit test for this?
@@ -1951,6 +1955,12 @@ func (s *Sandbox) updateResources() error {
func (s *Sandbox) calculateSandboxMemory() int64 {
	memorySandbox := int64(0)
	for _, c := range s.config.Containers {
		// Do not hot add again non-running containers resources
		if cont, ok := s.containers[c.ID]; ok && cont.state.State == types.StateStopped {
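For reference, here is a minimal, self-contained Go sketch of the pattern the diff introduces: when per-container resources are summed for the sandbox, containers that are already stopped are skipped so their CPU and memory are not hot-added again. The types below (containerConfig, container, sandbox) are simplified stand-ins for illustration, not the real virtcontainers structures.

package main

import "fmt"

type containerState string

const (
	stateRunning containerState = "running"
	stateStopped containerState = "stopped"
)

type containerConfig struct {
	ID       string
	MemoryMB int64
	VCPUs    uint32
}

type container struct {
	state containerState
}

type sandbox struct {
	configs    []containerConfig
	containers map[string]*container
}

// calculateSandboxMemory mirrors the fixed logic: containers that are already
// stopped do not contribute to the memory that gets hot-added to the VM.
func (s *sandbox) calculateSandboxMemory() int64 {
	memorySandbox := int64(0)
	for _, c := range s.configs {
		// Do not hot add again non-running containers' resources.
		if cont, ok := s.containers[c.ID]; ok && cont.state == stateStopped {
			continue
		}
		memorySandbox += c.MemoryMB
	}
	return memorySandbox
}

func main() {
	s := &sandbox{
		configs: []containerConfig{
			{ID: "c1", MemoryMB: 512, VCPUs: 1},
			{ID: "c2", MemoryMB: 1024, VCPUs: 2},
		},
		containers: map[string]*container{
			"c1": {state: stateRunning},
			"c2": {state: stateStopped}, // killed container: must not be counted
		},
	}
	fmt.Println(s.calculateSandboxMemory()) // prints 512, not 1536
}

Running the sketch prints 512 rather than 1536, because the stopped container's memory is no longer counted.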
Then you need to recalculate resources in s.StartContainer(), right?
Good point! Let me do it and run the tests.
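As a rough illustration of that follow-up (an assumption about the eventual change, not the actual kata-containers code): if stopped containers are excluded from the calculation, then starting a container again has to trigger a recalculation so its CPU and memory are hot-added back. Again, the types and function names below are simplified stand-ins.

package main

import "fmt"

type container struct {
	vcpus    uint32
	memoryMB int64
	stopped  bool
}

type sandbox struct {
	containers []*container
	// resources currently hot-added to the VM
	vcpus    uint32
	memoryMB int64
}

// updateResources stands in for Sandbox.updateResources(): it recomputes the
// totals from the non-stopped containers and applies them to the sandbox.
func (s *sandbox) updateResources() {
	var vcpus uint32
	var memory int64
	for _, c := range s.containers {
		if c.stopped {
			continue // stopped containers are excluded, as in the fix
		}
		vcpus += c.vcpus
		memory += c.memoryMB
	}
	s.vcpus, s.memoryMB = vcpus, memory
}

// startContainer marks a container as running again and then recalculates the
// sandbox resources, which is the follow-up suggested in the review.
func (s *sandbox) startContainer(c *container) {
	c.stopped = false
	s.updateResources()
}

func main() {
	c1 := &container{vcpus: 1, memoryMB: 512}
	c2 := &container{vcpus: 2, memoryMB: 1024, stopped: true}
	s := &sandbox{containers: []*container{c1, c2}}

	s.updateResources()
	fmt.Println(s.vcpus, s.memoryMB) // 1 512: the stopped container is not counted

	s.startContainer(c2)
	fmt.Println(s.vcpus, s.memoryMB) // 3 1536: its resources are added back on start
}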
Hi @devimc, I have a question about the issue this PR fixes: once the container process was killed, containerd/cri would get a task exit event and would try to delete the container from https://github.com/containerd/cri/blob/master/pkg/server/events.go#L320, which would trigger kata shimv2 to delete the container from the sandbox. But from your fix, it seems the killed container still exists in the sandbox. I'm puzzled why the stopped container wasn't deleted from the sandbox, or am I missing something?
@jodh-intel there is already an integration test - https://github.com/kata-containers/tests/blob/master/integration/kubernetes/k8s-number-cpus.bats

@lifupan yes, you are right, the container status is stopped
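On the unit-test question, a hypothetical test shape could look like the one below; the helper and types are simplified stand-ins rather than the real virtcontainers API, and only check that a stopped container no longer contributes to the calculated sandbox memory.

package sandbox

import "testing"

type testContainer struct {
	id       string
	memoryMB int64
	stopped  bool
}

// sandboxMemory applies the same rule as the fix: stopped containers are not
// counted when calculating the memory to hot-add.
func sandboxMemory(containers []testContainer) int64 {
	var total int64
	for _, c := range containers {
		if c.stopped {
			continue
		}
		total += c.memoryMB
	}
	return total
}

func TestStoppedContainerNotCounted(t *testing.T) {
	containers := []testContainer{
		{id: "c1", memoryMB: 512},
		{id: "c2", memoryMB: 1024},
	}
	if got := sandboxMemory(containers); got != 1536 {
		t.Fatalf("expected 1536, got %d", got)
	}

	// Stop (kill) c2: its memory must no longer be hot-added.
	containers[1].stopped = true
	if got := sandboxMemory(containers); got != 512 {
		t.Fatalf("expected 512, got %d", got)
	}
}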
The status of the container should be known prior to calculating the number of CPUs and the amount of memory.

Signed-off-by: Julio Montes <julio.montes@intel.com>
Don't hot add non-running container resources again, to avoid having extra and useless resources.

fixes kata-containers#2186

Signed-off-by: Julio Montes <julio.montes@intel.com>
Force-pushed from dbfae48 to b7731e9
@bergwolf change applied, thanks.

/test
Actually, k8s isn't involved in keeping or deleting the stopped container in the runtime; that's containerd/cri's responsibility. As you said, it's weird; maybe there's another bug hiding somewhere.
Can we merge this?
Did you test this with crio or containerd? Do you see the same behaviour with both? It would be good to track down any CRI error.
@amshinde no, I just tested with crio.
@devimc This looks like a candidate for backporting. Can you open a PR to do so?
Don't hot add non-running container resources again, to avoid having extra and useless resources.

fixes #2186

NOTE: this PR won't break backward compatibility