Conversation
Nice catch!
Force-pushed from 0edb191 to 63ec9f7.
virtcontainers/sandbox.go (outdated):

@@ -69,6 +69,9 @@ type State struct {
    // Pid is the process id of the sandbox container which is the first
    // container to be started.
    Pid int `json:"pid"`

    // FreeCpu
Could you expand this comment a little please?
virtcontainers/sandbox.go (outdated):

@@ -1201,6 +1207,42 @@ func (s *Sandbox) CreateContainer(contConfig ContainerConfig) (VCContainer, erro
    return c, nil
}

func (s *Sandbox) creditContainerVCPU(num uint32, credit bool) (uint32, error) {
I think this function could do with an explanatory comment (which should explain what credit is and what the returned int value represents).

I also find the name a little confusing since:
- it's not working on a particular container - it's working at the pod level.
- the name doesn't really fully explain what it's doing. Maybe something like adjustVCPUCount()?

Lastly, please can you add a unit-test for the calculation code (maybe by putting that into a new function to make it easier to test).
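For illustration, something along these lines might work; this is a sketch only, assuming the FreeStaticCPU bookkeeping visible in the diff above, and the non-credit branch and exact semantics are assumptions rather than the PR's actual code:

// adjustVCPUCount reconciles a requested vCPU change against the pool of
// initially allocated but still unused vCPUs (s.state.FreeStaticCPU).
//
// credit == true means the container is being granted num vCPUs: the pool is
// consumed first and the returned value is how many vCPUs still need to be
// hotplugged. credit == false means the container is giving num vCPUs back:
// in this sketch they are simply returned to the pool.
func (s *Sandbox) adjustVCPUCount(num uint32, credit bool) (uint32, error) {
    if credit {
        if s.state.FreeStaticCPU == 0 {
            // No free initial vCPUs: the whole request must be hotplugged.
            return num, nil
        }

        if num <= s.state.FreeStaticCPU {
            // Fully covered by the initial vCPUs, nothing to hotplug.
            s.state.FreeStaticCPU -= num
            return 0, nil
        }

        // Partially covered: drain the pool and hotplug the remainder.
        num -= s.state.FreeStaticCPU
        s.state.FreeStaticCPU = 0
        return num, nil
    }

    // Giving vCPUs back: return them to the static pool rather than
    // hot-unplugging them (an assumption made for this sketch).
    s.state.FreeStaticCPU += num
    return 0, nil
}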
virtcontainers/sandbox.go (outdated):

@@ -54,21 +54,24 @@ const (
type State struct {
    State stateString `json:"state"`

    // FreeCpu
Could you expand this comment a little please?
virtcontainers/sandbox.go (outdated):

@@ -1201,6 +1207,42 @@ func (s *Sandbox) CreateContainer(contConfig ContainerConfig) (VCContainer, erro
    return c, nil
}

func (s *Sandbox) creditContainerVCPU(num uint32, credit bool) (uint32, error) {
    if credit {
        if s.state.FreeStaticCPU == 0 {
// No initial vCPUs so no adjustment necessary.
        }

        if num <= s.state.FreeStaticCPU {
            s.state.FreeStaticCPU -= num
This function is changing the state, but if the subsequent call to c.sandbox.hypervisor.hotplug[Add|Remove]Device() fails, won't the state then be incorrect?
@bergwolf Same question. I think you would need to call this function again with the reverse credit flag?
The state is only saved on disk when hotplug succeeds.
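In other words, the caller pattern looks roughly like the sketch below; hotplugVCPUs and storeSandbox, as well as the hypervisor call, are illustrative assumptions rather than the PR's actual code:

// Sketch only: adjust the in-memory count, attempt the hotplug, and persist
// the sandbox state to disk only once the hotplug has succeeded.
func (s *Sandbox) hotplugVCPUs(requested uint32) error {
    toPlug, err := s.adjustVCPUCount(requested, true)
    if err != nil {
        return err
    }

    if toPlug > 0 {
        // Hypothetical hypervisor call standing in for hotplugAddDevice().
        if _, err := s.hypervisor.hotplugAddDevice(toPlug, cpuDev); err != nil {
            // The in-memory state is now stale, but it was never written to
            // disk, so reloading the sandbox still sees the previous value.
            return err
        }
    }

    // Persist only after the hotplug succeeded.
    return s.storeSandbox()
}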
    FreeStaticCPU uint32 `json:"freeStaticCpu,omitempty"`

    // Bool to indicate if the drive for a container was hotplugged.
    // This is moved to bottom of the struct to pass maligned check.
I don't think we need to document this in the code. Maybe just add it to the commit message.
I was just moving the lines to pass the CI memory alignment check.
I added that comment so that if someone wants to modify the struct in the future, they know the bool field is better left at the bottom, otherwise the maligned check might fail.
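For context, a small standalone example (not the actual State struct) of why maligned prefers the bool at the bottom:

package main

import (
    "fmt"
    "unsafe"
)

// A bool in the middle forces padding before the larger fields that follow.
type padded struct {
    A uint64
    B bool
    C uint64
    D uint32
}

// Same fields with the bool moved to the bottom: larger fields first,
// smaller ones last, so less padding is needed.
type packed struct {
    A uint64
    C uint64
    D uint32
    B bool
}

func main() {
    fmt.Println(unsafe.Sizeof(padded{})) // 32 bytes on 64-bit platforms
    fmt.Println(unsafe.Sizeof(packed{})) // 24 bytes on 64-bit platforms
}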
We should take the initial CPU count into account when setting CPU constraints.
IOW, if there is already enough CPU in the VM for a container, we should not
hotplug more CPUs.

Depends-on: kata-containers/runtime#696

Fixes: kata-containers#705

Signed-off-by: Peng Tao <bergwolf@gmail.com>
@@ -1722,3 +1722,29 @@ func TestGetNetNs(t *testing.T) {
    netNs = s.GetNetNs()
    assert.Equal(t, netNs, expected)
}

func TestAdjustVCPUCount(t *testing.T) {
Thanks for adding this. However, there are quite a few missing scenarios for this test (to handle all the if tests in adjustVCPUCount()).
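For example, a table-driven test could cover each branch; a rough sketch, where the expected values assume the adjustVCPUCount behaviour sketched earlier in this conversation rather than the PR's final code:

func TestAdjustVCPUCount(t *testing.T) {
    tests := []struct {
        name     string
        free     uint32 // initial s.state.FreeStaticCPU
        num      uint32
        credit   bool
        expected uint32 // vCPUs left to hot(un)plug
        freeLeft uint32 // FreeStaticCPU afterwards
    }{
        {"credit with empty pool", 0, 2, true, 2, 0},
        {"credit fully covered", 4, 2, true, 0, 2},
        {"credit exactly covered", 2, 2, true, 0, 0},
        {"credit partially covered", 1, 3, true, 2, 0},
        {"debit returns vCPUs to pool", 1, 2, false, 0, 3},
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            s := &Sandbox{}
            s.state.FreeStaticCPU = tt.free

            n, err := s.adjustVCPUCount(tt.num, tt.credit)
            assert.NoError(t, err)
            assert.Equal(t, tt.expected, n)
            assert.Equal(t, tt.freeLeft, s.state.FreeStaticCPU)
        })
    }
}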
Yes, you see two CPUs but the container can use only 1 CPU since it's placed inside a CPU cgroup.

Have you considered the k8s case where one container has a CPU constraint and the other doesn't? Currently we are leaving 1 vCPU for containers without CPU constraints and non-container processes, like systemd and kata-agent.

I'd like to see how much this change will impact HPC containers.
The cpu hotplug calculation should take into account the initial vCPU count,
otherwise we end up hotplugging more vCPUs than actually needed.

Fixes: kata-containers#695

Signed-off-by: Peng Tao <bergwolf@gmail.com>
No, if container B doesn't have CPU constraints, it consumes all the vCPUs of a guest. So in your example, assuming

No, we are not leaving just 1 vcpu. We are leaving
Actually, that's not true. I ran the following test:

# ~/cpu-test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
spec:
  containers:
  - name: cpu0
    image: vish/stress
    resources:
      limits:
        cpu: "3"
    args:
    - -cpus
    - "3"
  - name: cpu1
    image: vish/stress
    args:
    - -cpus
    - "3"

Container
sudo -E kubectl create -f ~/cpu-test.yaml
using socat and a debuggable image with
Container
@devimc Could you check cpuset for the two container processes, as well as kata-agent? I'm traveling and do not have access to a k8s cluster right now.

But AFAICT, for a container created with:

Both

OTOH, CPU cgroup quota does not guarantee the CPU share of the quota. It simply limits the ceiling CPU time you can use.

So in your example, container

I'm not sure what kind of semantics we want to give to users. Maybe we should add an option to control if the initial CPU/memory should be assigned to containers with constraints.
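To illustrate the ceiling-not-a-share point, a self-contained sketch; constraintsToVCPUs is an illustrative name and rounding choice, not necessarily what the runtime does:

package main

import (
    "fmt"
    "math"
)

// constraintsToVCPUs converts a CFS quota/period pair into a vCPU count.
// The quota is only an upper bound on CPU time, not a guaranteed share.
func constraintsToVCPUs(quota int64, period uint64) uint32 {
    if quota <= 0 || period == 0 {
        // No constraint: the container may use all guest vCPUs.
        return 0
    }
    return uint32(math.Ceil(float64(quota) / float64(period)))
}

func main() {
    fmt.Println(constraintsToVCPUs(300000, 100000)) // 3 vCPUs (cpu: "3")
    fmt.Println(constraintsToVCPUs(150000, 100000)) // 2 vCPUs, rounded up from 1.5
    fmt.Println(constraintsToVCPUs(-1, 100000))     // 0: unconstrained
}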
@devimc What do you think of the idea of using a config option to specify whether the initial vCPU and memory settings in the config are shared with containers that have cpu/memory constraints?
@bergwolf Cold-plugged resources (mem and cpu) can be used or taken into account by containers with or without constraints. Sounds good, but we should document the possible impacts.
@sboeuf I'll add a new cli config option to control the behavior.
@bergwolf sounds good, thanks for the feedback.
The cpu hotplug calculation should take into account the initial vCPU count,
otherwise we end up hotplugging more vCPUs than actually needed.

Before fix:

After fix: