[RFC] proposal of dynamic resource allocation for CoCo #117
Description
Problem statement
- default_vcpus and default_memory are specified statically in the kata configuration file configuration.toml, for example https://github.com/confidential-containers/kata-containers-CCv0/blob/CCv0/src/runtime/config/configuration-qemu-tdx.toml.in
- TEEs currently do not support CPU and memory hotplug, for security reasons.
- A large POD (e.g. containers with large models or containers running heavy AI workloads) may need more CPUs and memory than the default settings provide. It asks for them by declaring resource requests and limits in the POD yaml file, but these resource definitions do not take effect in a TEE environment.
Kata boot flow
When a POD is created, kata calls CreateSandbox() first, which allocates default_vcpus and default_memory for the VM itself. It then creates the containers; at this point, additional vcpus and memory may be hot plugged into the VM and allocated to a container if the POD yaml requests them. This second step does not work for CoCo, because hotplug is not supported in a TEE environment.
Proposal of dynamic resource allocation for CoCo
First of all, the requested vcpus and memory must be known before the VM is created. A mutating webhook that intercepts POD creation can help. When a POD is created, the webhook reads the requested vcpus and memory from the POD spec, if they are set, and calculates the total resources the VM needs:
total_vcpus = default_vcpus(1core) + requested_vcpus
total_memory = default_memory(2GB) + requested_memory
The webhook then sets these totals as POD annotations:
io.katacontainers.config.hypervisor.default_vcpus: total_vcpus
io.katacontainers.config.hypervisor.default_memory: total_memory
These annotations must also be enabled in the kata configuration:
enable_annotations = ["default_vcpus", "default_memory"]
In this way, the VM can be created with sufficient resources for the POD from the start, without relying on hotplug.
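The calculation and annotation step described above could be sketched roughly as follows. This is only an illustration under several assumptions, not the proposed implementation: the helper names (`cpu_millicores`, `memory_bytes`, `total_resources`, `annotation_patch`) are hypothetical, the quantity parsing covers only a simplified subset of Kubernetes suffixes, each container's limit is used with a fallback to its request, fractional CPUs are rounded up to whole vcpus, and memory is emitted in MiB (the unit kata uses for default_memory). The defaults of 1 vcpu and 2048 MiB mirror the figures in the formulas above. The result is a JSON Patch that a mutating webhook could return; the admission request/response plumbing is left out.

```python
import json

# Baseline VM size assumed from the formulas in this proposal (1 core, 2 GB).
DEFAULT_VCPUS = 1
DEFAULT_MEMORY_MIB = 2048
PREFIX = "io.katacontainers.config.hypervisor."
MIB = 2**20

# Simplified subset of Kubernetes memory suffixes (assumption: no E/P or "m").
_MEMORY_UNITS = [("Ki", 2**10), ("Mi", 2**20), ("Gi", 2**30), ("Ti", 2**40),
                 ("k", 10**3), ("M", 10**6), ("G", 10**9), ("T", 10**12)]


def cpu_millicores(quantity):
    """Convert a CPU quantity such as '2', '0.5' or '500m' to millicores."""
    q = str(quantity)
    if q.endswith("m"):
        return int(q[:-1])
    return round(float(q) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as '4Gi' or '512Mi' to bytes."""
    q = str(quantity)
    for suffix, factor in _MEMORY_UNITS:
        if q.endswith(suffix):
            return int(float(q[: -len(suffix)]) * factor)
    return int(q)  # plain number of bytes


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def total_resources(pod):
    """Return (total_vcpus, total_memory_mib) for the VM backing this POD."""
    millis, mem = 0, 0
    for container in pod.get("spec", {}).get("containers", []):
        res = container.get("resources", {})
        limits, requests = res.get("limits", {}), res.get("requests", {})
        # Assumption: prefer the limit, fall back to the request.
        cpu = limits.get("cpu", requests.get("cpu"))
        memory = limits.get("memory", requests.get("memory"))
        if cpu is not None:
            millis += cpu_millicores(cpu)
        if memory is not None:
            mem += memory_bytes(memory)
    return (DEFAULT_VCPUS + ceil_div(millis, 1000),
            DEFAULT_MEMORY_MIB + ceil_div(mem, MIB))


def annotation_patch(pod):
    """Build a JSON Patch that sets the kata sizing annotations on the POD."""
    vcpus, memory_mib = total_resources(pod)
    values = {PREFIX + "default_vcpus": str(vcpus),
              PREFIX + "default_memory": str(memory_mib)}
    if pod.get("metadata", {}).get("annotations") is None:
        # No annotations map yet: add the whole map in one operation.
        return [{"op": "add", "path": "/metadata/annotations", "value": values}]
    return [{"op": "add", "path": "/metadata/annotations/" + key, "value": value}
            for key, value in values.items()]


if __name__ == "__main__":
    example_pod = {
        "metadata": {"name": "large-model"},
        "spec": {"containers": [
            {"name": "a", "resources": {"limits": {"cpu": "2", "memory": "4Gi"}}},
            {"name": "b", "resources": {"requests": {"cpu": "500m", "memory": "512Mi"}}},
        ]},
    }
    # 1 + ceil(2.5) = 4 vcpus, 2048 + 4096 + 512 = 6656 MiB
    print(json.dumps(annotation_patch(example_pod), indent=2))
```

Rounding fractional CPU requests up means the VM is never sized below what the containers asked for, at the cost of occasionally allocating one vcpu more than strictly needed.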
