[RFC] proposal of dynamic resource allocation for CoCo #117

### Problem statement
- default_vcpus and default_memory are specified statically in the Kata configuration file configuration.toml, for example [configuration-qemu-tdx.toml.in](https://github.com/confidential-containers/kata-containers-CCv0/blob/CCv0/src/runtime/config/configuration-qemu-tdx.toml.in).
- TEEs currently do not support CPU and memory hotplug, for security reasons.
- A large pod (e.g. containers with large models or containers running heavy AI workloads) may need more CPUs and memory than the default settings. It asks for them by declaring resource requests and limits in the pod YAML file, but these declarations have no effect in a TEE environment.
### Kata boot flow
![image](https://user-images.githubusercontent.com/21115922/231926547-14f70b90-09f5-4d53-8cd4-9025a3b7590f.png)

When a pod is created, the runtime first calls CreateSandbox(), which boots the VM with default_vcpus and default_memory. It then creates the containers; at this point, additional vCPUs and memory may be hot-plugged into the VM if the pod YAML declares them. This second step does not work for CoCo, because hotplug is not supported in a TEE environment.
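
The difference between the two environments can be illustrated with a deliberately simplified model. This is not Kata code: `SandboxVM`, `create_sandbox` and `add_container` are hypothetical names chosen for the sketch, and memory is assumed to be counted in MiB.

```python
from dataclasses import dataclass


@dataclass
class SandboxVM:
    """Toy stand-in for the sandbox VM; all names here are hypothetical."""
    vcpus: int
    memory_mib: int
    hotplug_supported: bool

    def add_container(self, vcpus: int, memory_mib: int) -> bool:
        """Try to grow the VM for a new container; return True if it grew."""
        if not self.hotplug_supported:
            # TEE case: the VM keeps the size it was booted with.
            return False
        self.vcpus += vcpus
        self.memory_mib += memory_mib
        return True


def create_sandbox(default_vcpus: int, default_memory_mib: int, tee: bool) -> SandboxVM:
    """Boot the VM with the static defaults; hotplug is unavailable in a TEE."""
    return SandboxVM(default_vcpus, default_memory_mib, hotplug_supported=not tee)


regular = create_sandbox(1, 2048, tee=False)
regular.add_container(4, 8192)   # VM grows to 5 vCPUs / 10240 MiB

coco = create_sandbox(1, 2048, tee=True)
coco.add_container(4, 8192)      # VM stays at 1 vCPU / 2048 MiB
```

In the TEE case the only way to give the container more resources is to size the VM correctly before it boots.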
### Proposal of dynamic resource allocation for CoCo
First of all, the requested vcpus and memory must be known before the VM is created. A mutating webhook that monitors pod creation can help. When a pod is created, the webhook reads the requested vcpus and memory from the pod spec, if they are set, and calculates the total resources the VM needs:

`total_vcpus = default_vcpus(1core) + requested_vcpus`
`total_memory = default_memory(2GB) + requested_memory`
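
A minimal sketch of this calculation, under several assumptions that are not stated in the proposal: the pod spec is available as a plain dict, each container's `limits` are used when present and its `requests` otherwise, fractional CPUs are rounded up to whole vCPUs, the 2GB default is treated as 2048 MiB, and only the common quantity suffixes are handled. All function names are hypothetical.

```python
import math

# Defaults from the formulas above: 1 vCPU and 2 GB (assumed here to mean 2048 MiB).
DEFAULT_VCPUS = 1
DEFAULT_MEMORY_MIB = 2048

# Common Kubernetes memory suffixes; binary ones are checked first so "Mi" wins over "M".
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m", "2" or "1.5" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "512Mi", "2Gi" or "1G" to bytes."""
    for table in (_BINARY, _DECIMAL):
        for suffix, factor in table.items():
            if quantity.endswith(suffix):
                return int(float(quantity[: -len(suffix)]) * factor)
    return int(float(quantity))  # plain number of bytes


def total_resources(pod_spec: dict) -> tuple:
    """Return (total_vcpus, total_memory_mib) for the whole pod."""
    cpu_cores = 0.0
    memory_bytes = 0
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        # Assumption: prefer limits, fall back to requests.
        wanted = resources.get("limits") or resources.get("requests") or {}
        cpu_cores += parse_cpu(str(wanted.get("cpu", "0")))
        memory_bytes += parse_memory_bytes(str(wanted.get("memory", "0")))
    total_vcpus = DEFAULT_VCPUS + math.ceil(cpu_cores)
    total_memory_mib = DEFAULT_MEMORY_MIB + math.ceil(memory_bytes / 2**20)
    return total_vcpus, total_memory_mib


example_spec = {
    "containers": [
        {"resources": {"limits": {"cpu": "2", "memory": "4Gi"}}},
        {"resources": {"requests": {"cpu": "500m", "memory": "512Mi"}}},
    ]
}
print(total_resources(example_spec))  # (4, 6656)
```

Whether limits or requests should drive the sizing is a design decision the proposal leaves open; the sketch only shows where that choice would be made.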

The webhook then sets these totals as pod annotations:

`io.katacontainers.config.hypervisor.default_vcpus: total_vcpus`
`io.katacontainers.config.hypervisor.default_memory: total_memory`

These annotations must also be enabled in the Kata configuration:
`enable_annotations = ["default_vcpus", "default_memory"]`
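
A webhook could check up front that the annotations it is about to set are permitted. The sketch below assumes that each `enable_annotations` entry is a regular expression matched against the annotation's base name (the part after `io.katacontainers.config.hypervisor.`) and that the match must cover the whole name; treat both as assumptions to verify against the Kata version in use. `is_annotation_enabled` is a hypothetical helper.

```python
import re

HYPERVISOR_PREFIX = "io.katacontainers.config.hypervisor."


def is_annotation_enabled(annotation: str, enable_annotations: list) -> bool:
    """Return True if the annotation's base name matches an enabled pattern."""
    if not annotation.startswith(HYPERVISOR_PREFIX):
        return False
    base_name = annotation[len(HYPERVISOR_PREFIX):]
    # Assumption: patterns must match the full base name.
    return any(re.fullmatch(pattern, base_name) for pattern in enable_annotations)


enabled = ["default_vcpus", "default_memory"]
print(is_annotation_enabled(HYPERVISOR_PREFIX + "default_vcpus", enabled))  # True
print(is_annotation_enabled(HYPERVISOR_PREFIX + "kernel_params", enabled))  # False
```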

This way, the VM is created with sufficient resources for the pod from the start, without relying on hotplug.
