kata containers memory footprint #295
Thanks for opening - this is something I'm actively looking at, as is @devimc. @mcastelino - FYI. A couple of things came to mind when I started looking:
Can you check the footprint of kata-agent inside the guest? @mcastelino FYI.
And does this mean the kata kernel consumes more memory?
I think we need to (continue to) do measurements of each piece. And this highlights the need for the density CI to be merged. I'm working on making sure this happens ASAP.
There is a relation between the maximum number of vCPUs and the memory footprint: if the QEMU `maxcpus` option and the kernel `nr_cpus` cmdline argument are big, then the memory footprint is big. This only occurs when CPU hotplug support is enabled in the kernel, probably because the kernel has to allocate resources to watch every socket waiting for a CPU to be connected (ACPI event). For example:

```
+---------------+-------------------------+
|               |  Memory Footprint (KB)  |
+---------------+-------------------------+
| NR_CPUS=240   |  186501                 |
+---------------+-------------------------+
| NR_CPUS=8     |  110684                 |
+---------------+-------------------------+
```

In order not to affect CPU hotplug and to still allow users to have containers with as many vCPUs as there are physical CPUs, this patch mitigates the big memory footprint by using the actual number of physical CPUs as the maximum number of vCPUs for each container.

Before this patch, a container with 256MB of RAM:

```
        total   used   free   shared   buff/cache   available
Mem:     195M    40M   113M      26M          41M        112M
Swap:      0B     0B     0B
```

With this patch:

```
        total   used   free   shared   buff/cache   available
Mem:     236M    11M   188M      26M          36M        186M
Swap:      0B     0B     0B
```

fixes kata-containers#295

Signed-off-by: Julio Montes <julio.montes@intel.com>
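For anyone who wants to reproduce this locally, a rough way to check which maxcpus value a sandbox actually gets and how much memory the guest has left is sketched below; these are generic commands, not a Kata-specific tool, and the last two need to be run inside the container (i.e. inside the guest):

```
# On the host: which maxcpus value was QEMU started with?
ps -ef | grep -o 'maxcpus=[0-9]*' | sort -u

# Inside the container/guest: confirm the nr_cpus kernel parameter
# and see how much memory is left after boot.
cat /proc/cmdline
free -m
```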
@gnawux @egernst, to be clear, I was using the same kernel and the same qemu binary to run the comparison. The only differences are:
One consequence of the large memory footprint is that we cannot boot a 128MB guest with kata (OOM killed), whereas runv can launch a 64MB guest.
@bergwolf what was the base image used before in the initrd? From what I saw, the footprint of the agent wasn't that significant, but I'd be curious to see the data from your side too. I understand the consequence of this -- KSM doesn't help us with OOM inside the guest! I saw the same result with 128MB falling over.
FYI: the CI has some density measurements per PR now: see http://kata-jenkins-ci.westus2.cloudapp.azure.com/job/kata-containers-runtime-density-PR/ The first test was just the standard baseline. You can see @devimc's patch as job #2, which happens to have a 100MB smaller footprint on the host. Of course, YMMV depending on the number of cores you have on the system...
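For a quick local approximation of what that CI job measures, the host-side footprint of a sandbox is dominated by the resident set of its QEMU process, so something like the following gives a ballpark figure (the exact QEMU process name depends on the build your install uses):

```
# RSS (in KB) of every QEMU process currently running on the host.
ps -eo pid,rss,comm | grep -i qemu
```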
@egernst I was using alpine for the initrd image. And with #296, this is what I'm seeing on a 128MB guest:
w/ kata + rootfs image
w/ runv
Still slightly larger but much better now. The large difference (~15MB) may largely be a result of the hyperstart and kata-agent binary sizes, since the initramfs is uncompressed entirely into guest RAM.
After stripping the kata-agent binary, this is what I'm getting with kata + initrd:
So the difference could be 5-8MB depending on whether an initrd or a rootfs image is used.
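Worth noting for others doing the same comparison: because the initramfs is unpacked straight into guest RAM, whatever is shaved off the agent binary comes back as guest memory. A minimal sketch of the before/after check, assuming a local build of the agent (the flags and file names are examples, not the official build recipe):

```
# Keep a copy of the unstripped agent for comparison.
cp kata-agent kata-agent.unstripped

# Either rebuild without the Go symbol table and DWARF data...
go build -ldflags "-s -w" -o kata-agent .
# ...or strip the already-built binary.
strip kata-agent

# The size difference is roughly what the guest gets back once the
# binary is packed into the initrd.
ls -lh kata-agent kata-agent.unstripped
```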
Much better. We're still investigating to see if there's more "low-hanging fruit" we can address.
There is a relation between the maximum number of vCPUs and the memory footprint: if the QEMU `maxcpus` option and the kernel `nr_cpus` cmdline argument are big, then the memory footprint is big. This only occurs when CPU hotplug support is enabled in the kernel, probably because the kernel has to allocate resources to watch every socket waiting for a CPU to be connected (ACPI event). For example:

```
+---------------+-------------------------+
|               |  Memory Footprint (KB)  |
+---------------+-------------------------+
| NR_CPUS=240   |  186501                 |
+---------------+-------------------------+
| NR_CPUS=8     |  110684                 |
+---------------+-------------------------+
```

In order not to affect CPU hotplug and to still allow users to have containers with as many vCPUs as there are physical CPUs, this patch mitigates the big memory footprint by using the actual number of physical CPUs as the maximum number of vCPUs for each container when `default_maxvcpus` is <= 0 in the runtime configuration file; otherwise `default_maxvcpus` is used as the maximum number of vCPUs.

Before this patch, a container with 256MB of RAM:

```
        total   used   free   shared   buff/cache   available
Mem:     195M    40M   113M      26M          41M        112M
Swap:      0B     0B     0B
```

With this patch:

```
        total   used   free   shared   buff/cache   available
Mem:     236M    11M   188M      26M          36M        186M
Swap:      0B     0B     0B
```

fixes kata-containers#295

Signed-off-by: Julio Montes <julio.montes@intel.com>
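With this version of the patch the limit becomes tunable through `default_maxvcpus` in the runtime configuration. A sketch of how to inspect and override it, assuming the usual default and override config locations (adjust the paths for your install):

```
# 0 (or a negative value) means "use the number of host CPUs";
# any positive value is used directly as QEMU's maxcpus.
grep default_maxvcpus /usr/share/defaults/kata-containers/configuration.toml

# To override, copy the default config to the user location and edit it.
sudo mkdir -p /etc/kata-containers
sudo cp /usr/share/defaults/kata-containers/configuration.toml /etc/kata-containers/
sudo sed -i 's/^default_maxvcpus *=.*/default_maxvcpus = 4/' \
    /etc/kata-containers/configuration.toml
```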
agent: disable yamux keep alive
The memory footprint of kata containers is pretty high even inside the guest. Compared to what runv+hyperstart shows for a 256MB busybox container:
a kata container leaves less than 120MB of free memory for the container app:
There are two places we need to look into:
system-memory="200572 kB"
@sboeuf @WeiZhang555 @amshinde @devimc @egernst @laijs any thoughts?
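For reference, the in-guest comparison above can be reproduced with something like the command below; `kata-runtime` as the Docker runtime name and the 256MB limit are just the values used in this issue, adjust them to your setup:

```
# Run a 256MB busybox container under the Kata runtime and check how
# much memory the guest leaves for the workload.
docker run --rm -m 256m --runtime kata-runtime busybox free -m
```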