Replies: 3 comments 9 replies
-
If stopping the k3s service doesn't free the memory, then whatever's going on is not a K3s issue. etcd is run in the main k3s process; if the memory isn't freed when that process is terminated then something is being leaked (or at least consumed and not freed) within the kernel itself. Have you looked at overlayfs or other things like that as the cause of the growing memory utilization? If you run k3s-killall.sh to terminate all the pods and clean up the iptables rules, is the memory freed?
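A minimal sketch of that check, assuming a systemd-managed k3s server and the stock `k3s-killall.sh` location (`/usr/local/bin`); adjust paths to your install:

```bash
#!/usr/bin/env bash
# Sketch: compare kernel/slab memory before and after stopping K3s.
# Assumes a systemd-managed k3s server and k3s-killall.sh in /usr/local/bin (typical, but not guaranteed).
set -euo pipefail

snapshot() {
  # Print MemAvailable and SUnreclaim (unreclaimable slab) in kB, with a label.
  awk -v tag="$1" '/^(MemAvailable|SUnreclaim):/ {printf "%s %s %s kB\n", tag, $1, $2}' /proc/meminfo
}

snapshot "before-stop"
sudo systemctl stop k3s
sleep 30                               # give the kernel a moment to reclaim
snapshot "after-stop"

sudo /usr/local/bin/k3s-killall.sh     # terminates pods, flushes iptables rules
sleep 30
snapshot "after-killall"
```

If `SUnreclaim` stays high after both steps, the leaked memory is being held by the kernel rather than by any k3s-managed process.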
-
After a little more troubleshooting and re-testing, I'd like to call out a change in my original post to clarify that stopping […]
-
Update on this issue: it's completely unrelated to K3s. It was very hard to track down, but it's related to using ESXi to pass through an iGPU on an Intel NUC. Disabling the integrated graphics card with the flag mentioned in the post above stops the mysterious memory leak.
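As a rough sanity check (not from the original thread, just a generic sketch), one way to confirm whether the passed-through Intel iGPU is still visible inside the guest is to list the PCI display devices:

```bash
# Sketch: check whether an Intel integrated GPU is exposed to this VM.
# If passthrough is disabled in ESXi, no Intel display controller should appear here.
lspci -nn | grep -iE 'vga|display|graphics'
```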
-
I've been battling a memory leak for some time now when using K3s in a multi-master HA setup. I'm running 3x nodes:

- k3s-master-2 - Ubuntu 20.04 (4vCPU, 8GB memory) - 5.4.0-162-generic - no memory leak observed
- k3s-master-5 - Ubuntu 22.04 (8vCPU, 32GB memory) - 5.15.0-75-generic - leaks memory quickly (~100MB per hour)
- k3s-master-6 - Ubuntu 22.04 (10vCPU, 16GB memory) - 5.15.0-83-generic - leaks memory more slowly (20-50MB per hour)

`k3s` systemd unit settings:

All are running in ESXi on NVMe with 10Gb networking. All machines have minimal workloads, but k3s-master-5 has the most at the moment. I'm running Longhorn 1.4.5 for storage.

For the past few weeks, I've been noticing a memory leak on the Ubuntu 22.04 nodes. It's particularly pronounced on one of them, and I've ruled out any running containers as being the cause.
I see "memory used" (`node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes - node_memory_Buffers_bytes`) rising, and "memory available" (`node_memory_MemAvailable_bytes`) falling.
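For reference, the equivalent ad-hoc queries against the Prometheus HTTP API look roughly like this (a sketch; it assumes Prometheus is reachable at localhost:9090, adjust to your setup):

```bash
# Sketch: run the same expressions ad hoc against the Prometheus query API.
PROM=http://localhost:9090

# "memory used" = total - free - cached - buffers
curl -sG "$PROM/api/v1/query" --data-urlencode \
  'query=node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes - node_memory_Buffers_bytes'

# "memory available" as reported by the kernel
curl -sG "$PROM/api/v1/query" --data-urlencode 'query=node_memory_MemAvailable_bytes'
```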
On the node in question, I see high slab memory usage (`SUnreclaim`) rising over time. Using `smem`, I can see that the memory usage is coming from "Kernel Dynamic Memory" (specifically non-cache), i.e. not a userland process.
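Roughly what that check looks like (a sketch; `smem` needs to be installed, and the `/proc/meminfo` field names are standard):

```bash
# Sketch: confirm the growth is kernel-side rather than a userland process.
# Whole-system view: userspace vs. kernel dynamic memory (smem must be installed).
smem -t -w -k

# Slab totals, including the unreclaimable portion, as reported by the kernel.
grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo
```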
`slabtop` shows high usage of `kmalloc-256` that rises over time. My behavior follows this bug, but that's been fixed in the kernel I'm running, so I've ruled that out.
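For anyone wanting to reproduce the measurement, a rough way to log `kmalloc-256` growth over time (reading `/proc/slabinfo` needs root; the columns used below are from the standard slabinfo 2.x layout):

```bash
# Sketch: log kmalloc-256 usage every 5 minutes to see whether it only ever grows.
# /proc/slabinfo columns: name <active_objs> <num_objs> <objsize> ...
while true; do
  ts=$(date '+%F %T')
  sudo awk -v ts="$ts" '/^kmalloc-256 / {
    printf "%s kmalloc-256 active=%s total=%s approx_kB=%d\n", ts, $2, $3, $3 * $4 / 1024
  }' /proc/slabinfo
  sleep 300
done
```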
Stopping the `k3s` service stops the memory leak. It also releases the kernel memory I observed in use. Running `k3s-killall.sh` also clears further memory.

I suspect this is related to a process it's running, specifically `etcd`. I upgraded from 1.23 > 1.24 > 1.25 and the issue persisted. The odd thing is that it doesn't appear to affect Ubuntu 20.04, only 22.04.

This has been driving me insane for many weeks; I'd appreciate any insight. I see multiple threads here discussing memory and CPU utilization issues in general, but none of them appear to be this exact issue.