Add memory requests for nodes and controlplane workloads #49
base: master
Conversation
Force-pushed c24096b to 7133e9e
assets/lokomotive-kubernetes/bootkube/resources/charts/kubernetes/templates/kube-scheduler.yaml
I think adding these fields makes sense, but I'm not sure what the reasoning was for picking these values. Can you please share the reasoning (and add it to the commit)?
I can't really review whether the numbers make sense without knowing how they were calculated :)
@@ -95,7 +95,8 @@ systemd:
     --pod-manifest-path=/etc/kubernetes/manifests \
     --read-only-port=0 \
     --register-with-taints=$${NODE_TAINTS} \
-    --volume-plugin-dir=/var/lib/kubelet/volumeplugins
+    --volume-plugin-dir=/var/lib/kubelet/volumeplugins \
+    --kube-reserved=memory=500Mi
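(Context, not part of the diff: by default --kube-reserved does not cap any process on its own; it lowers the node's reported Allocatable, which is what the scheduler works against. A quick way to compare the two on a node, with <node-name> as a placeholder:)

# Capacity vs. Allocatable memory as reported by the kubelet:
kubectl get node <node-name> -o jsonpath='{.status.capacity.memory}{"\n"}{.status.allocatable.memory}{"\n"}'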
Where does this value come from?
The values are a snapshot of kubectl top pods --sort-by=memory | sort -h -k3 -r after deploying a cluster from the example configuration. They should perhaps be configurable, as they will change depending on the cluster size.
Those values are minimal, just to have something in place.
Should I add such a comment to the commit message?
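(For reference, the snapshot command spelled out; MEMORY is the third column of kubectl top pods output, which is why sort uses -k3:)

# Pods sorted by memory usage, largest first; -h compares human-readable sizes like 120Mi.
kubectl top pods --sort-by=memory | sort -h -k3 -r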
Yes, please do.
I'd like to get some values from production clusters too... but we can update those later, maybe?
Wow, kubelet+docker was using 500MB?
Note: ssh, etc. should go under --system-reserved, say the docs.
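(For illustration only, a sketch of the split the docs describe; the numbers below are placeholders, not the values proposed in this PR. OS daemons such as sshd are accounted under --system-reserved, while the kubelet and container runtime go under --kube-reserved:)

# Illustrative kubelet flags, continuing the style of the unit file above:
  --system-reserved=memory=500Mi \
  --kube-reserved=memory=100Mi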
Ah right. I'll use system-reserved then.
Done. PTAL.
assets/lokomotive-kubernetes/bootkube/resources/charts/calico/templates/daemonset.yaml (resolved)
Force-pushed 7133e9e to 1357be0
Right, sorry for not providing that straight away. I've now added the measurement methodology etc. to the commit message. Please take a look.
Force-pushed aefa784 to ac21d82
Force-pushed ac21d82 to 5bb18fa
This commit adds memory requests for all nodes and controlplane workloads. The reasoning behind it is to better show the user the available resources on both worker and controller nodes, e.g. when doing 'kubectl describe node'. This is important when scaling the controlplane deployments and may prevent node eviction.

The measurement was done on a freshly created cluster, with prometheus-operator and metrics-server deployed, on a controller node and on a worker node, so the numbers might be lower than on a long-running cluster, but they give at least some initial visibility.

The values were measured using 'systemd-cgtop -m -1 / --depth=1' rather than 'free', as 'systemd-cgtop' also includes page cache usage, the same way 'kubelet' measures memory usage. Before the measurement, the following command was executed to make sure only actively used memory was captured: 'sync; echo 1 | sudo tee /proc/sys/vm/drop_caches; sleep 10'.

system.slice uses ~250Mi and init.scope uses ~200Mi, which sums up to roughly 500Mi needed for the system. Kubelet in the /docker slice was using ~100Mi and etcd in the /docker slice was using ~200Mi, so workers have 100Mi reserved for 'kube' and controllers have 300Mi.

Memory usage for self-hosted components was measured using the following command: 'kubectl top pods --sort-by=memory | sort -h -k3 -r'. Then the read values were rounded up a bit.

Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io>
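(For reproducibility, the node-level part of the methodology above as a short sketch; it assumes shell access to the node and sudo:)

# Drop the page cache so only actively used memory is captured:
sync; echo 1 | sudo tee /proc/sys/vm/drop_caches; sleep 10
# Memory usage per top-level cgroup; -m sorts by memory, -1 runs a single iteration.
# Unlike 'free', this accounting includes page cache, matching how the kubelet measures usage:
sudo systemd-cgtop -m -1 / --depth=1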
To avoid duplicating the template logic and to make the configuration more readable, as having a quote (") right before the template logic is very confusing.

Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io>
Force-pushed 5bb18fa to f907568
This seems like something that would be done inside cgroup slices on the host, to ensure the system can still operate. I guess it seems arbitrary and inflexible to declare numbers like this. Right?