I don't know exactly why or how, but it seems that the etcd data dir is not empty when a control plane node is built using etcd_volume_size=10 and etcd_volume_type=encrypted-volumes in our deployment.
For example, here is a control plane node with the lost+found folder inside /var/lib/etcd:
It looks to me like some sort of timing problem, where cloud-init fails because the folder is not empty before the prekubeadmcommands even run?
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 audit: BPF prog-id=20 op=UNLOAD
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:23] Cloud-init v. 23.1.2-0ubuntu0~22.04.1 running 'modules:final' at Tue, 19 Mar 2024 11:22:22 +0000. Up 21.20 sec>
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] [init] Using Kubernetes version: v1.27.3
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] [preflight] Running pre-flight checks
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] error execution phase preflight: [preflight] Some fatal errors occurred:
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] [preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] To see the stack trace of this error execute with --v=5 or higher
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] 2024-03-19 11:22:24,453 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/>
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] 2024-03-19 11:22:24,454 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_>
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 cloud-init[901]: [2024-03-19 11:22:24] Cloud-init v. 23.1.2-0ubuntu0~22.04.1 finished at Tue, 19 Mar 2024 11:22:24 +0000. Datasource DataSourceOpenSt>
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 systemd[1]: dmesg.service: Deactivated successfully.
Mar 19 11:22:24 kube-bxbqe-xx6sh-cndp2 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=dmesg comm="systemd" exe="/usr/lib/systemd/systemd" hostn>
Mar 19 11:22:27 kube-bxbqe-xx6sh-cndp2 chronyd[818]: Selected source 158.101.188.125 (2.ubuntu.pool.ntp.org)
Mar 19 11:22:27 kube-bxbqe-xx6sh-cndp2 chronyd[818]: System clock wrong by -278.022585 seconds
Mar 19 11:22:00 kube-bxbqe-xx6sh-cndp2 kubelet[1144]: Flag --pod-infra-container-image has been deprecated, will be removed in a future release. Image garbage collector will get sandbox im>
Mar 19 11:22:00 kube-bxbqe-xx6sh-cndp2 kubelet[1144]: I0319 11:22:00.003195 1144 server.go:199] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet>
Mar 19 11:22:00 kube-bxbqe-xx6sh-cndp2 kubelet[1144]: E0319 11:22:00.003494 1144 run.go:74] "command failed" err="failed to load kubelet config file, error: failed to load Kubelet confi>
Mar 19 11:22:00 kube-bxbqe-xx6sh-cndp2 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Mar 19 11:22:00 kube-bxbqe-xx6sh-cndp2 systemd[1]: kubelet.service: Failed with result 'exit-code'.
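To make the failing condition concrete, here is a minimal Python sketch of a check that could be run on an affected node. It only illustrates why a directory that contains nothing but lost+found still counts as "not empty"; it is not kubeadm's actual preflight code. It assumes the etcd volume is formatted as ext4 (mkfs creates lost+found at the root of a new ext filesystem) and mounted at /var/lib/etcd. The helper names dir_entries and only_lost_and_found are hypothetical.

```python
import os
import tempfile


def dir_entries(path):
    """Return the sorted entry names directly under path ([] if path is missing)."""
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return []


def only_lost_and_found(path):
    """True when the only entry under path is the ext filesystem's lost+found."""
    return dir_entries(path) == ["lost+found"]


if __name__ == "__main__":
    # Simulate a freshly formatted volume mount point instead of touching
    # the real /var/lib/etcd (assumption: ext4 leaves lost+found behind).
    with tempfile.TemporaryDirectory() as mount_point:
        os.mkdir(os.path.join(mount_point, "lost+found"))
        print("entries:", dir_entries(mount_point))
        print("only lost+found:", only_lost_and_found(mount_point))
```

If only_lost_and_found("/var/lib/etcd") returns True on a failed node, the non-empty directory is probably explained by the fresh filesystem itself rather than by leftover etcd data.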
We are running:
Originally posted by @robincron in #305 (comment)