Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm reset breaks CNI for others #1822

Closed
afbjorklund opened this issue Oct 5, 2019 · 19 comments · Fixed by kubernetes/kubernetes#83950
Closed

kubeadm reset breaks CNI for others #1822

afbjorklund opened this issue Oct 5, 2019 · 19 comments · Fixed by kubernetes/kubernetes#83950
Assignees
Labels
area/ecosystem kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@afbjorklund
Copy link

afbjorklund commented Oct 5, 2019

What keywords did you search in kubeadm issues before filing this one?

  • reset
  • podman
  • /etc/cni/net.d

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:34:01Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:36:53Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"

  • Cloud provider or hardware configuration: VirtualBox
  • OS (e.g. from /etc/os-release): CentOS Linux 8 (Core)
  • Kernel (e.g. uname -a): 4.18.0-80.7.1.el8_0.x86_64
  • Others:

What happened?

When running kubeadm reset, it removed the entire /etc/cni/net.d directory.

This caused podman and any other tools using CNI networking to stop working...

What you expected to happen?

It should only remove the files that it installed itself, not the entire global directory.

In this case, it should only have removed /etc/cni/net.d/k8s.conf and not the rest.

How to reproduce it (as minimally and precisely as possible)?

sudo yum install podman
sudo minikube start --vm-driver=none (calls kubeadm init)
sudo minikube delete (calls kubeadm reset)
sudo podman run busybox

Anything else we need to know?

To restore previous functionality, one has to do: sudo yum reinstall podman.

The code is in: https://github.com/kubernetes/kubernetes/blob/v1.16.0/cmd/kubeadm/app/cmd/phases/reset/cleanupnode.go#L84


EDIT: Updated to make it clear that the commands used are coming from minikube

See kubernetes/minikube#5532 for the entire minikube context, running on CentOS 8.

@neolit123
Copy link
Member

In this case, it should only have removed /etc/cni/net.d/k8s.conf and not the rest.

how would we know what CNI plugin the user installed?

@neolit123 neolit123 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Oct 5, 2019
@neolit123 neolit123 added this to the v1.17 milestone Oct 5, 2019
@neolit123
Copy link
Member

@kubernetes/sig-cluster-lifecycle

@afbjorklund
Copy link
Author

In this case, it should only have removed /etc/cni/net.d/k8s.conf and not the rest.

how would we know what CNI plugin the user installed?

Right, it was actually minikube that installed the configuration file (and not kubeadm itself)

What I meant is that it probably shouldn't delete any config files that it hasn't installed itself ?

@rhatdan
Copy link

rhatdan commented Oct 6, 2019

You should at least check if the file is in the packaging database. Don't remove files installed by RPM or APT.

@neolit123
Copy link
Member

neolit123 commented Oct 6, 2019

Right, it was actually minikube that installed the configuration file (and not kubeadm itself)

given minikube installs files in that directory and given minikube uses kubeadm, it would be the responsibility of minikube to workaround potential conflicts or do required backups.

What I meant is that it probably shouldn't delete any config files that it hasn't installed itself ?

kubeadm itself does not install any files in the directory, it's the CNI plugin that does that and the CNI plugins is the responsibility of the user or higher level tools. however node cleanup is the responsibility of kubeadm reset.

if you wish to skip node cleanup you can use kubeadm reset --skip-phases=cleanup-node, this however will also skip cleanup of other locations. without propper node cleanup, subsequent kubeadm init calls can break for a variety of reasons.

example output from reset:

I1006 15:38:50.246717    2755 reset.go:98] [reset] Could not obtain a client set from the kubeconfig file: /etc/kubernetes/admin.conf
I1006 15:38:50.246851    2755 reset.go:116] [reset] Detected and using CRI socket: /var/run/dockershim.sock
[preflight] Running pre-flight checks
I1006 15:38:50.246892    2755 removeetcdmember.go:54] [reset] Checking for etcd config
W1006 15:38:50.246910    2755 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
I1006 15:38:50.246946    2755 cleanupnode.go:57] [reset] Getting init system
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
I1006 15:38:50.253172    2755 cleanupnode.go:79] [reset] Removing Kubernetes-managed containers
I1006 15:38:50.433197    2755 cleanupnode.go:87] [reset] Removing contents from the config and pki directories
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
I1006 15:38:50.433378    2755 reset.go:212] [reset] Deleting content of /var/lib/kubelet
I1006 15:38:50.433412    2755 reset.go:212] [reset] Deleting content of /etc/cni/net.d
I1006 15:38:50.433450    2755 reset.go:212] [reset] Deleting content of /var/lib/dockershim
I1006 15:38:50.433514    2755 reset.go:212] [reset] Deleting content of /var/run/kubernetes
I1006 15:38:50.433538    2755 reset.go:212] [reset] Deleting content of /var/lib/cni

so a question we can ask here is would it be OK for kubeadm to simply not remove /etc/cni/net.d and leave the remaining network configurations.

after doing some quick tests, first using calico leaving it's configuration and then using weave. the answer is that it's not OK. the presence of multiple CNI configurations in that directory breaks followup kubeadm init, as the CoreDNS pods get stuck in a ContainerCreating state.

@neolit123
Copy link
Member

neolit123 commented Oct 6, 2019

You should at least check if the file is in the packaging database. Don't remove files installed by RPM or APT.

CNI plugin files are not tracked in packaging databases, to my knowledge. so if the user has installed something custom in that directory kubeadm will clean that too.

another problem is that we don't wish to execute package manager semantics from kubeadm.

@neolit123
Copy link
Member

i have added this topic for the next kubeadm office hours agenda (this Wednesday).

@afbjorklund
Copy link
Author

given minikube installs files in that directory and given minikube uses kubeadm, it would be the responsibility of minikube to workaround potential conflicts or do required backups.

For minikube we will probably just do like CRI-O does, and document it as a quirk of kubeadm ?
Normally people use dedicated VM's just for minikube, so they wouldn't try to use CNI for others.

For the default option, where we create and destroy a VM just for minikube, we don't call "reset".
Instead the virtual machine is powered down and the disk destroyed. So it's only for the non-VM:

https://minikube.sigs.k8s.io/docs/reference/drivers/none/

CNI plugin files are not tracked in packaging databases, to my knowledge. so if the user has installed something custom in that directory kubeadm will clean that too.

In this case, the tracked file was installed by the podman system package in CentOS 8:

http://mirror.centos.org/centos/8.0.1905/AppStream/x86_64/os/Packages/podman-1.0.0-2.git921f98f.module_el8.0.0+58+91b614e7.x86_64.rpm

-rw-r--r-- 1 root root 483 Jun 25 16:45 /etc/cni/net.d/87-podman-bridge.conflist

However, kubeadm reset is not something everyone does... I guess reinstall "works" too:

sudo kubeadm reset
sudo yum reinstall podman

Thanks for looking into this issue.

@neolit123
Copy link
Member

For minikube we will probably just do like CRI-O does, and document it as a quirk of kubeadm ?

possibly referring to?
https://github.com/cri-o/cri-o/blob/master/tutorials/kubeadm.md#configuring-cni

First, CRI-O and kubeadm reset don't play well together, as kubeadm reset empties /etc/cni/net.d/. Therefore, it is good to change the crio.network.network_dir in crio.conf to somewhere kubeadm won't touch.

it would have been fine if kubeadm didn't clean this directory at all, except that multiple CNI networks, can break kubeadm init cluster creation.

In this case, the tracked file was installed by the podman system package in CentOS 8:

i figured that might be the case.

as pointed out earlier:

another problem is that we don't wish to execute package manager semantics from kubeadm.

it really feels kubeadm shouldn't get into the package manager support and abstraction.

However, kubeadm reset is not something everyone does...

it certainly a widely used command, especially in the baremetal case or in the case where users don't wish to re-create cloud provider or local VMs.

sudo kubeadm reset
sudo yum reinstall podman

this is a workaround, but obviously not ideal.

there is no way for kubeadm to know what files are owned by what application without package manager semantics, also, files in that directory may not be of package origin.

@afbjorklund
Copy link
Author

For minikube we will probably just do like CRI-O does, and document it as a quirk of kubeadm ?

possibly referring to?
https://github.com/cri-o/cri-o/blob/master/tutorials/kubeadm.md#configuring-cni

Yes, although I'm not sure I agree with the suggested cri-o workaround of moving the dir ?

So more likely it will be listed briefly under "paths that minikube delete will erase" or such

there is no way for kubeadm to know what files are owned by what application without package manager semantics, also, files in that directory may not be of package origin.

Indeed, the file created by minikube (k8s.conf) is not tracked by any package manager...

There is also no guarantee that kubeadm itself (or cri-o) is installed from such a package.

@neolit123
Copy link
Member

neolit123 commented Oct 6, 2019

Yes, although I'm not sure I agree with the suggested cri-o workaround of moving the dir ?

if cri-o is setup to use a different directory for it's CNI configuration, kubeadm will indeed not touch such a directory on reset. but CNI plugins configurations will still land in /etc/cni/net.d/, so such a node will not work, unless the user manually copies the CNI plugin configuration to the crio.network.network_dir directory.

i haven't experimented with that.

@bart0sh @kad how are you handling this in your cri-o usage?

@ereslibre
Copy link
Contributor

so a question we can ask here is would it be OK for kubeadm to simply not remove /etc/cni/net.d and leave the remaining network configurations.

I think this is a reasonable thing to do, not removing anything in that directory and printing a warning as we are doing with iptables rules.

@neolit123
Copy link
Member

I think this is a reasonable thing to do, not removing anything in that directory and printing a warning as we are doing with iptables rules.

yet, i'm seeing a breakage if both Calico and Weave configs are present in subsequent cluster creations.

added this as an agenda item for the office hours.

@rhatdan
Copy link

rhatdan commented Oct 7, 2019

Podman is definitely handled by a package manager.

rpm -qf /etc/cni/net.d/87-podman-bridge.conflist
podman-1.6.1-2.fc31.x86_64

@rhatdan
Copy link

rhatdan commented Oct 7, 2019

@mheon FYI.

Would CNI work if we installed the configuration in a location like /usr/share/cni/net.d?

@neolit123
Copy link
Member

Would CNI work if we installed the configuration in a location like /usr/share/cni/net.d?

the CNI plugins that i tested on Ubuntu always install configuration in /etc/cni/net.d/.

@mheon
Copy link

mheon commented Oct 7, 2019

@rhatdan We manually define the CNI config directories, so we could add an additional one if we wanted to. However, /etc/cni/net.d/ seems to be the standard, and I don't think other installed runtimes would see our CNI configs by default if we put them elsewhere. Also, I'm honestly not sure how precedence works across multiple directories, which could make default network selection very confusing.

@neolit123 neolit123 added the kind/design Categorizes issue or PR as related to design. label Oct 7, 2019
@neolit123
Copy link
Member

/assign @yastij

@neolit123
Copy link
Member

neolit123 commented Oct 9, 2019

@afbjorklund we discussed today during the kubeadm office hours and we decided that we are going to go with that proposal from @ereslibre :

I think this is a reasonable thing to do, not removing anything in that directory and printing a warning as we are doing with iptables rules.

@neolit123 neolit123 added kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ecosystem kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants