FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found #7083
Comments
On a side note, Kubespray ran once, upgrading from 1.18.9 to 1.19.5. Some components, like kube-apiserver, kube-scheduler and kube-controller-manager, were upgraded. But shortly after, kubeadm died with the error message:
Subsequent Kubespray runs keep failing with the same error message. |
Noted from history... The Ansible task that deletes the Except, as of Feb/2020, kubeadm checks for the existence of this exact Service name during upgrades. One of their issues was closed recently, blaming Kubespray for the error. It seems that changing the service name from |
I managed to work around the issue by obtaining a copy of the coredns svc from the cluster:
... reapply a refurbished copy of the same YAML to create a service named With that svc copy present, the upgrade worked out fine. Merry X-mas! |
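The workaround above (copy the existing coredns Service, adjust it, reapply it under a new name) could be sketched roughly as follows. This is only an illustration, not the commenter's actual commands: the function name `rename_service`, the default target name `kube-dns` (taken from the issue title), and the choice to drop the original ClusterIP are all assumptions.

```python
import copy

# Metadata fields populated by the API server; they should not be sent
# back when creating a new object from an exported manifest.
SERVER_FIELDS = ("uid", "resourceVersion", "creationTimestamp",
                 "selfLink", "managedFields")


def rename_service(manifest, new_name="kube-dns", cluster_ip=None):
    """Return a copy of a Service manifest that can be applied as new_name.

    `manifest` is the parsed JSON of an existing Service. Hypothetical
    helper: the name and defaults are assumptions for illustration.
    """
    svc = copy.deepcopy(manifest)  # leave the caller's dict untouched
    meta = svc.setdefault("metadata", {})
    meta["name"] = new_name
    for field in SERVER_FIELDS:
        meta.pop(field, None)
    # kubectl records the previously applied manifest in this annotation.
    meta.get("annotations", {}).pop(
        "kubectl.kubernetes.io/last-applied-configuration", None)
    svc.pop("status", None)

    spec = svc.setdefault("spec", {})
    # Two Services cannot share a ClusterIP, so the original one is dropped.
    spec.pop("clusterIP", None)
    spec.pop("clusterIPs", None)
    if cluster_ip is not None:
        spec["clusterIP"] = cluster_ip  # request a specific free address
    return svc
```

One way to use it: export the Service with `kubectl -n kube-system get svc coredns -o json`, pass the parsed JSON through `rename_service`, and feed the result to `kubectl apply -f -`. Whether to pin `cluster_ip` depends on which address kubeadm expects in your cluster.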
I've run into this problem as well. If you use nodelocaldns, in addition to deleting and renaming coredns service to kube-dns you also need to update the arguments on the nodelocaldns daemonset.
|
If the best path forward is to rename the service back to kube-dns, it looks like the deletion and recreation of the service (which happens today on every run) will need to take place earlier in order to occur prior to running the upgrade. |
For anyone not experiencing issues on upgrade, I'd be curious what your CoreDNS service IP is, and whether kubeadm created a kube-dns service on .10 of your If coredns is configured to use the X.X.X.10 address and you try @juliohm1978's workaround of creating a copy of the service called |
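To check which address ".10" refers to in a given cluster, the tenth address of the service subnet can be computed directly. A minimal sketch, assuming the subnet is known in CIDR form; the helper name `nth_service_ip` and the example subnet `10.233.0.0/18` are illustrative assumptions, not values taken from this issue.

```python
import ipaddress


def nth_service_ip(service_cidr, n):
    """Return the n-th address of the service subnet as a string."""
    network = ipaddress.ip_network(service_cidr)
    return str(network[n])  # index 0 is the network address itself


def uses_dns_slot(coredns_ip, service_cidr):
    """True if the coredns Service already occupies the tenth address."""
    return coredns_ip == nth_service_ip(service_cidr, 10)


# Assumed example subnet; substitute your cluster's service CIDR.
print(nth_service_ip("10.233.0.0/18", 10))  # 10.233.0.10
```

If `uses_dns_slot` returns true for your coredns Service, a second Service cannot be given the same ClusterIP, which matches the conflict described in the comment above.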
This isn't the most elegant solution, but it looks like there is already a PR, #6244, that simply ignores upgrade errors when kube-dns doesn't exist or when kubeadm wants to change the IP. |
@fejta-bot: Closing this issue. |
The issue persists, even on Kubespray 2.17.x. For anyone bumping into this, recreating the If you hit @dlouks's problem, where
In this case, the expected IP would be
|
@juliohm1978, I think the fix is in #6244. Looks like it needs a little rework now that kubernetes/master role moved to control-plane. |
Environment:
Cloud provider or hardware configuration: barebone installation - VMs
OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
Version of Ansible (ansible --version): 2.9.13
Version of Python (python --version): Python 2.7.17
Kubespray version (commit) (git rev-parse --short HEAD): v2.14.2 (75d648c)
Network plugin used: calico
Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"): inventory.txt
Command used to invoke ansible:
Output of ansible run: output.txt
Anything else do we need to know:
I've been trying to upgrade the cluster from 1.18.x to 1.19.x and I keep getting this error message. The isolated kubeadm output follows:
I tried setting CoreDNS to 1.7.0, which is what kubeadm would support under 1.19.5, but still no luck.
Any ideas what could be causing this?