
Changing master IP address #338

Closed
analytik opened this issue Jul 6, 2017 · 31 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@analytik

analytik commented Jul 6, 2017

I'm using a provider that dynamically assigns private IP addresses on node startup, and it seems to break kubeadm-based setup.

I set up a brand new master server with kubeadm and it worked well, but after shutting the machine down and bringing it back up, the private IP address changed. Now when using kubectl I get the error Unable to connect to the server: x509: certificate is valid for 10.96.0.1, 10.4.36.13, not 10.4.20.67
(The latter being the new IP address of the master server.)

Is there a way to run kubeadm init in a way to reset the configuration? E.g. I want to keep the cluster pods, RCs, etc, but I want to re-init the certificate to use a hostname instead of IP address.

When I try running init again with hostname instead of the default IP address, it disagrees with me:

[06:20 root@scumbag01 ~] > kubeadm init --apiserver-advertise-address scumbag01 --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Skipping pre-flight checks
[certificates] Using the existing CA certificate and key.
[certificates] Using the existing API Server certificate and key.
[certificates] Using the existing API Server kubelet client certificate and key.
[certificates] Using the existing service account token signing key.
[certificates] Using the existing front-proxy CA certificate and key.
[certificates] Using the existing front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
a kubeconfig file "/etc/kubernetes/admin.conf" exists already but has got the wrong API Server URL

It picks up the now-unusable certificate for 10.4.36.13, an IP address outside of my control, instead of regenerating it.

If I remove /etc/kubernetes/*.conf, and re-run the init above it still writes server: https://10.4.20.67:6443 instead of using the hostname.

Should kubeadm init overwrite the setting and create a new certificate? Is there a plan to add kubeadm reset or similar functionality that would reset the cluster, or destroy all artifacts created by previous kubeadm init so that I can have a fresh start?

  • kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T22:55:19Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes version: 1.7.0
  • Cloud provider or hardware configuration: Scaleway, Intel ATOM x64
  • OS (e.g. from /etc/os-release): Debian Jessie
  • Kernel: 4.9.20
@luxas
Member

luxas commented Jul 6, 2017

This is not a limitation of kubeadm, just general security practice.
The certificate is signed for {your-old-IP-here}, so secure communication can't then happen with {your-new-ip-here}.

You can add more IPs to the certificate beforehand though...
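For example (a sketch; the hostname below is a placeholder, and the exact flag or config-file field name depends on your kubeadm version):

    # at cluster creation time, add extra names/IPs to the API server serving cert
    kubeadm init --apiserver-cert-extra-sans=master.example.com,10.4.20.67

    # roughly equivalent via a kubeadm config file (field name varies by config API version):
    # apiServer:
    #   certSANs:
    #   - "master.example.com"
    #   - "10.4.20.67"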

@luxas added the kind/support label on Jul 6, 2017
@analytik
Author

analytik commented Jul 6, 2017

Thank you for your response.

Since the IP addresses are assigned by the cloud provider, generating a certificate beforehand would only work if I could set it to a wildcard. (Sorry, I know nothing about certificates.)

I overlooked that kubeadm reset actually exists, since it's not mentioned in the reference guide. Reset and init worked well enough for me, and I guess I will avoid shutting down the master machine - I assume my problem is rare and far from any production use cases. Still, I wonder if there's a better way. I guess I could mimic the kubeadm reset steps but keep the etcd data folder to preserve my cluster setup?

Either way, thank you for all the work done on kubeadm! It's magical to see the cluster come up in minutes - I've been using Kubernetes since 0.14, in production since 1.0.
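(On the idea of mimicking kubeadm reset while keeping the etcd data: a rough, untested sketch assuming the default kubeadm paths; @valerius257's comment further down in this thread does essentially this. The --ignore-preflight-errors flag is only available on newer kubeadm releases.)

    systemctl stop kubelet docker

    # keep a copy of the etcd data and the CA material, wipe the generated certs/configs
    cp -a /var/lib/etcd /var/lib/etcd-backup
    mv /etc/kubernetes /etc/kubernetes-backup
    mkdir -p /etc/kubernetes
    cp -r /etc/kubernetes-backup/pki /etc/kubernetes/
    rm /etc/kubernetes/pki/{apiserver.*,etcd/peer.*}

    systemctl start docker

    # re-init against the preserved etcd data
    kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd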

@spnzig

spnzig commented Aug 4, 2017

@analytik I have exactly the same problem as yours. My corporate network blocks gcr.io, so I am using a dongle for the install. However, the provider IP keeps changing dynamically and is not under my control, so I am also looking for a solution. Even if I keep my dongle plugged in, the IP sometimes changes due to network resets. Do you have any solution to this? How are you handling it?
@luxas could you please suggest how I can proceed? I am a newbie to K8s and completely lost with this configuration. Could you let me know how I can fix this dynamic IP issue?

@372046933

How do you guys deal with the changed master IP?

@saravanaksk1982

Is there any update on this issue?

@GabMgt

GabMgt commented Jul 24, 2018

Same problem here. Is there any documentation on how to change the master IP without resetting the entire cluster, please?

@patricklucas

patricklucas commented Jul 24, 2018

I was able to accomplish this by:

  • replacing the IP address in all config files in /etc/kubernetes
  • backing up /etc/kubernetes/pki
  • identifying certs in /etc/kubernetes/pki that have the old IP address as an alt name[1]
  • deleting both the cert and key for each of them (for me it was just apiserver and etcd/peer)
  • regenerating the certs using kubeadm alpha phase certs[2]
  • identifying configmaps in the kube-system namespace that reference the old IP[3]
  • manually editing those configmaps
  • restarting kubelet and docker to force all containers to be recreated (commands shown after [3] below)

[1]

/etc/kubernetes/pki# for f in $(find -name "*.crt"); do openssl x509 -in $f -text -noout > $f.txt; done
/etc/kubernetes/pki# grep -Rl 12\\.34\\.56\\.78 .
./apiserver.crt.txt
./etcd/peer.crt.txt
/etc/kubernetes/pki# for f in $(find -name "*.crt"); do rm $f.txt; done

[2]

/etc/kubernetes/pki# rm apiserver.crt apiserver.key
/etc/kubernetes/pki# kubeadm alpha phase certs apiserver
...
/etc/kubernetes/pki# rm etcd/peer.crt etcd/peer.key
/etc/kubernetes/pki# kubeadm alpha phase certs etcd-peer
...

[3]

$ kubectl -n kube-system get cm -o yaml | less
...
$ kubectl -n kube-system edit cm ...
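(The last bullet is just a service restart; on a typical systemd host that would be something like:)

    systemctl restart kubelet
    systemctl restart docker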

@GabMgt

GabMgt commented Jul 25, 2018

Wow, I was unaware of these commands. Great info, that did the trick. Thank you!

@mariamTr

mariamTr commented Aug 9, 2018

Is there a way to find the configmaps manually and change them?

@Reasno

Reasno commented Aug 16, 2018

I hope kubeadm can cover this process in a future release.

@weisjohn

weisjohn commented Sep 5, 2018

@patricklucas seriously, thank you for that write-up. It saved my life.

For those looking for even more clarity, here were my experiences:

  1. replace the IP address in all config files in /etc/kubernetes

    oldip=192.168.1.91
    newip=10.20.2.210
    cd /etc/kubernetes
    # see before
    find . -type f | xargs grep $oldip
    # modify files in place
    find . -type f | xargs sed -i "s/$oldip/$newip/"
    # see after
    find . -type f | xargs grep $newip
  2. backing up /etc/kubernetes/pki

    mkdir ~/k8s-old-pki
    cp -Rvf /etc/kubernetes/pki/* ~/k8s-old-pki
  3. identifying certs in /etc/kubernetes/pki that have the old IP address as an alt name (this could be cleaned up)

    cd /etc/kubernetes/pki
    for f in $(find -name "*.crt"); do 
      openssl x509 -in $f -text -noout > $f.txt;
    done
    grep -Rl $oldip .
    for f in $(find -name "*.crt"); do rm $f.txt; done
  4. identify configmaps in the kube-system namespace that reference the old IP, and edit them:

    # find all the config map names
    configmaps=$(kubectl -n kube-system get cm -o name | \
      awk '{print $1}' | \
      cut -d '/' -f 2)
    
    # fetch all for filename reference
    dir=$(mktemp -d)
    for cf in $configmaps; do
      kubectl -n kube-system get cm $cf -o yaml > $dir/$cf.yaml
    done
    
    # have grep help you find the files to edit, and where
    grep -Hn $dir/* -e $oldip
    
    # edit those files, in my case, grep only returned these two:
    kubectl -n kube-system edit cm kubeadm-config
    kubectl -n kube-system edit cm kube-proxy
  5. change the IP address (via cli or gui for your distro)

  6. delete both the cert and key for each identified by grep in the prior step, regenerate those certs

    NOTE: prior to recreating the certs via kubeadm alpha phase certs ..., you'll need to have the new IP address applied

    rm apiserver.crt apiserver.key
    kubeadm alpha phase certs apiserver
    
    rm etcd/peer.crt etcd/peer.key
    kubeadm alpha phase certs etcd-peer
  7. restart kubelet and docker

    sudo systemctl restart kubelet
    sudo systemctl restart docker
  8. copy over the new config

    sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config

@mariamTr ^
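(Aside: a quick way to verify that the regenerated serving cert actually contains the new address, assuming the default kubeadm PKI path:)

    openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
      | grep -A1 'Subject Alternative Name'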

@weisjohn

weisjohn commented Sep 5, 2018

Another thing to note: changing the certs was possible in offline mode by specifying the k8s version in a config file: kubernetes/kubernetes#54188 (comment)
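A minimal sketch of such a config file (the apiVersion and field names vary across kubeadm releases, so check kubeadm config print init-defaults, or print-default on older releases, for yours):

    # kubeadm.yaml
    apiVersion: kubeadm.k8s.io/v1beta1
    kind: ClusterConfiguration
    kubernetesVersion: v1.13.0

    # then, for example:
    # kubeadm alpha phase certs apiserver --config kubeadm.yaml   (pre-1.13)
    # kubeadm init phase certs apiserver --config kubeadm.yaml    (1.13+)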

@michaelfig

@weisjohn Could you also please update your comment by noting that:

kubectl edit cm -nkube-public cluster-info

is also needed for kubeadm?

Otherwise, my kubeadm join commands keep failing by using the old/wrong apiserver IP halfway through the process.

Thanks!
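(For anyone unsure which ConfigMap that is, a quick way to check what it currently advertises and to edit it:)

    kubectl -n kube-public get cm cluster-info -o yaml | grep server
    kubectl -n kube-public edit cm cluster-info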

@vdboor

vdboor commented Nov 28, 2018

I've applied all the steps from @weisjohn (#338 (comment)) and @michaelfig (#338 (comment)) to replace the address everywhere.

This is used to let kubernetes use the newly created VPC address on eth1, instead of the public IP on eth0. Yet when I run kubeadm upgrade diff v1.12.3 it still wants to revert the changes to --advertise-address in /etc/kubernetes/manifests/kube-apiserver.yaml.

Any clues?

Even in kubectl get all --export=true --all-namespaces -o yaml the old IP isn't present anywhere

Update: it turns out that kubeadm upgrade diff did suggest a change, but kubeadm upgrade apply didn't actually change the address at all. (One of many bugs that Kubernetes 1.13 likely fixes.)

@RatanVMistry

@weisjohn Thank you for:

[quoting @weisjohn's step-by-step write-up above in full]

Thank you for the steps.
Can you give more detail on what changes need to be done on the master node, and what procedure the existing worker nodes should follow afterwards to join the reconfigured master node?

Thanks in advance :)

@vdboor

vdboor commented Dec 5, 2018

Perhaps good to mention: when moving the master IP to a private network, it can be useful to update the overlay network too. Calico wasn't using the VPC interface until it was explicitly bound to that interface:

    env:
      - name: IP_AUTODETECTION_METHOD
        value: interface=eth1
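(If you'd rather not edit the manifest by hand, the same environment variable can be set with kubectl, assuming the DaemonSet is named calico-node as in the stock manifests:)

    kubectl -n kube-system set env daemonset/calico-node IP_AUTODETECTION_METHOD=interface=eth1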

@sunlove123

sunlove123 commented Dec 13, 2018

kubeadm alpha phase certs apiserver

@weisjohn kubeadm alpha phase certs apiserver is not working in v1.13.0; it shows "This command is not meant to be run on its own. See list of available subcommands." Is there an updated command?

@neolit123
Member

In 1.13 the command is called kubeadm init phase certs apiserver:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/#cmd-phase-certs
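So on 1.13+ the regeneration step from the earlier write-ups becomes, for example:

    cd /etc/kubernetes/pki
    rm apiserver.crt apiserver.key
    kubeadm init phase certs apiserver

    rm etcd/peer.crt etcd/peer.key
    kubeadm init phase certs etcd-peer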

@bboreham

bboreham commented Jan 3, 2019

Very useful steps to remedy - thanks @patricklucas and @weisjohn !

One extra tip if, like me, you start from a state where the IP address has already changed, so you cannot contact the api-server to change the configmaps in step 4:
The api-server certificate is signed for hostname kubernetes, so you can add that as an alias to the new IP address in /etc/hosts then do kubectl --server=https://kubernetes:6443 ....
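Concretely (using the new master IP from this thread as an example value):

    # add an alias for the new master IP on the machine you run kubectl from
    echo "10.4.20.67 kubernetes" | sudo tee -a /etc/hosts

    kubectl --server=https://kubernetes:6443 -n kube-system get cm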

@amravyan

@bboreham @weisjohn @patricklucas Thanks a lot for sharing your experience. Could you please give some advice on what I should do on the worker nodes after changing the IP on the master node?
Delete them and re-add them to the cluster, or just change /etc/kubernetes/kubelet.conf and /etc/kubernetes/pki/ca.crt manually?

@valerius257

valerius257 commented Feb 6, 2019

I know it's an old issue, but maybe my comment will be of use to someone.
Unfortunately, the solution proposed by @patricklucas and @weisjohn didn't work for me, so I created my own:

systemctl stop kubelet docker

cd /etc/

# backup old kubernetes data
mv kubernetes kubernetes-backup
mv /var/lib/kubelet /var/lib/kubelet-backup

# restore certificates
mkdir -p kubernetes
cp -r kubernetes-backup/pki kubernetes
rm kubernetes/pki/{apiserver.*,etcd/peer.*}

systemctl start docker

# reinit master with data in etcd
# add --kubernetes-version, --pod-network-cidr and --token options if needed
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd

# update kubectl config
cp kubernetes/admin.conf ~/.kube/config

# wait for some time and delete old node
sleep 120
kubectl get nodes --sort-by=.metadata.creationTimestamp
kubectl delete node $(kubectl get nodes -o jsonpath='{.items[?(@.status.conditions[0].status=="Unknown")].metadata.name}')

# check running pods
kubectl get pods --all-namespaces

@nolmit

nolmit commented Mar 8, 2019

@valerius257 thank you man, you saved our weekend :)

@sachinjambhulkar

sachinjambhulkar commented Mar 19, 2019

Thanks @valerius257 👍
I tried all the write-ups/instructions from @patricklucas and @weisjohn. They didn't work for my cluster. The good part is that those instructions highlight some key aspects of certificates and keys and at what point they need to be taken care of.

The instructions from @valerius257 worked seamlessly until I hit issues very specific to my kubeadm master node, which I was trying to recover after its IP changed.

Continuing after the steps mentioned by @valerius257:
I was using the flannel network plugin on a single master node.
Flannel issue: kube-flannel-ds-xxxx back-off restarting failed container
Pod state: CrashLoopBackOff. Because of this, other pods like core-dns-xxx also fail to come up.

Resolution: since I had initiated the cluster with kubeadm init and a pod network CIDR (back when the master node still had its old IP), the following step wiped the CIDR settings from the /etc/kubernetes/manifests/kube-controller-manager.yaml file:
kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd

Hence, if you initiated the kubeadm master node (with its first IP address) with the command "kubeadm init --token {{ kubeadm_token }} --pod-network-cidr=10.244.0.0/16", then after the new IP is allocated you should execute the same command with --pod-network-cidr=10.244.0.0/16:
"kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd --token {{ kubeadm_token }} --pod-network-cidr=10.244.0.0/16"

Or modify /etc/kubernetes/manifests/kube-controller-manager.yaml to include the following parameters, if they are missing under spec:containers:command (a manifest excerpt is shown below):

  • --allocate-node-cidrs=true
  • --cluster-cidr=10.244.0.0/16
  • --node-cidr-mask-size=24

Reference: pod cidr not assgned flannel-io/flannel#728, read the solution from @wkjun

Once the above changes are in place:

    systemctl stop kubelet docker
    sleep 20
    systemctl start docker kubelet

Check that all pods are up and running, including flannel:

    kubectl get pods -n kube-system
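As an illustration, the relevant part of that static pod manifest would look roughly like this (an excerpt only, not a complete manifest):

    # /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
    spec:
      containers:
      - command:
        - kube-controller-manager
        - --allocate-node-cidrs=true
        - --cluster-cidr=10.244.0.0/16
        - --node-cidr-mask-size=24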

Issue 2:
All pods in the application namespaces or in kube-system start showing errors in kubectl describe pod output, something like:
"Warning FailedScheduling default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate."
Execute the command: kubectl taint nodes --all node-role.kubernetes.io/master-
After that, describing the pods running in the application or kube-system namespaces no longer shows the errors mentioned. In a multi-node cluster you may have to be extra cautious.

@VipinKrizz

[quoting @weisjohn's step-by-step write-up above in full]

In the place of newip, which IP should we give?
Can we create an IP of our own?

@bboreham

bboreham commented Dec 3, 2019

@VipinKrizz the context of this issue is that the IP already changed due to factors within the infrastructure. Nobody can answer which IP you should use except someone familiar with your particular set-up.

Maybe you can find someone to have a chat with about this on Slack? Kubeadm issues are not the right place.

@weisjohn

weisjohn commented Dec 6, 2019

@valerius257 thanks so much for that script, I now see a number of downsides in my approach. I can confirm that your solution worked; however, there are lots of little edge cases (as in all of k8s). I had to re-apply any patches to enabled services / built-ins, DNS, special storage classes, etc.

But yeah, your script saved my bacon today.

@rajibul007

@valerius257 I followed your steps but I'm getting the issue below:

root@ubuntu:/etc/kubernetes/pki# kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd
W0122 10:15:34.819150 102032 version.go:101] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: dial tcp: lookup dl.k8s.io on 127.0.0.53:53: server misbehaving
W0122 10:15:34.819340 102032 version.go:102] falling back to the local client version: v1.16.3
[init] Using Kubernetes version: v1.16.3
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [ubuntu kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.120.137]
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [ubuntu localhost] and IPs [192.168.120.137 127.0.0.1 ::1]
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.

Unfortunately, an error has occurred:
timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

kindly help

@zillani

zillani commented Mar 28, 2020

[quoting @patricklucas's write-up above in full]

Worked for me, thanks.

The only thing is that you need to use

 kubeadm init phase ...

for the latest kubeadm versions.

@akshaysharama

akshaysharama commented May 13, 2020

@bboreham
I've followed the steps mentioned by @patricklucas.
As you mentioned, in step 4 I needed to add some configuration in /etc/hosts because the IP has already changed and I cannot connect to the api-server.

Generate certificate using
kubeadm init --kubernetes-version=v1.16.3 phase certs apiserver

I've changed /etc/hosts

and tried kubectl --server=https://:6443 but it's still not working :(

Is there any specific configuration I need to do in /etc/hosts?

@tanguitar

[quoting @weisjohn's step-by-step write-up above in full]

Thank you for your help, my problem was solved! Great job!

@SrihariRuttala

[quoting @patricklucas's write-up above in full]

If kubeadm alpha phase certs doesn't work, use kubeadm init phase certs.
In my case, with this command

/etc/kubernetes/pki# grep -Rl 12\\.34\\.56\\.78 .

I got the following certs:
./etcd/peer.crt.txt
./etcd/server.crt.txt
./apiserver.crt.txt
So we used the following commands to regenerate the certs:

kubeadm init phase certs apiserver
kubeadm init phase certs etcd-peer
kubeadm init phase certs etcd-server

Then exit from root mode and restart kubelet and docker/containerd:

sudo systemctl status kubelet containerd
sudo systemctl start kubelet containerd

Then it started working.
