
kubeadm join fails when multiple CRI sockets exist #1495

Closed
mythi opened this issue Apr 9, 2019 · 16 comments · Fixed by kubernetes/kubernetes#76505

mythi commented Apr 9, 2019

What keywords did you search in kubeadm issues before filing this one?

kubeadm-config

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"archive", BuildDate:"2019-03-29T16:29:07Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version): 1.14.0
  • Cloud provider or hardware configuration: baremetal
  • OS (e.g. from /etc/os-release): Clear Linux
  • Kernel (e.g. uname -a): 4.20
  • Others:

What happened?

On a node that has both /var/run/dockershim.sock and /var/run/crio/crio.sock, kubeadm join --cri-socket /var/run/crio/crio.sock fails with an error:

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: Found multiple CRI sockets, please use --cri-socket to select one: /var/run/dockershim.sock, /var/run/crio/crio.sock

Passing --cri-socket has no effect.

What you expected to happen?

Passing --cri-socket would resolve the conflict.

How to reproduce it (as minimally and precisely as possible)?

Have two CRI runtimes running, so that both crio.sock and dockershim.sock are present.
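A quick way to confirm the precondition is to check that both sockets exist before running kubeadm join (a sketch, using the socket paths from the error above):

 ls -l /var/run/dockershim.sock /var/run/crio/crio.sock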

rosti commented Apr 9, 2019

Hi @mythi, thank you for your bug report!
I tried to replicate this issue, but I was not able to. Can you please provide the kubeadm join command you used? You can mask out any sensitive information (like the token) by replacing it with something meaningless.

rosti commented Apr 9, 2019

/assign

rosti commented Apr 9, 2019

/priority awaiting-more-evidence

@k8s-ci-robot k8s-ci-robot added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Apr 9, 2019

mythi commented Apr 9, 2019

@rosti I tried both

 sudo kubeadm join <apiserver>:6443 --token <token>  --discovery-token-ca-cert-hash sha256:<hash> --cri-socket /var/run/crio/crio.sock

and

sudo kubeadm join --config kubeadm-config.yaml

with the information in kubeadm-config.yaml
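For reference, a config along these lines expresses the same join (a minimal sketch against the v1beta1 API that ships with kubeadm 1.14; the endpoint, token, and hash are placeholders):

apiVersion: kubeadm.k8s.io/v1beta1
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: "<apiserver>:6443"
    token: "<token>"
    caCertHashes:
    - "sha256:<hash>"
nodeRegistration:
  criSocket: "/var/run/crio/crio.sock"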

neolit123 commented Apr 9, 2019

this is odd. are you sure you are running kubeadm 1.14?

mythi commented Apr 10, 2019

@neolit123 both master and worker nodes have

kubeadm version: &version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"archive", BuildDate:"2019-03-29T16:29:07Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

mythi commented Apr 10, 2019

One more data point: after disabling Docker on the system, everything works OK. This only triggers when both Docker and CRI-O are running.
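For anyone needing a workaround in the meantime, one way to leave only the CRI-O socket on a systemd host is to stop and disable the Docker units (a sketch; unit names can vary by distro):

 sudo systemctl disable --now docker.service docker.socket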

fabriziopandini commented Apr 11, 2019

/kind bug
/priority backlog
/milestone v1.15

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Apr 11, 2019
neolit123 commented Apr 11, 2019

@rosti tested this in front of me and the bug is not reproducible.
there is something special about @mythi's setup.

@fabriziopandini fabriziopandini added this to the v1.15 milestone Apr 11, 2019
mythi commented Apr 11, 2019

I wonder what that might be. I have two colleagues who have seen this too, and based on their reports I tried it out and reproduced it right away. They're all on Clear Linux, but there's nothing special there as far as I can tell.

@neolit123 @rosti did you also try with CRI-O and dockerd running?

rosti commented Apr 12, 2019

Yes, they were both running and their sockets were present.
Besides this being Clear Linux, which Docker and CRI-O versions are you using?

mythi commented Apr 12, 2019

Docker: Server Version: 18.06.2
CRI-O: crio version 1.13.1

rosti commented Apr 12, 2019

Ok, I just reproduced it. It's a genuine bug; a peculiarity of my setup was masking it.
I am working on a fix now.

/remove-priority backlog
/remove-priority awaiting-more-evidence
/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Apr 12, 2019
Docteur-RS commented

Hey,

My kubespray setup uses kubeadm v1.15.3 and it crashes when executing the command: kubeadm init phase kubeconfig admin
It fails with the following error:
"Found multiple CRI sockets, please use --cri-socket to select one: /var/run/dockershim.sock, /var/run/crio/crio.sock"

If only I could simply add the --cri-socket flag to this command... Unfortunately, that flag is not compatible with the init phase XXXX commands. It's telling me error: unknown flag: --cri-socket


Small recap:

kubeadm init => fails ❌
kubeadm init phase --cri-socket /var/run/crio/crio.sock => works ✔️
kubeadm init phase kubeconfig admin --cri-socket /var/run/crio/crio.sock => fails ❌
-> error: unknown flag: --cri-socket

@rosti Am I not using the command-line API the correct way?
thx ;-)

neolit123 commented Sep 13, 2019

this:

kubeadm init --cri-socket ...

should work.

if you want to use phases, have you tried passing a config (InitConfiguration)?

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  criSocket: "/var/run/dockershim.sock"

see:
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2

Docteur-RS commented Sep 16, 2019

I finally got it!

The initial kubespray command was:
kubeadm init phase kubeconfig admin --kubeconfig-dir {{ kube_config_dir }}/external_kubeconfig

⚠️ It seems that the --kubeconfig-dir flag was not taking the number of CRI sockets into account. ⚠️

So I changed the line to:
kubeadm init phase kubeconfig admin --config /etc/kubernetes/kubeadm-config.yaml
as you suggested, and it worked.

Thx!


For people having similar issues:

The config part that made it work on the master is the following:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.3.15
  bindPort: 6443
certificateKey: 9063a1ccc9c5e926e02f245c06b8d9f2ff3c1eb2dafe5fbe2595ab4ab2d3eb1a
nodeRegistration:
  name: p3kubemaster1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  criSocket: /var/run/crio/crio.sock

In kubespray you must update the file roles/kubernetes/client/tasks/main.yml around line 57; a sketch of the change follows.
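For illustration, the edited task could look roughly like this (a sketch only: the task name here is hypothetical, and {{ bin_dir }} is kubespray's variable for its binary directory, so adapt it to your checkout):

- name: Write out the admin kubeconfig
  command: >-
    {{ bin_dir }}/kubeadm init phase kubeconfig admin
    --config /etc/kubernetes/kubeadm-config.yaml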
