K9s can't connect to cluster in logs but curl to cluster endpoint works #942

Closed
dgoradia opened this issue Nov 14, 2020 · 31 comments
Labels
question Further information is requested

Comments

@dgoradia

Describe the bug
When opening up k9s to connect to a cluster, it fails with Boom!! K9s can't connect to cluster.

Logs show that a GET request to the cluster version endpoint timed out:

11:16PM INF 🐶 K9s starting up...
11:16PM ERR K9s can't connect to cluster error="Get \"https://xxxxxxxxxxxxx.gr1.us-west-2.eks.amazonaws.com/version?timeout=5s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
11:16PM PNC K9s can't connect to cluster
Log file created at: 2020/11/13 23:16:49
Running on machine: MyMachine
Binary: Built with gc go1.15.3 for darwin/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
11:16PM ERR Boom! K9s can't connect to cluster
11:16PM ERR goroutine 1 [running]:
runtime/debug.Stack(0x3f27660, 0x2d5f303, 0x0)
	runtime/debug/stack.go:24 +0x9f
github.com/derailed/k9s/cmd.run.func1()
	github.com/derailed/k9s/cmd/root.go:75 +0x125
panic(0x2a01b00, 0xc000376460)
	runtime/panic.go:969 +0x1b9
github.com/rs/zerolog.(*Logger).Panic.func1(0xc000b82800, 0x1c)
	github.com/rs/zerolog@v1.18.0/log.go:338 +0x4f
github.com/rs/zerolog.(*Event).msg(0xc00082ee40, 0xc000b82800, 0x1c)
	github.com/rs/zerolog@v1.18.0/event.go:146 +0x202
github.com/rs/zerolog.(*Event).Msgf(0xc00082ee40, 0x2d835b0, 0x1c, 0x0, 0x0, 0x0)
	github.com/rs/zerolog@v1.18.0/event.go:126 +0x87
github.com/derailed/k9s/cmd.loadConfiguration(0xc000abfbc8)
	github.com/derailed/k9s/cmd/root.go:130 +0x733
github.com/derailed/k9s/cmd.run(0x3f07620, 0xc0004694c0, 0x0, 0x2)
	github.com/derailed/k9s/cmd/root.go:83 +0x8d
github.com/spf13/cobra.(*Command).execute(0x3f07620, 0xc00004c0d0, 0x2, 0x2, 0x3f07620, 0xc00004c0d0)
	github.com/spf13/cobra@v1.0.0/command.go:846 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0x3f07620, 0x0, 0x0, 0x0)
	github.com/spf13/cobra@v1.0.0/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.0.0/command.go:887
github.com/derailed/k9s/cmd.Execute()
	github.com/derailed/k9s/cmd/root.go:66 +0x2d
main.main()
	github.com/derailed/k9s/main.go:28 +0x1f8

To make sure my machine is able to obtain connectivity with that endpoint:

$ curl -k https://xxxxxxxxxxxxx.gr1.us-west-2.eks.amazonaws.com/version?timeout=5s
{
  "major": "1",
  "minor": "16+",
  "gitVersion": "v1.16.13-eks-2ba888",
  "gitCommit": "2ba888155c7f8093a1bc06e3336333fbdb27b3da",
  "gitTreeState": "clean",
  "buildDate": "2020-07-17T18:48:53Z",
  "goVersion": "go1.13.9",
  "compiler": "gc",
  "platform": "linux/amd64"
}

To Reproduce
Steps to reproduce the behavior:

  1. Try to connect to your EKS cluster with k9s

Expected behavior
Connect to cluster and k9s opens

Screenshots
[Screenshot: Screen Shot 2020-11-13 at 11 27 39 PM]

Versions (please complete the following information):

  • OS: MacOS
  • K9s:
Version:    v0.23.10
Commit:     a952806ebaa316e2c7d0949ad605fb4c944f2cd0
Date:       2020-11-10T15:21:22Z
  • K8s: 1.16 on EKS
@derailed derailed added the question Further information is requested label Nov 14, 2020
@derailed
Owner

@dgoradia Hmm... Can you actually connect to this cluster with kubectl? I.e., what does kubectl get no yield?
curl -k turns off certificate verification and only validates that the endpoint is reachable. Both kubectl and k9s use certs to connect to the API server. So, wild guess here, but maybe your creds are not quite set up to connect to AWS?
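To test the same /version endpoint with credentials rather than with curl -k, one option is to send the request through kubectl. A minimal sketch, assuming kubectl is on PATH; the function name version_via_kubectl is made up for illustration, and the 5s default only mirrors the timeout visible in the k9s log:

```shell
# Hypothetical helper: fetch /version through kubectl, so the certs and
# credentials from the kubeconfig are used (unlike `curl -k`).
# $1 is an optional request timeout; 5s mirrors the value in the k9s log.
version_via_kubectl() {
  kubectl --request-timeout="${1:-5s}" get --raw /version
}

# Usage (not run here):
#   time version_via_kubectl        # does it answer within 5s?
#   time version_via_kubectl 20s    # retry with a longer timeout
```

If this call succeeds but takes several seconds, that points toward latency rather than credentials.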

@dgoradia
Author

Yes I can connect to the cluster with kubectl fine. The curl command did actually connect to the cluster and return the data and I think it's the same endpoint used by k9s to test reachability.

kubectl get no returns the nodes successfully.

@tuusberg

I started getting that error on OSX after I upgraded to Big Sur. kubectl works for me, k9s says it can't connect to cluster.

@derailed
Owner

@tuusberg @dgoradia Thank you both for reporting back! Weird... I haven't updated to Big Sur yet, and I am not sure how it would impact k9s's ability to connect to your clusters.
Could you please attach the k9s debug logs here so we can track down this issue? Tx!!

@ferllings

ferllings commented Nov 24, 2020

I have the exact same issue.
For me the issue started before Big Sur.

I have 4 k3s clusters:

  • 2 external (v1.19.3) that I can't connect to
  • 1 external (v1.16.3) that I'm able to connect to, but not all the time
  • 1 on my local network (v1.17) that connects perfectly

Of course, all 4 work fine with kubectl.

Could this be a latency issue? My Internet connection isn't very fast, and that would explain why my local cluster connects fine.

@derailed
Owner

@ferllings Thank you for adding more details here. Every bit helps... Could you attach k9s debug logs so we can track this down? Tx!!

@dgoradia
Author

@derailed it's been intermittent for me and hasn't happened in the last few days so I have been unable to capture the logs yet. As soon as it happens again I will add the logs here.

It may have to do with me having a poor connection at times and k9s possibly having a low timeout for the response to its connectivity check, whereas kubectl takes longer to return results but returns them eventually.

I'm on macOS Catalina (10.15) and on kubernetes 1.16 (EKS)

@ferllings

Yes it seems to be intermittent for me as well.

9:51AM INF 🐶 K9s starting up...
9:51AM ERR K9s can't connect to cluster error="Get \"https://xxx:6443/version?timeout=5s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
9:51AM PNC K9s can't connect to cluster
9:51AM ERR Boom! K9s can't connect to cluster
9:51AM ERR goroutine 1 [running]:
runtime/debug.Stack(0x3f12340, 0x2d49f03, 0x0)
	runtime/debug/stack.go:24 +0x9f
github.com/derailed/k9s/cmd.run.func1()
	github.com/derailed/k9s/cmd/root.go:75 +0x125
panic(0x29ec8c0, 0xc0002bc990)
	runtime/panic.go:969 +0x1b9
github.com/rs/zerolog.(*Logger).Panic.func1(0xc0005c3420, 0x1c)
	github.com/rs/zerolog@v1.20.0/log.go:345 +0x4f
github.com/rs/zerolog.(*Event).msg(0xc00035c4e0, 0xc0005c3420, 0x1c)
	github.com/rs/zerolog@v1.20.0/event.go:147 +0x302
github.com/rs/zerolog.(*Event).Msgf(0xc00035c4e0, 0x2d6e084, 0x1c, 0x0, 0x0, 0x0)
	github.com/rs/zerolog@v1.20.0/event.go:127 +0x87
github.com/derailed/k9s/cmd.loadConfiguration(0xc00091fbc8)
	github.com/derailed/k9s/cmd/root.go:130 +0x733
github.com/derailed/k9s/cmd.run(0x3ef2300, 0xc0004a6f60, 0x0, 0x2)
	github.com/derailed/k9s/cmd/root.go:83 +0x8d
github.com/spf13/cobra.(*Command).execute(0x3ef2300, 0xc0001120a0, 0x2, 0x2, 0x3ef2300, 0xc0001120a0)
	github.com/spf13/cobra@v1.1.1/command.go:854 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0x3ef2300, 0x0, 0x0, 0x0)
	github.com/spf13/cobra@v1.1.1/command.go:958 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.1.1/command.go:895
github.com/derailed/k9s/cmd.Execute()
	github.com/derailed/k9s/cmd/root.go:66 +0x2d
main.main()
	github.com/derailed/k9s/main.go:28 +0x1f8

@bitva77

bitva77 commented Nov 30, 2020

I'm seeing this same "issue" against an EKS cluster. kubectl works fine. K9s is timing out. If I set --request-timeout=20s then it works, albeit slowly.
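The latency theory can be checked roughly by timing the /version request and comparing it with the timeout. A sketch, assuming curl and awk are available; probe_version_seconds and exceeds_timeout are illustrative names, and the endpoint is a placeholder:

```shell
# Hypothetical helper: print how many seconds a request to <server>/version takes.
# -k skips certificate verification, as in the curl check earlier in the thread.
probe_version_seconds() {
  curl -k -s -o /dev/null --max-time 30 -w '%{time_total}\n' "$1/version"
}

# Hypothetical helper: exit 0 when the elapsed seconds ($1) exceed the limit ($2).
exceeds_timeout() {
  awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t + 0 > limit + 0) }'
}

# Usage (not run here; replace the placeholder endpoint):
#   elapsed=$(probe_version_seconds "https://<your-api-server>")
#   if exceeds_timeout "$elapsed" 5; then
#     echo "slower than 5s: $elapsed"
#   fi
```

If the measured time is close to or above 5 seconds, starting k9s with a larger --request-timeout, as in the comment above, is a reasonable next step.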

@adelwin

adelwin commented May 7, 2021

I'm also having the same issue right now.
I noticed that this only happens with EKS; I've also tried AKS and KinD clusters.
If I pass the exact kubeconfig explicitly, it works, though kubectl picks up the same kubeconfig implicitly and works fine.

@derailed
Owner

derailed commented May 7, 2021

@adelwin Thank you for piping in. Which version of k9s are you using? I bumped the default timeout in the last drop. Did that help?

@adelwin

adelwin commented May 8, 2021

I think my issue is slightly different from the rest.
For the others, it seems that k9s is able to find the context but fails to connect to its API server.
For me, it seems that k9s is unable to even locate the context, though if I use kubectl without specifying the kubeconfig file, kubectl is able to find the context by itself.

$ k9s version
 ____  __.________
|    |/ _/   __   \______
|      < \____    /  ___/
|    |  \   /    /\___ \
|____|__ \ /____//____  >
        \/            \/

Version:    0.24.7
Commit:     303de07663dcb20899852a98d3ebf6ce2f1c7922
Date:       n/a
12:13AM INF 🐶 K9s starting up...
12:13AM WRN Unable to fetch APIGroups error=Unauthorized
12:13AM ERR Checking metrics-server error=Unauthorized
12:13AM ERR failed to connect to cluster error=Unauthorized
12:13AM INF No context specific skin file found -- /Users/adelwin/.k9s/admin-eks-test_skin.yml
12:13AM INF No skin file found -- /Users/adelwin/.k9s/skin.yml. Loading stock skins.
12:13AM ERR Cluster metrics failed error="ACCESS -- No API server connection"
12:13AM ERR PreferredRES - No API server connection
12:13AM WRN Fail CRDs load error="ACCESS -- No API server connection"
12:13AM ERR Custom view load failed /Users/adelwin/.k9s/views.yml error="open /Users/adelwin/.k9s/views.yml: no such file or directory"
12:13AM ERR CustomView watcher failed error="lstat /Users/adelwin/.k9s/views.yml: no such file or directory"
12:13AM ERR K9s can't connect to cluster error="the server has asked for the client to provide credentials"
12:13AM ERR Cluster metrics failed error="ACCESS -- No API server connection"
12:13AM ERR PreferredRES - No API server connection
12:13AM WRN Fail CRDs load error="ACCESS -- No API server connection"
12:13AM WRN Unable to dial discovery API error="No connection to cached dial"
12:13AM ERR K9s can't connect to cluster error="the server has asked for the client to provide credentials"
12:13AM ERR ClusterUpdater failed error="Conn check failed (1/5)"
12:13AM ERR K9s can't connect to cluster error="the server has asked for the client to provide credentials"
12:13AM ERR ClusterUpdater failed error="Conn check failed (2/5)"
12:13AM ERR K9s can't connect to cluster error="the server has asked for the client to provide credentials"
12:13AM ERR ClusterUpdater failed error="Conn check failed (3/5)"
12:13AM ERR K9s can't connect to cluster error="the server has asked for the client to provide credentials"
12:13AM ERR ClusterUpdater failed error="Conn check failed (4/5)"
12:13AM ERR K9s can't connect to cluster error="the server has asked for the client to provide credentials"
12:13AM ERR Conn check failed (5/5). Bailing out!

@vladtkachuk

Hey guys, I had a similar problem and accidentally found a cause and a solution for my case.

So, I am using oidc-login to connect to cluster, I also have multiple kubeconfig files for different contexts. The issue was that I had the same username but different credentials (client id/secret) for each context. This is likely to cause unexpected effects, especially when merging kubeconfig. Once I started using a unique username, k9s works perfectly, also combined with kubectx and kubens.

Here is the user name field in kubeconfig that I am talking about:

apiVersion: v1
clusters: ...
contexts: ...
users:
- name: my-user
  user: ...
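A quick way to look for this kind of collision is to list the user names from each kubeconfig file and print any name that shows up more than once. A sketch, assuming kubectl is on PATH; the helper names and file paths are hypothetical:

```shell
# Hypothetical helper: print the user names defined in one kubeconfig file,
# one per line.
kubeconfig_user_names() {
  kubectl config view --kubeconfig "$1" \
    -o jsonpath='{range .users[*]}{.name}{"\n"}{end}'
}

# Hypothetical helper: read names on stdin (one per line) and print those
# that occur more than once.
duplicated_names() {
  sort | uniq -d
}

# Usage (not run here; the paths are placeholders):
#   for f in ~/.kube/config-a ~/.kube/config-b; do
#     kubeconfig_user_names "$f"
#   done | duplicated_names
```

Any name printed is defined in more than one place and is a candidate for renaming.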

@Vazhnik

Vazhnik commented May 6, 2022

Hi!
I faced a similar issue today. Got the "Boom!! Cannot connect to cluster context-a." message instantly on the k9s command.

  • k9s - instant error
  • k9s --request-timeout="5s" - instant error
  • k9s --context context-a - works fine and I could switch to any of my contexts.

After exiting, k9s remembers my context, but this does not change anything.

  • k9s -> Boom!! Cannot connect to cluster context-b

My system is macOS Catalina 10.15.3.
k9s installed (and re-installed a few times) using brew install k9s

K9s is a great tool, hope this will be fixed :)

@rohithegde

rohithegde commented May 22, 2022

I faced a similar error on my Mac (OS - Big Sur and Monterey) while connecting to AKS from k9s though kubectl worked fine.
It went away after upgrading my k9s from v0.24.14 to v0.25.18.

@Julien-Dosiere

Julien-Dosiere commented Jun 18, 2022

I got the same issue. I tried many versions: it seems to work fine up to version v0.24.11, starting from v0.24.12 it can't connect to my clusters. I am on Ubuntu 20.04, using k8s v1.23.7 and have zero problem with kubectl.

@RomeroGaliza

RomeroGaliza commented Jun 26, 2022

For those who might end up here for a similar reason:

Similar issue/behavior when running the installation from Snap on Ubuntu 22.04 against GKE (Google), with their gke-gcloud-auth-plugin plugin.

users:
- name: gke_my-cluster_europe-west1_example
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin

In this scenario both kubectl/kubectx work as expected.

When installing from LinuxBrew (brew install derailed/k9s/k9s) k9s worked just fine (with same .kube/config).

Running:

Version:    v0.25.18
Commit:     6085039f83cd5e8528c898cc1538f5b3287ce117
Date:       2021-12-28T16:53:21Z

@slimus
Collaborator

slimus commented Jul 1, 2022

@RomeroGaliza @Julien-Dosiere hello! Could you please enable debug log level and share your k9s logs?
Run k9s with debug log: k9s -l debug
To find log path: k9s info
Thanks!

@madanrishi

madanrishi commented Jul 5, 2022

I don't know if this helps, but I solved an issue where k9s was not working with minikube while it was working with AWS EKS.
Below is what I posted in the k9s Slack; it is just a copy of how I solved it. Hopefully the issue is similar.

The issue was that I had EKS clusters already configured and was loading the KUBECONFIGs using the command below in .zshrc:
for f in $(ls ~/.kube/config/); do export KUBECONFIG="$HOME/.kube/config/$f:$KUBECONFIG"; done
So on every start of the terminal, KUBECONFIG was set up with all file paths present in the folder.
If there were already configurations in the folder $HOME/.kube/config, that messes up the config that is generated when you run minikube start.
How to resolve:

  1. Copy the config folder to make a backup.
  2. Delete all the files under the config folder.
  3. Run minikube start. If you ran it before, make sure to delete the docker container and volume first, so that it starts fresh.
  4. Once minikube starts, you should see a file in the config folder (in my case it had the name of one of the config files that was there before; it was not named minikube or config).
  5. Rename the file if you want to have multiple configs, and paste back the previous files from the backup.

There might be an easier way, but the above is how I fixed it.
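Separately from the fix above, the .zshrc loop can be written so that it skips anything that is not a regular file and does not leave a trailing colon in KUBECONFIG. A sketch for bash or POSIX sh; build_kubeconfig is a made-up name, it builds a fresh value instead of prepending to an existing one, and in zsh an unmatched glob is an error by default, so an empty directory would need extra handling:

```shell
# Hypothetical helper: print a colon-separated list of the regular files
# found directly inside the directory given as $1.
build_kubeconfig() {
  dir="$1"
  result=""
  for f in "$dir"/*; do
    [ -f "$f" ] || continue          # skip subdirectories and an unmatched glob
    result="${result:+$result:}$f"   # add ":" only between entries
  done
  printf '%s\n' "$result"
}

# Usage (not run here):
#   export KUBECONFIG="$(build_kubeconfig "$HOME/.kube/config")"
```

This does not address the minikube naming behavior described above; it only makes the resulting KUBECONFIG value predictable.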

@mnpenner

mnpenner commented Jul 9, 2022

The issue I'm having is from my weird WSL setup. I got the logs as per #942 (comment). Read them with less -R /tmp/k9s-mpen.log because they contain colors. Press End to get to the bottom. My error was:

11:31AM ERR can't connect to cluster error="Get "https://***.k8s.ondigitalocean.com/version?timeout=10s": getting credentials: exec: executable C:\Users\Mark\bin\doctl.exe not found\n\nIt looks like you are trying to use a client-go credential plugin that is not installed.\n\nTo learn more about this feature, consult the documentation available at:\n https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins"

I thought maybe installing k9s under Windows instead might work but that's even worse; the UI is all mangled and it still doesn't connect.

kubectl is installed normally under WSL, so I don't know why k9s is having trouble.

I can connect to my Docker Desktop fine through k9s, it's just my prod cluster on DigitalOcean I can't connect to.
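Since the log file contains color codes, it can help to strip them before pasting log lines into an issue. A small sketch using sed; strip_ansi is an illustrative name, and it only removes color (SGR) escape sequences:

```shell
# Hypothetical helper: remove ANSI color sequences (ESC [ ... m) from stdin.
# The ESC byte is produced with printf so this also works with BSD sed.
strip_ansi() {
  esc=$(printf '\033')
  sed "s/${esc}\[[0-9;]*m//g"
}

# Usage (not run here):
#   strip_ansi < /tmp/k9s-mpen.log | tail -n 50
```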

@jccmelo

jccmelo commented Jul 13, 2022

After upgrading k9s (brew upgrade k9s) to the following version:

Version:    v0.25.21
Commit:     14862f3709dc8dda10f749b6415fce0178111a6d
Date:       2022-06-30T18:05:14Z

I was getting both failed to connect to context and a warning Kubeconfig user entry is using deprecated API version client.authentication.k8s.io/v1alpha1. Run 'aws eks update-kubeconfig' to update.

I have fixed them (on macOS) by using v1beta1 instead:

# switch to v1beta1 version
sed -i .bak -e 's/v1alpha1/v1beta1/' ~/.kube/config
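Before or after an edit like this, it can be useful to see which exec-plugin API versions the kubeconfig actually references. A sketch using grep; exec_api_versions is a hypothetical helper name:

```shell
# Hypothetical helper: list the distinct client.authentication.k8s.io API
# versions mentioned anywhere in the file given as $1.
exec_api_versions() {
  grep -o 'client\.authentication\.k8s\.io/v1[a-z0-9]*' "$1" | sort -u
}

# Usage (not run here):
#   exec_api_versions ~/.kube/config
```

Note that the sed invocation above uses the BSD/macOS form of -i; GNU sed expects the suffix attached, as in -i.bak.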

@kurtextrem

I have the same problem as @jccmelo and downgrading k9s via brew is not a trivial task at all. However, unlike for him, changing v1alpha1 to v1beta1 did not help.

@slimus
Collaborator

slimus commented Jul 14, 2022

@kurtextrem please share your versions (k9s, kubectl, cluster, etc)

@kurtextrem

In my case it was an outdated aws cli that caused the error (kube config did not update in that case). So if anyone is coming to this issue from Google, this might be another thing to take a look at.

@ku524

ku524 commented Jul 19, 2022

My problem was the same as @kurtextrem's.

This is k9s log

5:46PM INF 🐶 K9s starting up...
5:46PM WRN Unable to dial discovery API error="exec plugin: invalid apiVersion \"client.authentication.k8s.io/v1alpha1\""
5:46PM ERR Fail to locate metrics-server error="exec plugin: invalid apiVersion \"client.authentication.k8s.io/v1alpha1\""
5:46PM ERR failed to connect to cluster error="exec plugin: invalid apiVersion \"client.authentication.k8s.io/v1alpha1\""

It seems that the new version doesn't allow apiVersion v1alpha1. So I changed the apiVersion in the kubeconfig to v1beta1, but it still didn't work.
After I upgraded my aws-cli to the latest version, it worked well.
So this problem for EKS seems to be a combination of k9s deprecating v1alpha1 and the old aws-cli not supporting v1beta1.
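Following the hint in the warning message quoted earlier in this thread, regenerating the kubeconfig entry with an up-to-date AWS CLI can be sketched as below. The function name, cluster name and region are placeholders:

```shell
# Hypothetical helper: regenerate the kubeconfig entry for one EKS cluster
# using whichever AWS CLI is installed.
# $1 = cluster name, $2 = region (both placeholders).
refresh_eks_kubeconfig() {
  aws --version >&2    # show which CLI version is doing the rewrite
  aws eks update-kubeconfig --name "$1" --region "$2"
}

# Usage (not run here):
#   refresh_eks_kubeconfig my-cluster us-west-2
```

Printing the CLI version first makes it easier to spot an outdated install, which is what the two comments above ran into.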

@jbreiding

I just had a similar problem: with KUBECONFIG set to multiple kubeconfig paths, kubectl worked, but k9s failed for all contexts but one.

It turns out I had a collision on the user name across my configs, and the errors in the k9s logs did not make it obvious that it was an auth failure.

Fixing that collision fixed my issue.

@allenbroad

For those who are experiencing this problem with Google GKE clusters after updating to use the gke-gcloud-auth-plugin, I found the solution to the problem. It looks like the kube config information has to change for k9s to be able to connect to the contexts properly. For some reason kubectl works just fine but k9s couldn't talk to the cluster.

This is the change I made and my context was able to work again - edit (on Mac) ~/.kube/config

users:
- name: [GKE_CLUSTER_NAME] # Needs to match cluster name from "clusters" section in yaml
  # THIS WORKS
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args: null
      command: gke-gcloud-auth-plugin
      env: null
      installHint: Install gke-gcloud-auth-plugin for use with kubectl by following
        https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
      interactiveMode: IfAvailable
      provideClusterInfo: true

  # OLD FORMAT PRIOR TO gke-gcloud-auth-plugin - NO LONGER WORKS IN K9S
  # user:
  #   auth-provider:
  #     config:
  #       access-token: !QTQGFAKFADFK@#Q$L#@KJ
  #       cmd-args: config config-helper --format=json
  #       cmd-path: /Users/username/google-cloud-sdk/bin/gcloud
  #       expiry: "2023-02-03T14:45:31Z"
  #       expiry-key: '{.credential.token_expiry}'
  #       token-key: '{.credential.access_token}'
  #     name: gcp

As you can see above, the new gke-gcloud-auth-plugin approach is much more generic than it used to be. If you have multiple contexts you should be able to just copy and paste the user: block above and put it under each one of the - name: cluster configs.

This got my k9s back into a good state for me.

@tgulick

tgulick commented Apr 15, 2023

Same issue "can't connect" to GKE clusters. New install on M1 Mac (13.2.1). k9s 0.27.3.
My .kube/config file has the new gke-gcloud-auth-plugin format shown above.

kubectl commands work OK. I have 2 Linux machines that connect with no problem.

This MBP is my "corporate" Mac and has the Zscaler security tool installed. It's basically a MITM that inspects SSL traffic and is known to cause issues with certs, etc. I did try connecting with the "insecure-tls" option, but it didn't make a difference. My Linux machines do not have Zscaler installed.

4:44PM INF 🐶 K9s starting up...
4:44PM WRN Unable to dial discovery API error="invalid configuration: authProvider cannot be provided in combination with an exec plugin for *******************"
4:44PM ERR Fail to locate metrics-server error="invalid configuration: authProvider cannot be provided in combination with an exec plugin for ******************"
4:44PM ERR failed to connect to cluster "gke_ss-s2-core-p_us-central1-a_uat-01" error="invalid configuration: authProvider cannot be provided in combination with an exec plugin for *********************"
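The error text suggests that a user entry ends up with both an auth-provider and an exec block. One way to check a single kubeconfig file for that is a line-based scan. This is a heuristic sketch, not a YAML parser: it assumes the layout kubectl writes (users: at the top level, each entry starting with "- name:"), and users_with_both_auth_methods is a made-up name:

```shell
# Hypothetical helper: print the names of user entries in the file $1 that
# contain both an "auth-provider:" key and an "exec:" key.
users_with_both_auth_methods() {
  awk '
    function report() { if (name != "" && has_ap && has_exec) print name }
    /^users:/                       { in_users = 1; next }
    /^[^ #-]/                       { report(); name = ""; in_users = 0 }  # other top-level key
    in_users && /^- name:/          { report(); name = $3; has_ap = 0; has_exec = 0; next }
    in_users && /^ +auth-provider:/ { has_ap = 1 }
    in_users && /^ +exec:/          { has_exec = 1 }
    END                             { report() }
  ' "$1"
}

# Usage (not run here):
#   users_with_both_auth_methods ~/.kube/config
```

An empty result only means this particular file looks clean; it says nothing about entries that are combined from several kubeconfig files.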

@tgulick

tgulick commented Apr 18, 2023

The above issue is specific to one GKE cluster. I can connect to other clusters from my Mac without issues.

@anhptvolga

anhptvolga commented May 19, 2023

I had the same problem. In my case, I use multiple kubeconfig files. My mistake was that I used the same user name in each of them:

users:
- name: kubernetes-admin
  user:
    client-certificate-data: ...
    client-key-data: ...

Therefore k9s can't pick the correct cert to connect to the cluster.

@derailed
Owner

Looks like this issue stemmed from the k8s lib update and the changes to vendor auth. I think this is a dup of #1619.
Please reopen if this is not the case.
