This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Sync fails due to metrics.k8s.io discovery error #1991

Closed
hamid2013 opened this issue Apr 29, 2019 · 5 comments · Fixed by #2009
Labels
  • blocked-needs-validation: Issue is waiting to be validated before we can proceed
  • bug
  • onboarding/activation: Particular pertinence to getting Flux up and running

Comments

@hamid2013

hamid2013 commented Apr 29, 2019

Hello All,

While following the steps, I am getting the error below in Flux and am unable to run any workloads with it:

ts=2019-04-29T04:06:36.714622818Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-29T04:06:43.293711087Z caller=loop.go:103 component=sync-loop event=refreshed url=git@github.com:xxxxxxx/xxxxxx branch=master HEAD=c38943df8dde8b4ffda98a050e6f643ec712ccc3
ts=2019-04-29T04:06:43.528694435Z caller=main.go:199 type="internal kubernetes error" err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-29T04:06:43.528789635Z caller=loop.go:210 component=sync-loop err="collating resources in cluster for sync: not found"
ts=2019-04-29T04:06:43.531861734Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: not found"
ts=2019-04-29T04:11:36.714944593Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-29T04:11:36.818370548Z caller=main.go:199 type="internal kubernetes error" err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-29T04:11:44.168625836Z caller=loop.go:210 component=sync-loop err="collating resources in cluster for sync: not found"
ts=2019-04-29T04:11:44.171171938Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: not found"
ts=2019-04-29T04:11:44.172498339Z caller=loop.go:103 component=sync-loop event=refreshed url=git@github.com:xxxxxx/xxxxxxxx branch=master HEAD=c38943df8dde8b4ffda98a050e6f643ec712ccc3
ts=2019-04-29T04:16:36.825719462Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-29T04:16:36.939251772Z caller=main.go:199 type="internal kubernetes error" err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-29T04:16:45.041430658Z caller=loop.go:210 component=sync-loop err="collating resources in cluster for sync: not found"
ts=2019-04-29T04:16:45.04355066Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: not found"

Any help will be appreciated.
@lkosz

lkosz commented Apr 30, 2019

The same problem with:

  • GKE, Kubernetes v1.12.7-gke.10, on a clean, freshly deployed cluster
  • Helm v2.12.3
  • Flux 1.12.1, chart v0.9.2

Logs from flux pod:

ts=2019-04-30T15:14:14.388856059Z caller=main.go:192 version=1.12.1
ts=2019-04-30T15:14:14.414804737Z caller=main.go:346 component=cluster identity=/etc/fluxd/ssh/identity
ts=2019-04-30T15:14:14.414853262Z caller=main.go:347 component=cluster identity.pub="ssh-rsa ..." 
ts=2019-04-30T15:14:14.414886312Z caller=main.go:348 component=cluster host=https://10.2.0.1:443 version=kubernetes-v1.12.7-gke.10
ts=2019-04-30T15:14:14.414944773Z caller=main.go:360 component=cluster kubectl=/usr/local/bin/kubectl
ts=2019-04-30T15:14:14.416181456Z caller=main.go:371 component=cluster ping=true
ts=2019-04-30T15:14:14.424369715Z caller=main.go:504 url=git@....git user="Weave Flux" email=support@weave.works signing-key= sync-tag=flux-sync notes-ref=flux set-author=false
ts=2019-04-30T15:14:14.424441003Z caller=main.go:561 upstream="no upstream URL given"
ts=2019-04-30T15:14:14.425109781Z caller=main.go:582 addr=:3030
ts=2019-04-30T15:14:14.425400867Z caller=loop.go:90 component=sync-loop err="git repo not ready: git repo has not been cloned yet"
ts=2019-04-30T15:14:14.42543519Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:14:14.425447813Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:14:14.99186012Z caller=checkpoint.go:24 component=checkpoint msg="up to date" latest=1.12.1
ts=2019-04-30T15:14:15.699468879Z caller=warming.go:198 component=warmer info="refreshing image" image=docker.io/weaveworks/helm-operator tag_count=15 to_update=15 of_which_refresh=0 of_which_missing=15
ts=2019-04-30T15:14:16.520717253Z caller=warming.go:206 component=warmer updated=docker.io/weaveworks/helm-operator successful=15 attempted=15
ts=2019-04-30T15:14:16.521310683Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:14:16.521344059Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:14:17.300027383Z caller=warming.go:198 component=warmer info="refreshing image" image=memcached tag_count=68 to_update=68 of_which_refresh=0 of_which_missing=68
ts=2019-04-30T15:14:19.367301081Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:14:19.632424736Z caller=warming.go:206 component=warmer updated=memcached successful=68 attempted=68
ts=2019-04-30T15:14:19.721616109Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:14:19.721627185Z caller=memcache.go:199 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:14:19.723726274Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:14:19.723736762Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:14:19.741070634Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:14:19.74116762Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:14:19.81937953Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:14:21.022331691Z caller=warming.go:198 component=warmer info="refreshing image" image=docker.io/weaveworks/flux tag_count=29 to_update=29 of_which_refresh=0 of_which_missing=29
ts=2019-04-30T15:14:22.406835225Z caller=warming.go:206 component=warmer updated=docker.io/weaveworks/flux successful=29 attempted=29
ts=2019-04-30T15:14:22.407079934Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:14:22.491887236Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:14:22.816378428Z caller=warming.go:198 component=warmer info="refreshing image" image=gcr.io/stackdriver-agents/stackdriver-logging-agent tag_count=43 to_update=43 of_which_refresh=0 of_which_missing=43
ts=2019-04-30T15:14:23.990321458Z caller=warming.go:206 component=warmer updated=gcr.io/stackdriver-agents/stackdriver-logging-agent successful=43 attempted=43
ts=2019-04-30T15:14:23.990709606Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:14:24.071599116Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:14:24.466000245Z caller=warming.go:198 component=warmer info="refreshing image" image=gcr.io/kubernetes-helm/tiller tag_count=69 to_update=69 of_which_refresh=0 of_which_missing=69
ts=2019-04-30T15:14:25.848811677Z caller=warming.go:206 component=warmer updated=gcr.io/kubernetes-helm/tiller successful=69 attempted=69
ts=2019-04-30T15:14:25.849972253Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:14:25.924965686Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:14:49.48935677Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:14:49.817122766Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:14:49.817139134Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:14:49.836981413Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:15:20.25409899Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:15:20.254113633Z caller=memcache.go:199 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:15:20.25589743Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:15:20.255907374Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:15:20.274468081Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:15:20.276593296Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:15:25.925306668Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:15:26.006435102Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:15:51.06012156Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:15:51.06013407Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:15:51.078705207Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:15:51.08636588Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:16:21.752093002Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:16:21.752106478Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:16:21.768521245Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:16:21.770565476Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:16:26.006781485Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:16:26.089713607Z caller=images.go:28 component=sync-loop msg="no automated workloads"
ts=2019-04-30T15:16:52.873453573Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:16:52.873468656Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:16:52.890825119Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:16:52.893905855Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:17:23.504796994Z caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:17:23.504810062Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:17:23.520032947Z caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
ts=2019-04-30T15:17:23.522067982Z caller=loop.go:103 component=sync-loop event=refreshed url=git@....git branch=master HEAD=29c7067c2a2ec05ffc663e2e040a6984518bf4a3
ts=2019-04-30T15:17:26.090019339Z caller=images.go:18 component=sync-loop msg="polling images"
ts=2019-04-30T15:17:26.171956062Z caller=images.go:28 component=sync-loop msg="no automated workloads"

@hiddeco hiddeco added the blocked-needs-validation and bug labels Apr 30, 2019
@hiddeco
Member

hiddeco commented Apr 30, 2019

Thanks to both of you for your reports. We will look into this as soon as possible.

@hamid2013 can you share some details about your environment?

@hiddeco hiddeco added the onboarding/activation label Apr 30, 2019
@ttarczynski
Contributor

This seems a bit similar to #1951, as the two error messages in the logs from @lkosz are:

caller=main.go:206 type="internal kubernetes error" ts=2019-04-30T15:16:52.873468656Z caller=memcache.go:111 err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
caller=loop.go:90 component=sync-loop err="collating resources in cluster for sync: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"

@2opremio
Contributor

This is similar to #1951 (which is fixed in Flux 1.12.1). However, in this case Flux is not silencing the error ("the server is currently unable to handle the request" when trying to get the resources for metrics.k8s.io/v1beta1) since it's legitimate.

After a quick search (see kubernetes-sigs/metrics-server#157 and kubernetes-sigs/prometheus-adapter#66, for instance), this seems to be a misconfiguration of the metrics addon (or something else leveraging the metrics API) in your clusters.

To fix this, Flux could potentially avoid getting the resources for *metrics.k8s.io/* (I have never used metrics, but I don't think it makes any sense to sync them with Flux) but for now I think it's best if you try to address the metrics API problem in your clusters.
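To make the idea above concrete, here is a minimal sketch of "tolerate discovery failures for groups we never sync". It is written in Python for brevity and is not Flux's code or necessarily what an eventual fix would look like; `PartialDiscoveryError`, `IGNORABLE_GROUPS` and `tolerate_ignorable_failures` are hypothetical names invented for this illustration.

```python
class PartialDiscoveryError(Exception):
    """Hypothetical stand-in for a partial API discovery failure.

    `failed` maps "group/version" strings to the error reported for them.
    """

    def __init__(self, failed):
        super().__init__("discovery failed for: " + ", ".join(sorted(failed)))
        self.failed = dict(failed)


# Assumption for this sketch: resources in these API groups are never
# synced from git, so failing to list them should not block a sync.
IGNORABLE_GROUPS = {"metrics.k8s.io"}


def tolerate_ignorable_failures(resources, err):
    """Return `resources` if every failed group is ignorable, else re-raise.

    `err` is None when discovery fully succeeded.
    """
    if err is None:
        return resources
    blocking = {
        group_version: message
        for group_version, message in err.failed.items()
        if group_version.split("/")[0] not in IGNORABLE_GROUPS
    }
    if blocking:
        # Some other API group is unavailable too: keep failing loudly.
        raise PartialDiscoveryError(blocking)
    return resources
```

The point of the design is that only failures in explicitly listed groups are swallowed; an outage of any other API group still surfaces as an error, so a legitimate cluster problem is not hidden.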

Since this has come up before, it would help us to understand how this happened. @hamid2013 @lkosz could you answer the following questions?

  1. @hamid2013 What Kubernetes version are you using and how did you deploy it?
  2. @hamid2013 @lkosz Could you show me the output of kubectl get --raw /apis/metrics.k8s.io/v1beta1? (I would expect something like "server is currently unable to handle the request" as output.)
  3. @hamid2013 @lkosz Are you knowingly using the metrics API? Did you modify the metrics API configuration in any way? Are you running the Prometheus adapter or similar?
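As a complement to question 2, the registration status of the aggregated API can also be inspected. The sketch below is only an illustration, assuming the metrics API is registered as the APIService `v1beta1.metrics.k8s.io` and that its JSON was obtained with `kubectl get apiservice v1beta1.metrics.k8s.io -o json`; the helper name `apiservice_availability` is made up here.

```python
import json


def apiservice_availability(apiservice_json):
    """Return (available, reason, message) from an APIService object's JSON.

    Reads the condition of type "Available" under status.conditions.
    """
    obj = json.loads(apiservice_json)
    for cond in obj.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            return (
                cond.get("status") == "True",
                cond.get("reason", ""),
                cond.get("message", ""),
            )
    # No Available condition reported: treat the API as not available.
    return (False, "", "no Available condition found")
```

If the first element is False, the reason and message fields usually say why the API server cannot reach the backing service, which may help narrow down whether the metrics addon is misconfigured or simply not ready yet.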

@ttarczynski
Contributor

And one more hint.
There was another bug related to GKE, #1855, which also mentioned "couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request".
And there was a hint that it might depend on timing (e.g. deploying both the GKE cluster and Flux within a single Terraform run): #1855 (comment)
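If timing is indeed a factor, one possible workaround is to wait for the metrics API to respond before deploying Flux. The following is a rough sketch under that assumption, not something suggested in either issue; `wait_until_available` and its parameters are hypothetical, and `probe` is any caller-supplied function that returns True once the API answers (for example, one that runs the `kubectl get --raw` command above and checks its exit code).

```python
import time


def wait_until_available(probe, attempts=10, delay_seconds=5.0, sleep=time.sleep):
    """Call `probe()` until it returns True or the attempts run out.

    Returns the 1-based attempt number that succeeded, or None if the
    probe never succeeded. `sleep` is injectable to ease testing.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            # Only pause between attempts, not after the last one.
            sleep(delay_seconds)
    return None
```

A fixed delay keeps the sketch simple; a real provisioning pipeline might prefer a backoff or an overall timeout instead.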

@2opremio 2opremio changed the title from "Not able to run the flux successfully" to "Sync fails due to metrics.k8s.io discovery error" May 1, 2019
5 participants