Digital Ocean implementation of K8s does not allow metrics-server to function #150

Closed
benjamin-maynard opened this issue Nov 27, 2018 · 49 comments


@benjamin-maynard

benjamin-maynard commented Nov 27, 2018

Hello,

I raised an issue for this last night, but now that I've had more sleep I wanted to raise it again and provide some more information.

I currently have a Digital Ocean Managed Kubernetes Cluster. I have some applications deployed and running on it.

I have configured Horizontal Pod Autoscaling for one of my deployments, but when running the kubectl get hpa command I noticed the following in my output (in the TARGETS column):

NAME                              REFERENCE                                    TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
benjamin-maynard-io-fe            Deployment/benjamin-maynard-io-fe            <unknown>/80%   1         20        3          10h
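For reference, a minimal HPA manifest matching those numbers would look roughly like the sketch below (the actual spec isn't included in this issue, so treat it as illustrative only):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: benjamin-maynard-io-fe
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: benjamin-maynard-io-fe
  minReplicas: 1
  maxReplicas: 20
  targetCPUUtilizationPercentage: 80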

I identified that this was because I did not have either heapster or metrics-server running on my cluster, so I went to install metrics-server as per the instructions at https://github.com/kubernetes-incubator/metrics-server
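At the time, that README pointed at the repository's 1.8+ deployment manifests, so the install was roughly the following (a sketch; directory name per the "1.8+ manifests" referenced later in this thread):

git clone https://github.com/kubernetes-incubator/metrics-server.git
cd metrics-server
kubectl apply -f deploy/1.8+/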

metrics-server successfully installs, and is running in the kube-system namespace:

NAME                                READY   STATUS    RESTARTS   AGE
csi-do-controller-0                 3/3     Running   0          42h
csi-do-node-dbvg5                   2/2     Running   0          42h
csi-do-node-lq97x                   2/2     Running   1          42h
csi-do-node-mvnrw                   2/2     Running   0          42h
kube-dns-55cf9576c4-4r466           3/3     Running   0          42h
kube-proxy-upbeat-lichterman-3mz4   1/1     Running   0          42h
kube-proxy-upbeat-lichterman-3mzh   1/1     Running   0          42h
kube-proxy-upbeat-lichterman-3mzi   1/1     Running   0          42h
metrics-server-7fbd9b8589-64x86     1/1     Running   0          9m48s

However, I am still getting no metrics.

Running kubectl get apiservice v1beta1.metrics.k8s.io -o yaml reveals:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: 2018-11-27T08:32:26Z
  name: v1beta1.metrics.k8s.io
  resourceVersion: "396557"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: f88f3576-f21e-11e8-8aed-fab39051e242
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: 2018-11-27T08:32:26Z
    message: 'no response from https://10.245.219.253:443: Get https://10.245.219.253:443:
      net/http: request canceled while waiting for connection (Client.Timeout exceeded
      while awaiting headers)'
    reason: FailedDiscoveryCheck
    status: "False"
    type: Available

The latter part of the output is the interesting bit: message: 'no response from https://10.245.219.253:443: Get https://10.245.219.253:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)'

I believe the above error message means that the kube-apiserver cannot speak to the metrics-server service. I believe this is due to the specifics of how the Digital Ocean Kubernetes master works.
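One way to narrow that down (a hedged extra check, not part of the original report; the pod name curl-debug is arbitrary) is to confirm the Service is reachable from inside the cluster. If it is, the problem is the master-to-cluster path rather than the pod network:

kubectl run curl-debug --rm -it --restart=Never --image=curlimages/curl -n kube-system -- curl -vk https://metrics-server.kube-system.svc:443/healthz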

I've performed some other general validation:

Service is configured:

Benjamins-MacBook-Pro:metrics-server benmaynard$ kubectl get service --namespace=kube-system
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
kube-dns         ClusterIP   10.245.0.10      <none>        53/UDP,53/TCP   2d19h
metrics-server   ClusterIP   10.245.219.253   <none>        443/TCP         14m

metrics-server is up and running:

Benjamins-MacBook-Pro:metrics-server benmaynard$ kubectl logs metrics-server-7fbd9b8589-64x86 --namespace=kube-system
I1127 08:32:30.665197       1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
[restful] 2018/11/27 08:32:32 log.go:33: [restful/swagger] listing is available at https://:443/swaggerapi
[restful] 2018/11/27 08:32:32 log.go:33: [restful/swagger] https://:443/swaggerui/ is mapped to folder /swagger-ui/
I1127 08:32:32.981732       1 serve.go:96] Serving securely on [::]:443

Another customer has reported similar things: https://www.digitalocean.com/community/questions/cannot-get-kubernetes-horizonal-pod-autoscaler-or-metrics-server-working

@benjamin-maynard
Author

Additional troubleshooting:

sudo kubectl port-forward metrics-server-7fbd9b8589-64x86 443 --namespace=kube-system

[screenshot: metrics-server responding on the port-forwarded connection]

Looks like metrics-server is functioning fine ^
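Without the screenshot: the equivalent check is to curl the forwarded port from another terminal while the port-forward is running; any HTTP response at all (even a 401/403) shows the pod itself is serving:

curl -vk https://localhost:443/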

Endpoints & Pod IPs below:

Benjamins-MacBook-Pro:Terraform benmaynard$ kubectl get endpoints --namespace=kube-system
NAME                      ENDPOINTS                       AGE
kube-controller-manager   <none>                          2d20h
kube-dns                  10.244.18.7:53,10.244.18.7:53   2d20h
kube-scheduler            <none>                          2d20h
metrics-server            10.244.18.3:443                 61m
Benjamins-MacBook-Pro:Terraform benmaynard$ kubectl describe pod metrics-server-7fbd9b8589-64x86 --namespace=kube-system
Name:               metrics-server-7fbd9b8589-64x86
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               upbeat-lichterman-3mzi/10.131.95.94
Start Time:         Tue, 27 Nov 2018 08:32:27 +0000
Labels:             k8s-app=metrics-server
                    pod-template-hash=7fbd9b8589
Annotations:        <none>
Status:             Running
IP:                 10.244.18.3
Controlled By:      ReplicaSet/metrics-server-7fbd9b8589
Containers:
  metrics-server:
    Container ID:  docker://f6a9a48851ec7201cf79eea23995e12e4db567f14a1d9341a65ca9cca1038cd4
    Image:         k8s.gcr.io/metrics-server-amd64:v0.3.1
    Image ID:      docker-pullable://gcr.io/google_containers/metrics-server-amd64@sha256:78938f933822856f443e6827fe5b37d6cc2f74ae888ac8b33d06fdbe5f8c658b
    Port:          <none>
    Host Port:     <none>
    Command:
      /metrics-server
      --kubelet-insecure-tls
      --kubelet-preferred-address-types=InternalIP
    State:          Running
      Started:      Tue, 27 Nov 2018 08:32:29 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-59tfj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  tmp-dir:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  metrics-server-token-59tfj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-server-token-59tfj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

@benjamin-maynard
Author

Hey @andrewsykim - wondering if you'd be able to let me know whether the above is a Digital Ocean-specific issue, or whether I am doing something wrong. If it is just me I will go away and figure out why, but I think it might be a Digital Ocean thing.

Thanks a lot

@andrewsykim
Contributor

Hey @benjamin-maynard, sorry for the delay! Let me dig further and get back to you :)

@benjamin-maynard
Author

Hey @andrewsykim - Was just wondering if you managed to dig anything up on this? I've performed loads of debugging since and can't see anything wrong with my implementation.

@cbenhagen

Looks like the --readonly-port=10255 option is not set.

Linking another related question: https://www.digitalocean.com/community/questions/cannot-install-heapster-to-cluster-due-to-kubelets-not-allowing-to-get-metrics-on-port-10255
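For context, the read-only port is a kubelet setting, which isn't user-configurable on managed DOKS nodes. On a self-managed kubelet the relevant knob would be a fragment like this (a sketch only, not something you can apply on DOKS):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
readOnlyPort: 10255  # 0 disables the read-only port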

@benjamin-maynard
Author

@cbenhagen Looks like there are potentially a couple of issues at play!

I tried to get an update from Digital Ocean Support too; they said they've managed to replicate it in some instances, but have also gotten it to work in others. Sadly no ETA 😢.

I've asked them how they managed to get it to work, as it's a pretty important component to be missing. Hopefully @andrewsykim can help too. Don't want to move to GKE!

@andrewsykim
Contributor

Sorry folks, we're working through this; it's been a bit busy with KubeCon coming up. Hoping to have more details soon (by next week, maybe?) :)

@cbenhagen

@andrewsykim can you reproduce the issue?

@LewisTheobald

We've also run into the same problem:

unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)

The logs from metrics-server:

I1217 15:58:20.863236       1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
[restful] 2018/12/17 15:58:21 log.go:33: [restful/swagger] listing is available at https://:443/swaggerapi
[restful] 2018/12/17 15:58:21 log.go:33: [restful/swagger] https://:443/swaggerui/ is mapped to folder /swagger-ui/
I1217 15:58:21.258099       1 serve.go:96] Serving securely on [::]:443
E1217 16:15:21.349397       1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:admiring-khayyam-3sat: [unable to get CPU for container "app" in pod default/app-577554689b-dpch6 on node "10.133.45.98", discarding data: missing cpu usage metric, unable to get CPU for container "app" in pod default/app-577554689b-lqbk5 on node "10.133.45.98", discarding data: missing cpu usage metric], unable to fully scrape metrics from source kubelet_summary:admiring-khayyam-3sal: unable to get CPU for container "app" in pod default/app-577554689b-pdmrl on node "10.133.82.219", discarding data: missing cpu usage metric]
E1217 20:45:40.605436       1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:admiring-khayyam-3sat: unable to fetch metrics from Kubelet admiring-khayyam-3sat (10.133.45.98): request failed - "401 Unauthorized", response: "Unauthorized"

This is with the flags recommended in some other reports online, --kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP, but the result was the same.

Lewiss-iMac:www lewis$ kubectl describe pod metrics-server-7fbd9b8589-dg548 --namespace=kube-system
Name:               metrics-server-7fbd9b8589-dg548
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               sleepy-liskov-3sgb/10.133.82.251
Start Time:         Mon, 17 Dec 2018 15:58:12 +0000
Labels:             k8s-app=metrics-server
                    pod-template-hash=7fbd9b8589
Annotations:        <none>
Status:             Running
IP:                 10.244.37.6
Controlled By:      ReplicaSet/metrics-server-7fbd9b8589
Containers:
  metrics-server:
    Container ID:  docker://a0d6e215f5cac2a4f4c0309a4f4fd32e37099587736a9e7c1ccf1bd7b6751875
    Image:         k8s.gcr.io/metrics-server-amd64:v0.3.1
    Image ID:      docker-pullable://k8s.gcr.io/metrics-server-amd64@sha256:78938f933822856f443e6827fe5b37d6cc2f74ae888ac8b33d06fdbe5f8c658b
    Port:          <none>
    Host Port:     <none>
    Command:
      /metrics-server
      --kubelet-insecure-tls
      --kubelet-preferred-address-types=InternalIP
    State:          Running
      Started:      Mon, 17 Dec 2018 15:58:19 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-nk2st (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  tmp-dir:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  metrics-server-token-nk2st:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-server-token-nk2st
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

@cbenhagen

The following works for me now:
helm install --name metrics stable/metrics-server --namespace kube-system -f values.yaml

values.yaml:

args:
  - --logtostderr
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls
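Once the release is up, the check from earlier in the thread should flip to healthy:

kubectl get apiservice v1beta1.metrics.k8s.io
kubectl top nodes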

@dwdraju

dwdraju commented Dec 29, 2018

Adding these flags to the metrics-server Deployment manifest works.

      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --logtostderr
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls

        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
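After applying the edited manifest (the file name below is just an example), a quick way to verify:

kubectl apply -f metrics-server-deployment.yaml
kubectl -n kube-system rollout status deploy/metrics-server
kubectl top nodes   # should return data after a minute or two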

@cherednichenkoa

Hello everyone,

I am having the same issue with Metricbeat (part of the ELK stack):

`error making http request: Get http://localhost:10255/stats/summary: dial tcp 127.0.0.1:10255: getsockopt: connection refused`

Has anyone got a clear answer from DO about this?

@clenn

clenn commented Jan 7, 2019

Adding these flags to the metrics-server Deployment manifest works.

      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --logtostderr
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls

        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp

kubectl get apiservice v1beta1.metrics.k8s.io -o yaml still gives me the same type of error as @benjamin-maynard:

no response from https://10.245.81.14:443: Get https://10.245.81.14:443:
      net/http: request canceled while waiting for connection (Client.Timeout exceeded
      while awaiting headers)

@andrewsykim any updates?

@timoreimann
Contributor

timoreimann commented Jan 7, 2019

I just ran tests with all versions of DOKS currently supported (1.11.5-do.2, 1.12.3-do.2, and 1.13.1-do.2 as of this writing). In each case, I was able to read metrics properly.

Here's what I did (mostly summarizing what was mentioned before in this issue):

  1. Apply the 1.8+ manifests of metrics-server, adding the extra commands suggested by @dwdraju above
  2. Run an HPA-managed deployment of Nginx like this: kubectl run --image=nginx nginx --requests=cpu=200m && kubectl autoscale deploy nginx --min=1 --max=10 --cpu-percent=80 (please take note that you need to specify CPU requests for HPA to take action)
  3. Wait for metrics to be collected and finally see the target numbers in kubectl get hpa nginx populated (or kubectl top node for a more basic check unrelated to HPA)
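The commands from steps 2 and 3, collected for convenience:

kubectl run --image=nginx nginx --requests=cpu=200m
kubectl autoscale deploy nginx --min=1 --max=10 --cpu-percent=80

# after waiting a minute or two:
kubectl get hpa nginx
kubectl top node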

The waiting part in the last step is important: It takes 1-2 minutes for metrics to show up.

FWIW, I created my Kubernetes clusters in the FRA1 region.

Is anyone not able to reproduce a successful setup with my steps outlined above?

@clenn

clenn commented Jan 8, 2019

@timoreimann I was running 1.12.3-do.1; I've upgraded to 1.13.1-do.2 and now it works as expected.
Thanks

@LewisTheobald

@timoreimann I was running 1.12.3-do.1; I've upgraded to 1.13.1-do.2 and now it works as expected.
Thanks

I can confirm this as well. Deleted my original problematic cluster and re-created from scratch. Metrics are working as they should (with the modification to the metrics-server commands).

@timoreimann
Contributor

Thanks for the feedback, appreciated. 👍

@andrewsykim looks like we can close this issue.

@andrewsykim
Contributor

awesome, thanks everyone!

@arussellsaw

Hey @andrewsykim, I'm running 1.13.1-do.2 and seeing this issue for Prometheus scraping kubelet metrics. Should I open a new issue?

@LewisTheobald

Hey @arussellsaw we're also experiencing the same problem with Prometheus. We abandoned setting up Prometheus temporarily until the issue is resolved.

The issue came from the kubelet config: Prometheus requires the authorization mode to be Webhook, as mentioned here.

"authorization": {
  "mode": "AlwaysAllow",
  "webhook": {
    "cacheAuthorizedTTL": "5m0s",
    "cacheUnauthorizedTTL": "30s"
  }
}

The error we received was dial tcp 10255: connect: connection refused. When switching the kube-server kubelet controller to http-metrics, it instead returned: server returned HTTP status 401 Unauthorized.
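For reference, the Webhook variant that Prometheus expects looks like the fragment below; this is kubelet configuration, which DOKS users cannot change themselves, so treat it as a sketch of what's missing rather than a workaround:

"authorization": {
  "mode": "Webhook",
  "webhook": {
    "cacheAuthorizedTTL": "5m0s",
    "cacheUnauthorizedTTL": "30s"
  }
}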

@andrewsykim
Contributor

cc @fatih @nanzhong

@timoreimann
Contributor

@arussellsaw @LewisTheobald have you tried the steps I outlined/pointed at above? They have proven to work, at least for me.

@arussellsaw

@timoreimann I get a healthy output from kubectl get apiservice v1beta1.metrics.k8s.io, but no metrics in the dashboard, and Prometheus is still failing to scrape kubelet metrics.

@arussellsaw

@timoreimann OK, so it's just the prometheus-operator-kubelet metrics that aren't working now. The error response I get is Get http://10.131.57.106:10255/metrics/cadvisor: dial tcp 10.131.57.106:10255: connect: connection refused

@zedtux

zedtux commented Feb 11, 2019

Facing exactly the same issue.

From the Prometheus targets page (http://localhost:9090/targets) I can see that everything is green except for monitoring/prometheus-operator-kubelet/0 and monitoring/prometheus-operator-kubelet/1, which fail with the error:

Get http://<node private IP address>:10255/metrics: dial tcp <node private IP address>:10255: connect: connection refused

The monitoring/prometheus-operator-node-exporter/0 target is also red, with context deadline exceeded when fetching data from http://<node private ip address>:9100/metrics on 2 of the nodes; the third one is green.

@orzarchi

Experiencing the exact same issue

@arussellsaw

@andrewsykim should we reopen this issue, or perhaps create a new one?

@timoreimann
Contributor

@arussellsaw Andrew is not working on DO's CCM anymore, I'm taking over.

I have a hunch it's not related to CCM (anymore) but rather a configuration or DOKS issue, so I'd say let's refrain from opening a new issue for now. I will keep this on my radar and ask internally what the current status of the matter is; I'll circle back as soon as I have an answer.

Thanks.

@zedtux

zedtux commented Feb 18, 2019

Would whoever opens the new issue mind linking it back to this one, please?

(Well... what's the point of opening a new issue to report the same issue? 🤷‍♂️)

@benetis

benetis commented Feb 26, 2019

Experiencing the same issue, can we reopen this? (prometheus metrics)

@m3dwards

m3dwards commented Mar 6, 2019

I just ran tests with all versions of DOKS currently supported (1.11.5-do.2, 1.12.3-do.2, and 1.13.1-do.2 as of this writing). In each case, I was able to read metrics properly.

Here's what I did (mostly summarizing what was mentioned before in this issue):

  1. Apply the 1.8+ manifests of metrics-server, adding the extra commands suggested by @dwdraju above
  2. Run an HPA-managed deployment of Nginx like this: kubectl run --image=nginx nginx --requests=cpu=200m && kubectl autoscale deploy nginx --min=1 --max=10 --cpu-percent=80 (please take note that you need to specify CPU requests for HPA to take action)
  3. Wait for metrics to be collected and finally see the target numbers in kubectl get hpa nginx populated (or kubectl top node for a more basic check unrelated to HPA)

The waiting part in the last step is important: It takes 1-2 minutes for metrics to show up.

FWIW, I created my Kubernetes clusters in the FRA1 region.

Is anyone not able to reproduce a successful setup with my steps outlined above?

@timoreimann can you explain why metrics-server needs to be run with the address type set to IP and in insecure mode? Insecure mode surely isn't something we want running in production clusters? I'm hazy on why metrics-server doesn't work out of the box.

@timoreimann
Contributor

@maxwedwards I agree with you that running in insecure mode is probably not a good idea for production loads. My goal at the time was to validate which parts of the metrics-server integration worked and which didn't. Previous DOKS image versions had issues that got fixed at some point, but it's possible there's still something missing that prevents metrics-server from functioning properly in a desirable configuration. I haven't had time to investigate the problem further myself.

However, I'm not convinced it is something that CCM is responsible for or can help fix. That's why I'm hesitant to reopen the issue and give the (false) impression that the CCM maintainers could work on a fix. (If anyone has a lead suggesting that CCM is, in fact, part of the problem or the solution, I'd be more than happy to hear and talk about it.) I encourage affected users to submit a question on DO's Q&A platform and/or file a support ticket with DO so that we can get more people to look at the problem.

@m3dwards

m3dwards commented Mar 7, 2019

@timoreimann my naive understanding of why it works is that DNS resolution of node names isn't working on the cluster. We switch to IP addressing to get around that, but the certificates on the cluster are set up with hostnames rather than IP addresses, so we then need to switch off TLS certificate checking.
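A way to sanity-check that hypothesis (hedged; the node name is just an example taken from output earlier in this thread, and the pod name dns-test is arbitrary) is to try resolving a node's hostname from inside the cluster:

kubectl run dns-test --rm -it --restart=Never --image=busybox -- nslookup upbeat-lichterman-3mzi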

Is DNS within the cluster not part of CCM?

Surely a cluster with basic CPU and memory metrics, set up correctly, has to be part of the minimum offering of a hosted k8s service? Imagine an Ubuntu image where top or ps doesn't work without adding insecure hacks. I want to run my business on this!

Not getting at you personally, I just think it shouldn't be glossed over.

@timoreimann
Contributor

@maxwedwards totally agree that basic metrics collection in a secure way is a must-have. Apologies if this came across as "we don't care about this"; we genuinely do. What I was trying to express is that CCM is likely not the place where the fix should (or maybe even can) happen: the project's primary purpose is to implement the cloud provider interface defined by upstream Kubernetes. I checked again but don't see a way to hook into host name resolution. We could presumably hack it into the project, but it might not be the right place to do so.

Regardless, I have filed an internal ticket to track progress on the matter and did some initial investigation myself that confirms your findings. Will keep you posted in this issue since it has become the go-to place for most users who ran into the problem.

By the way, we also have something else in the pipeline at DO to improve on the observability front, which we intend to release in the not-too-distant future. It only touches partially on the subject discussed here, though; proper metrics-server integration is still a prerequisite for features that build on top of it (like autoscaling).

Sorry again in case I have sent the wrong message. Appreciate the feedback!

@ssprasad100

ssprasad100 commented Mar 31, 2019

Here is the solution: you have to use --kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP to fix the connection-refused issue.

Here is my YAML for metrics-server (metrics-server-deployment.yaml):


apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --logtostderr
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp

@esin

esin commented May 13, 2019

@ssprasad100 hey! can you add your yaml as gist?

@Simwar

Simwar commented Jul 1, 2019

Hi @timoreimann

This issue has been closed, but I believe it is still not fixed, as the proposed solution is to NOT use TLS (--kubelet-insecure-tls).
Is there another issue we can track for this?

Thanks

@timoreimann
Contributor

@Simwar and others: we just opened a new repository to track more general feature requests and bug reports related to DOKS (i.e., ones not specific to any of our other repos, like this one). I created digitalocean/DOKS#2 to address the issue of metrics-server not being supported with TLS on DOKS.

Please continue discussions on the new issue. Thanks!

@m-usmanayub

Facing the same issue, only for pods, on a newly created 1.19.3 cluster... I've tried every single possibility mentioned above but still no success. Any other suggestions?

@timoreimann
Contributor

@m-usmanayub sounds like you are affected by kubernetes/kubernetes#94281. We're in the process of shipping a 1.19 update that comes with Docker 19.03 where the problem is apparently fixed.
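A quick way to check which container runtime version your nodes are currently on (see the CONTAINER-RUNTIME column):

kubectl get nodes -o wide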

@m-usmanayub

@m-usmanayub sounds like you are affected by kubernetes/kubernetes#94281. We're in the process of shipping a 1.19 update that comes with Docker 19.03 where the problem is apparently fixed.

Sounds great. Thanks for the update and reference link

@kyranb

kyranb commented Nov 4, 2020

What version will this be, @timoreimann? 1.19.4-do.1?

@timoreimann
Contributor

@kyranb it should be 1.19.3-do.2. I'll post again once the release is out.

@timoreimann
Contributor

1.19.3-do.2 was just released and should fix the problem. Please report back if that's not the case.

Sorry for the inconvenience!

@WyriHaximus

@timoreimann just started the upgrade and can confirm that this is now fixed

@timoreimann
Contributor

@WyriHaximus thanks for confirming! 💙

@WyriHaximus

@timoreimann you're welcome. I checked my cluster's status page a few minutes before you posted and started the upgrade. Was kinda hoping to beat you to it, but I'm more glad that this is fixed 🎉

@nielsonsantana

nielsonsantana commented Apr 10, 2021

I had the same problem. Here is how I solved it, using the metrics-server chart from the Bitnami repository.

helm repo add bitnami https://charts.bitnami.com/bitnami
helm template metrics-server bitnami/metrics-server --values metrics-server.yaml -n kube-system
#metrics-server.yaml
apiService:
  create: true # this solves the permission problem

extraArgs:
  kubelet-preferred-address-types: InternalIP
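Note that helm template only renders the manifests; to actually apply them, pipe the output to kubectl, or install the release directly with the same values file (hedged equivalents):

helm template metrics-server bitnami/metrics-server --values metrics-server.yaml -n kube-system | kubectl apply -n kube-system -f -
helm install metrics-server bitnami/metrics-server --values metrics-server.yaml -n kube-system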

@Nuxij

Nuxij commented Jul 19, 2022

Hi, this is still biting me on 1.22. I have to change the endpoints to InternalIP and enable the apiService.
