Unable to scrape metrics through kubeletstats receiver #26481
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@anand3493 the receiver is unable to scrape metrics from the configured endpoint.
@TylerHelmuth true that, but I am using the default configuration for the endpoint, ${K8S_NODE_NAME}:10250, and I can confirm the host names are valid, as I see them through the kubectl get nodes command.
Are you able to hit the endpoint successfully?
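For anyone hitting this, a quick way to check is to exec into one of the collector pods and call the kubelet summary endpoint directly with the pod's service-account token. A rough sketch, assuming the pod has K8S_NODE_NAME set, the service account can read nodes/stats, and the image has a shell and curl (pod and namespace names are placeholders; if the image has no shell, a debug pod on the same node works too):

kubectl exec -it <collector-pod> -n <namespace> -- sh
# inside the pod: the token path is the standard service-account mount
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
# -k skips TLS verification just for this manual check
curl -sk -H "Authorization: Bearer $TOKEN" "https://$K8S_NODE_NAME:10250/stats/summary" | head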
We're having this issue as well, even on version 0.85:

2023-09-14T08:12:11.091Z error kubeletstatsreceiver@v0.85.0/scraper.go:68 call to /stats/summary endpoint failed {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://<#####>:10250/stats/summary\": dial tcp: lookup <#####> on 100.64.0.10:53: no such host"}
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/kubeletstatsreceiver.(*kubletScraper).scrape

We basically followed the getting-started installation process, enabling the kubeletMetrics preset and providing a custom OTel endpoint. However, we seem to have found a workaround by using the node's hostIP. According to downward-api/#available-fields, the field status.hostIP is available through the downward API.
In order to apply the workaround we performed these steps:
[...]
env:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
[...]
[...]
config:
  receivers:
    kubeletstats:
      collection_interval: 20s
      auth_type: 'serviceAccount'
      endpoint: 'https://${env:NODE_IP}:10250'
[...]
helm upgrade otel-collector --values values.yaml <path-to-cloned-repo>/charts/opentelemetry-collector/

Maybe this is not the right thing to do, but nevertheless it might point in the right direction.
@sspieker if node name is not working for you then node IP is a valid workaround. You don't have to modify the helm chart though, it supports adding extra env vars:

mode: daemonset
presets:
  kubeletMetrics:
    enabled: true
extraEnvs:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
config:
  receivers:
    kubeletstats:
      endpoint: 'https://${env:NODE_IP}:10250'
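With values like these, the published chart can be used directly instead of a local clone. A sketch, assuming the open-telemetry Helm repository (release and values file names are just examples):

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm upgrade --install otel-collector open-telemetry/opentelemetry-collector --values values.yaml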
As it happens, that works too. Thanks @TylerHelmuth, this makes stuff quite a bit easier for us!
@TylerHelmuth This NODE_IP suggestion worked for me as well. The us-east-1 based nodes have private IP DNS names in the format ip-xx-xxx-xxx-xx.ec2.internal, while the eu-west-1 based nodes have private IP DNS names in the format ip-xx-xxx-xxx-xxx.eu-west-1.compute.internal. This may be the reason why NODE_NAME is not working on my European cluster. NODE_IP works fine.
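One way to confirm whether this is a cluster-DNS resolution problem is to compare the node names against what the cluster DNS can actually resolve from inside a pod. A rough sketch using a throwaway busybox pod (the node name is a placeholder to fill in from the first command):

kubectl get nodes -o wide
kubectl run dns-test --rm -it --image=busybox --restart=Never -- nslookup <node-private-dns-name>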
2025-01-23T06:48:38.004Z error scraperhelper/scrapercontroller.go:197 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://mahesh:10250/stats/summary\": dial tcp: lookup mahesh on 10.96.0.10:53: no such host", "scraper": "kubeletstats"}

I used to get this error message when I used K8S_NODE_NAME instead of K8S_NODE_IP, then I saw this issue on GitHub. I saw that the template has already been changed and K8S_NODE_IP can be used directly, instead of cloning the repo on my server. The error message below is what I received after I changed the ConfigMap for the collector (changed the endpoint to endpoint: ${env:K8S_NODE_IP}:10250):

2025-01-23T06:50:04.952Z error scraperhelper/scrapercontroller.go:197 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://<MY_NODE_IP>:10250/stats/summary\": tls: failed to verify certificate: x509: cannot validate certificate for <MY_NODE_IP> because it doesn't contain any IP SANs", "scraper": "kubeletstats"}

After changing NODE_NAME to NODE_IP, I am getting this error in my collector logs. It also worked for me a few days ago, but now it is not working; I am not able to scrape metrics and logs from my cluster and send them to my endpoint. Can somebody suggest a workaround for this issue, or explain what is going wrong and how to fix it?
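The second error means the kubelet's serving certificate does not list the node IP as a SAN, so TLS verification fails once the endpoint is an IP instead of a hostname. If it is acceptable in your environment, one common workaround is to disable certificate verification on the receiver. A minimal sketch of the receiver config, assuming the serviceAccount auth type used earlier in this thread (not a production recommendation):

receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: serviceAccount
    endpoint: https://${env:K8S_NODE_IP}:10250
    # skips verification of the kubelet certificate; use only if the missing IP SAN cannot be fixed
    insecure_skip_verify: true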
Component(s)
receiver/kubeletstats
What happened?
Description
Getting errors from the OpenTelemetry Collector agent pods:
scraperhelper/scrapercontroller.go:200 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://ip-10-166-222-111.eu-west-1.compute.internal:10250/stats/summary\": dial tcp: lookup ip-10-166-222-111.eu-west-1.compute.internal on 172.20.0.10:53: no such host", "scraper": "kubeletstats"}
The nodes definitely exist. This is happening across all pods of the collector DaemonSet.
Steps to Reproduce
Expected Result
To scrape metrics and send to the exporter
Actual Result
Erroring out at
scraperhelper/scrapercontroller.go:200 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://ip-10-166-222-111.eu-west-1.compute.internal:10250/stats/summary\": dial tcp: lookup ip-10-166-222-111.eu-west-1.compute.internal on 172.20.0.10:53: no such host", "scraper": "kubeletstats"}
Collector version
v0.83.0
Environment information
Environment
Kubernetes
OpenTelemetry Collector configuration
Log output
Additional context
No response