
Not able to scrape metrics on the keda metrics server #4776

Closed
Tracked by #4795
GDegrove opened this issue Jul 5, 2023 · 1 comment · Fixed by #4766
Labels
bug Something isn't working

Comments

GDegrove commented Jul 5, 2023

Report

When updating our KEDA operator from version 2.10.2 to version 2.11.0, we noticed what we think is a bug in the KEDA metrics server.

When using the KEDA metrics server version 2.10.2, we see two endpoints that serve the /metrics path.
By default, port 9022 is scraped and exposes metrics such as keda_metrics_adapter_scaled_object_errors.

We also see port 8080 available, serving metrics such as controller_runtime_reconcile_time_seconds_bucket.

After upgrading to version 2.11.0, the Prometheus metrics endpoint on port 9022 disappears (expected), but we are also no longer able to scrape the metrics available on port 8080.

In Prometheus, we actually see an Error 502 Bad Gateway.

When port-forwarding to port 8080 on version 2.11.x, we also see this error:

E0705 13:26:35.024534   17036 portforward.go:234] lost connection to pod
Handling connection for 8080
E0705 13:26:35.024896   17036 portforward.go:346] error creating error stream for port 8080 -> 8080: EOF
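
For reference, this is roughly the check we run locally. It is a minimal sketch: the deployment name keda-operator-metrics-apiserver and the keda namespace are assumptions based on a default Helm install (our namespace is actually pls-keda), so adjust them to your setup.

# On 2.10.2 both endpoints answer; on 2.11.x port 9022 is gone (expected)
# and the port-forward to 8080 drops the connection as shown above.
kubectl -n keda port-forward deploy/keda-operator-metrics-apiserver 9022:9022 &
curl -s http://localhost:9022/metrics | grep keda_metrics_adapter_scaled_object_errors

kubectl -n keda port-forward deploy/keda-operator-metrics-apiserver 8080:8080 &
curl -s http://localhost:8080/metrics | grep controller_runtime_reconcile_time_seconds_bucket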

Expected Behavior

The default service monitor for the metrics server works as expected without having to change anything.

Actual Behavior

The service monitor does not work as expected, and port 8080 cannot be scraped by Prometheus.

Steps to Reproduce the Problem

  1. Deploy KEDA on EKS version 1.26+
  2. Enable scraping on the metrics server (see the sketch below)
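
For step 2, a rough sketch of what enabling scraping looks like with the official Helm chart; the value names below are assumptions based on the chart version we use and may differ in other versions.

helm upgrade --install keda kedacore/keda \
  --namespace keda \
  --set prometheus.metricServer.enabled=true \
  --set prometheus.metricServer.serviceMonitor.enabled=true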

Logs from KEDA operator

I0705 11:22:27.572160       1 welcome.go:34] keda_metrics_adapter "msg"="Starting metrics server"
I0705 11:22:27.572254       1 welcome.go:35] keda_metrics_adapter "msg"="KEDA Version: 2.11.1"
I0705 11:22:27.572265       1 welcome.go:36] keda_metrics_adapter "msg"="Git Commit: b8dbd298cf9001b1597a2756fd0be4fa4df2059f"
I0705 11:22:27.572275       1 welcome.go:37] keda_metrics_adapter "msg"="Go Version: go1.20.5"
I0705 11:22:27.572285       1 welcome.go:38] keda_metrics_adapter "msg"="Go OS/Arch: linux/amd64"
I0705 11:22:27.572310       1 welcome.go:39] keda_metrics_adapter "msg"="Running on Kubernetes 1.26+" "version"={"major":"1","minor":"26+","gitVersion":"v1.26.5-eks-c12679a","gitCommit":"c03cecf98904742cce2e1183f87194102cc9dad9","gitTreeState":"clean","buildDate":"2023-05-22T20:29:55Z","goVersion":"go1.19.9","compiler":"gc","platform":"linux/amd64"}
I0705 11:22:27.573109       1 listener.go:44] keda_metrics_adapter/controller-runtime/metrics "msg"="Metrics server is starting to listen" "addr"=":8080"
I0705 11:22:27.574611       1 main.go:148] keda_metrics_adapter "msg"="Connecting Metrics Service gRPC client to the server" "address"="keda-operator.pls-keda.svc.cluster.local:9666"
I0705 11:22:27.600106       1 provider.go:65] keda_metrics_adapter/provider "msg"="starting"
I0705 11:22:27.600132       1 main.go:239] keda_metrics_adapter "msg"="starting adapter..."
I0705 11:22:27.601287       1 client.go:88] keda_metrics_adapter/provider "msg"="Waiting for establishing a gRPC connection to KEDA Metrics Server"
I0705 11:22:28.102288       1 provider.go:73] keda_metrics_adapter/provider "msg"="Connection to KEDA Metrics Service gRPC server has been successfully established" "server"="keda-operator.pls-keda.svc.cluster.local:9666"
I0705 11:22:28.318303       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0705 11:22:28.318334       1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I0705 11:22:28.318507       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0705 11:22:28.318540       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0705 11:22:28.318979       1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::/certs/tls.crt::/certs/tls.key"
I0705 11:22:28.319965       1 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/certs/ca.crt"
I0705 11:22:28.320529       1 secure_serving.go:210] Serving securely on [::]:6443
I0705 11:22:28.320604       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0705 11:22:28.419179       1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
I0705 11:22:28.419179       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file

KEDA Version

2.11.1

Kubernetes Version

1.26

Platform

Amazon Web Services

Scaler Details

No response

Anything else?

No response

GDegrove added the bug label on Jul 5, 2023
JorTurFer (Member) commented

Hi!
Thanks for reporting; we removed them accidentally during some changes.
In v2.9 we deprecated the metrics server's Prometheus metrics in favor of the operator metrics, and that's why we removed the e2e tests for those metrics (so we didn't detect this problem until the release).
I'm fixing it because the metrics server still exposes some useful metrics about how the metrics server itself is working, but for KEDA observability I suggest using the operator metrics, because the metrics server no longer exposes any metrics about how KEDA is working.

You can check available metrics for each component here: https://keda.sh/docs/2.11/operate/prometheus/
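
As a quick, hedged example of checking the operator metrics instead (assuming the default keda-operator deployment, which exposes Prometheus metrics on port 8080; names, namespace and port may differ in your install):

kubectl -n keda port-forward deploy/keda-operator 8080:8080 &
curl -s http://localhost:8080/metrics | grep '^keda_'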
