Kafka scaler sometimes fails with no matching metrics found #2647

Closed
loicmathieu opened this issue Feb 16, 2022 · 6 comments
Labels
bug Something isn't working

Comments

@loicmathieu

Report

I have a ScaledObject with 4 triggers: one Prometheus, one cron, and two Kafka on two different topics.
We see a lot of failure events on the corresponding HPA, such as: unable to fetch metrics from external metrics API: no matching metrics found for s3-kafka-private_aoaa_anomaly_v1

The same error appears in the KEDA operator logs (see below).

When doing a describe on the corresponding ScaledObject at the time of the issue, the metric is listed under External Metric Names, but there is no entry for it in the Health: section.

Describe of the ScaledObject:

Name:         vcstream-gen-anomaly
Namespace:    vcstream-dev
Labels:       app.kubernetes.io/managed-by=Helm
              scaledobject.keda.sh/name=vcstream-gen-anomaly
Annotations:  meta.helm.sh/release-name: vcstream-gen-anomaly
              meta.helm.sh/release-namespace: vcstream-dev
API Version:  keda.sh/v1alpha1
Kind:         ScaledObject
Metadata:
  Creation Timestamp:  2022-01-10T15:19:18Z
  Finalizers:
    finalizer.keda.sh
  Generation:  6
    Manager:         keda-adapter
    Operation:       Update
    Time:            2022-02-16T07:41:36Z
  Resource Version:  335673688
  UID:               d24419d0-0360-4ab4-b9dc-a2b1f161da1c
Spec:
  Advanced:
    Restore To Original Replica Count:  true
  Cooldown Period:                      60
  Fallback:
    Failure Threshold:  3
    Replicas:           1
  Max Replica Count:    6
  Min Replica Count:    1
  Scale Target Ref:
    Name:  vcstream-gen-anomaly
  Triggers:
    Metadata:
      Desired Replicas:  1
      End:               59 18 * * *
      Start:             0 8 * * *
      Timezone:          Europe/Brussels
    Type:                cron
    Metadata:
      Metric Name:     process_cpu_usage
      Query:           avg(process_cpu_usage{component="vcstream-gen-anomaly", namespace="vcstream-dev"} * 100) > 5 OR on() vector(0)
      Server Address:  http://prometheus-server.monitoring.svc.cluster.local
      Threshold:       75
    Type:              prometheus
    Authentication Ref:
      Name:  keda-trigger-auth-kafka-credential
    Metadata:
      Bootstrap Servers:    decathlon-aoaa-vcstream-dev-01-peered-aoaa.aivencloud.com:12658
      Consumer Group:       vcstream-gen-anomaly
      Lag Threshold:        100
      Offset Reset Policy:  latest
      Topic:                private_aoaa_anomaly_v1
    Type:                   kafka
    Authentication Ref:
      Name:  keda-trigger-auth-kafka-credential
    Metadata:
      Bootstrap Servers:    decathlon-aoaa-vcstream-dev-01-peered-aoaa.aivencloud.com:12658
      Consumer Group:       vcstream-gen-anomaly
      Lag Threshold:        100
      Offset Reset Policy:  latest
      Topic:                private_aoaa_deadletter_v1
    Type:                   kafka
Status:
  Conditions:
    Message:  ScaledObject is defined correctly and is ready for scaling
    Reason:   ScaledObjectReady
    Status:   True
    Type:     Ready
    Message:  Scaling is performed because triggers are active
    Reason:   ScalerActive
    Status:   True
    Type:     Active
    Message:  No fallbacks are active on this scaled object
    Reason:   NoFallbackFound
    Status:   False
    Type:     Fallback
  External Metric Names:
    s0-cron-Europe-Brussels-08xxx-5918xxx
    s1-prometheus-process_cpu_usage
    s3-kafka-private_aoaa_anomaly_v1
    s3-kafka-private_aoaa_deadletter_v1
  Health:
    s0-cron-europe-brussels-08xxx-5918xxx:
      Number Of Failures:  0
      Status:              Happy
    s1-prometheus-process_cpu_usage:
      Number Of Failures:  0
      Status:              Happy
    s3-kafka-private_aoaa_deadletter_v1:
      Number Of Failures:  0
      Status:              Happy
  Last Active Time:        2022-02-16T09:12:16Z
  Original Replica Count:  1
  Scale Target GVKR:
    Group:            apps
    Kind:             Deployment
    Resource:         deployments
    Version:          v1
  Scale Target Kind:  apps/v1.Deployment
Events:               <none>
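
For reference, the Spec above roughly corresponds to a ScaledObject manifest like the one below. This is a reconstruction from the describe output, not the original manifest: field names are the camelCase equivalents of the capitalized keys shown above, and the TriggerAuthentication / secret details are omitted.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vcstream-gen-anomaly
  namespace: vcstream-dev
spec:
  scaleTargetRef:
    name: vcstream-gen-anomaly
  minReplicaCount: 1
  maxReplicaCount: 6
  cooldownPeriod: 60
  fallback:
    failureThreshold: 3
    replicas: 1
  advanced:
    restoreToOriginalReplicaCount: true
  triggers:
    - type: cron
      metadata:
        timezone: Europe/Brussels
        start: "0 8 * * *"
        end: "59 18 * * *"
        desiredReplicas: "1"
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local
        metricName: process_cpu_usage
        query: 'avg(process_cpu_usage{component="vcstream-gen-anomaly", namespace="vcstream-dev"} * 100) > 5 OR on() vector(0)'
        threshold: "75"
    - type: kafka
      metadata:
        bootstrapServers: decathlon-aoaa-vcstream-dev-01-peered-aoaa.aivencloud.com:12658
        consumerGroup: vcstream-gen-anomaly
        topic: private_aoaa_anomaly_v1
        lagThreshold: "100"
        offsetResetPolicy: latest
      authenticationRef:
        name: keda-trigger-auth-kafka-credential
    - type: kafka
      metadata:
        bootstrapServers: decathlon-aoaa-vcstream-dev-01-peered-aoaa.aivencloud.com:12658
        consumerGroup: vcstream-gen-anomaly
        topic: private_aoaa_deadletter_v1
        lagThreshold: "100"
        offsetResetPolicy: latest
      authenticationRef:
        name: keda-trigger-auth-kafka-credential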

Describe of the HPA:

Name:                                                              keda-hpa-vcstream-gen-anomaly
Namespace:                                                         vcstream-dev
Labels:                                                            app.kubernetes.io/managed-by=Helm
                                                                   app.kubernetes.io/name=keda-hpa-vcstream-gen-anomaly
                                                                   app.kubernetes.io/part-of=vcstream-gen-anomaly
                                                                   app.kubernetes.io/version=2.6.0
                                                                   scaledobject.keda.sh/name=vcstream-gen-anomaly
Annotations:                                                       <none>
CreationTimestamp:                                                 Mon, 10 Jan 2022 16:19:18 +0100
Reference:                                                         Deployment/vcstream-gen-anomaly
Metrics:                                                           ( current / target )
  "s0-cron-Europe-Brussels-08xxx-5918xxx" (target average value):  1 / 1
  "s1-prometheus-process_cpu_usage" (target average value):        0 / 75
  "s3-kafka-private_aoaa_anomaly_v1" (target average value):       <unknown> / 100
  "s3-kafka-private_aoaa_deadletter_v1" (target average value):    0 / 100
Min replicas:                                                      1
Max replicas:                                                      6
Deployment pods:                                                   1 current / 1 desired
Conditions:
  Type            Status  Reason                   Message
  ----            ------  ------                   -------
  AbleToScale     True    ReadyForNewScale         recommended size matches current size
  ScalingActive   False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric vcstream-dev/s3-kafka-private_aoaa_anomaly_v1/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: vcstream-gen-anomaly,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: no matching metrics found for s3-kafka-private_aoaa_anomaly_v1
  ScalingLimited  False   DesiredWithinRange       the desired count is within the acceptable range
Events:
  Type     Reason                   Age                    From                       Message
  ----     ------                   ----                   ----                       -------
  Warning  FailedGetExternalMetric  109s (x7439 over 46h)  horizontal-pod-autoscaler  unable to get external metric vcstream-dev/s3-kafka-private_aoaa_anomaly_v1/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: vcstream-gen-anomaly,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: no matching metrics found for s3-kafka-private_aoaa_anomaly_v1
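
As a side note, when this happens the metric can also be queried directly from the external metrics API to see whether the KEDA metrics adapter is serving it at that moment. A minimal sketch, with the namespace, metric name, and label selector taken from the describe output above (the selector value is URL-encoded):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/vcstream-dev/s3-kafka-private_aoaa_anomaly_v1?labelSelector=scaledobject.keda.sh%2Fname%3Dvcstream-gen-anomaly"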

Expected Behavior

No failure events on the HPA.

Actual Behavior

Failure events on the HPA, and the metric is shown as <unknown>.

Steps to Reproduce the Problem

No idea how to reproduce it.

Logs from KEDA operator

E0216 09:14:02.481279       1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"no matching metrics found for s3-kafka-private_aoaa_anomaly_v1"}: no matching metrics found for s3-kafka-private_aoaa_anomaly_v1
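
For reference, these logs can be pulled from the running KEDA pods. A minimal sketch, assuming a default Helm installation in the keda namespace (deployment names vary with the install method):

# operator logs
kubectl logs -n keda deployment/keda-operator
# metrics adapter logs (the "no matching metrics found" error is surfaced through the adapter)
kubectl logs -n keda deployment/keda-operator-metrics-apiserver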

KEDA Version

2.6.0

Kubernetes Version

1.19

Platform

Google Cloud

Scaler Details

prometheus, kafka, cron

Anything else?

No response

@loicmathieu loicmathieu added the bug Something isn't working label Feb 16, 2022
@tomkerkhove tomkerkhove moved this to Proposed in Roadmap - KEDA Core Feb 16, 2022
@zroubalik
Member

You have probably hit this issue: #2592

I recommend you update to 2.6.1.
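
For anyone running KEDA through the official Helm chart, the upgrade is roughly the following. This is only a sketch: the release name and namespace here are assumptions and depend on how KEDA was originally installed.

# refresh the KEDA chart repository
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# upgrade the existing release; use the chart version that packages KEDA 2.6.1
helm upgrade keda kedacore/keda --namespace keda --version 2.6.1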

@loicmathieu
Author

@zroubalik upgrading it now ;), thanks for the info.

I'll give feedback after the upgrade and a few hours of running, to see if the failures still happen.

@loicmathieu
Author

@zroubalik so after a few hours running with 2.6.1, I can confirm that this issue is fixed ;)

I'm closing it now.

@johnnytardin

In version 2.7.1 it is still happening.

@joaopuccini

@zroubalik, I'm still having this problem as well.

My KEDA version: 2.7.1

I created a ScaledObject with a Kafka consumer.


$ kubectl describe hpa keda-hpa-ccs-relationship-worker-metrics --namespace ccs-dev
...
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric ccs-dev/s0-kafka-alias-bank-account-events/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: ccs-relationship-worker-metrics,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: the server could not find the requested resource (get s0-kafka-alias-bank-account-events.external.metrics.k8s.io)
Events:
  Type     Reason                   Age                   From                       Message
  ----     ------                   ----                  ----                       -------
  Warning  FailedGetExternalMetric  43s (x4078 over 17h)  horizontal-pod-autoscaler  unable to get external metric ccs-dev/s0-kafka-alias-bank-account-events/&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: ccs-relationship-worker-metrics,},MatchExpressions:[]LabelSelectorRequirement{},}: unable to fetch metrics from external metrics API: the server could not find the requested resource (get s0-kafka-alias-bank-account-events.external.metrics.k8s.io)
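
Note that this error message differs from the original one in this issue: "the server could not find the requested resource (get ...external.metrics.k8s.io)" usually means the external metrics API (the KEDA metrics adapter) is not registered or reachable at all, rather than a single metric being missing. A minimal way to check that, as a sketch using the standard resource names KEDA registers:

# the APIService should exist and report Available=True
kubectl get apiservice v1beta1.external.metrics.k8s.io

# list the external metrics the adapter currently exposes
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"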

@joaopuccini

Good afternoon @johnnytardin, did you manage to find anything?
