Multiple Prometheus Scalers in same scaled object return same metric values #702

Closed
ycabrer opened this issue Mar 26, 2020 · 8 comments · Fixed by #727 or #732
Labels
bug Something isn't working

Comments

@ycabrer
Contributor

ycabrer commented Mar 26, 2020

Hi,

I am using two Prometheus scalers on the same scaled object to scale a single deployment based on different queries.

Each metric has a distinct name and query, yet both query results are returned for each metric. This causes strange metric values in the resulting HPA.

Expected Behavior

Each distinct metric / query in a scaled object should return only the data it requests.

Actual Behavior

Each distinct metric / query is returning the metrics for all queries.

Steps to Reproduce the Problem

  1. Create a scaled object with two Prometheus scalers.
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: multiple-queue-consumer
  namespace: qa
  labels:
    deploymentName: multiple-queue-consumer
spec:
  scaleTargetRef:
    deploymentName: multiple-queue-consumer
  pollingInterval: 30
  cooldownPeriod:  150
  minReplicaCount: 1
  maxReplicaCount: 16
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local
      metricName:  mq_messages_per_consumer_queue_1
      threshold: '6'
      query: >
        rabbitmq_queue_messages{queue="queue-1"}
        / 
        rabbitmq_queue_consumers{queue="queue-1"}
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local
      metricName: mq_messages_per_consumer_queue_2
      threshold: '6'
      query: >
        rabbitmq_queue_messages{queue="queue-2"} 
        / 
        rabbitmq_queue_consumers{queue="queue-2""}

  2. Use kubectl to verify the external metric values:
    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/mq_messages_per_consumer_queue_1?labelSelector=deploymentName%3Dmultiple-queue-consumer" | jq

result for queue 1:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/mq_messages_per_consumer_queue_1"
  },
  "items": [
    {
      "metricName": "mq_messages_per_consumer_queue_1",
      "metricLabels": null,
      "timestamp": "2020-03-25T22:17:24Z",
      "value": "21"
    },
    {
      "metricName": "mq_messages_per_consumer_queue_1",
      "metricLabels": null,
      "timestamp": "2020-03-25T22:17:24Z",
      "value": "8"
    }
  ]
}

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/mq_messages_per_consumer_queue_2?labelSelector=deploymentName%3Dmultiple-queue-consumer" | jq

result for queue 2:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/mq_messages_per_consumer_queue_2"
  },
  "items": [
    {
      "metricName": "mq_messages_per_consumer_queue_2",
      "metricLabels": null,
      "timestamp": "2020-03-25T22:17:24Z",
      "value": "21"
    },
    {
      "metricName": "mq_messages_per_consumer_queue_2",
      "metricLabels": null,
      "timestamp": "2020-03-25T22:17:24Z",
      "value": "8"
    }
  ]
}

In each case the first value (21) is the metric for queue-1 and the second (8) is the metric for queue-2.

HPA

Name:                                            keda-hpa-multiple-queue-consumer
Namespace:                                       qa
Labels:                                          app.kubernetes.io/managed-by=keda-operator
                                                 app.kubernetes.io/name=keda-hpa-multiple-queue-consumer
                                                 app.kubernetes.io/part-of=multiple-queue-consumer
                                                 app.kubernetes.io/version=1.3.0
Annotations:                                     <none>
CreationTimestamp:                               Wed, 25 Mar 2020 14:51:46 -0600
Reference:                                       Deployment/multiple-queue-consumer
Metrics:                                         ( current / target )
  "mq_messages_per_consumer_queue_1" (target average value):  14500m / 6
  "mq_messages_per_consumer_queue_2" (target average value):   14500m / 6
Min replicas:                                    1
Max replicas:                                    16
Deployment pods:                                 4 current / 4 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from external metric mq_messages_per_consumer_queue_1(&LabelSelector{MatchLabels:map[string]string{deploymentName: multiple-queue-consumer,},MatchExpressions:[],})
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                        Age                  From                       Message
  ----     ------                        ----                 ----                       -------
  Normal   SuccessfulRescale             46m                  horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal   SuccessfulRescale             30m                  horizontal-pod-autoscaler  New size: 3; reason: external metric mq_messages_per_consumer_queue_1(&LabelSelector{MatchLabels:map[string]string{deploymentName: multiple-queue-consumer,},MatchExpressions:[],}) above target
  Normal   SuccessfulRescale             13m (x7 over 109m)   horizontal-pod-autoscaler  New size: 3; reason: All metrics below target
  Normal   SuccessfulRescale             85s (x7 over 87m)    horizontal-pod-autoscaler  New size: 4; reason: external metric mq_messages_per_consumer_queue_1(&LabelSelector{MatchLabels:map[string]string{deploymentName: multiple-queue-consumer,},MatchExpressions:[],}) above target

Scaled Object

Name:         multiple-queue-consumer
Namespace:    qa
Labels:       deploymentName=multiple-queue-consumer
Annotations:  
API Version:  keda.k8s.io/v1alpha1
Kind:         ScaledObject
Metadata:
  Creation Timestamp:  2020-03-25T15:39:31Z
  Finalizers:
    finalizer.keda.k8s.io
  Generation:        8
  Resource Version:  18647747
  Self Link:         /apis/keda.k8s.io/v1alpha1/namespaces/qa/scaledobjects/multiple-queue-consumer
  UID:               d1e01b0d-6eae-11ea-bf96-0ed83b31ac69
Spec:
  Cooldown Period:    150
  Max Replica Count:  16
  Min Replica Count:  1
  Polling Interval:   30
  Scale Target Ref:
    Deployment Name:  multiple-queue-consumer
  Triggers:
    Metadata:
      Metric Name:  mq_messages_per_consumer_queue_1
      Query:        rabbitmq_queue_messages{queue="queue-1"} /  rabbitmq_queue_consumers{queue="queue-1"}

      Server Address:  http://prometheus-server.monitoring.svc.cluster.local
      Threshold:       6
    Type:              prometheus
    Metadata:
      Metric Name:  mq_messages_per_consumer_queue_2
      Query:        rabbitmq_queue_messages{queue="queue-2"} /  rabbitmq_queue_consumers{queue="queue-2"}

      Server Address:  http://prometheus-server.monitoring.svc.cluster.local
      Threshold:       6
    Type:              prometheus
Status:
  External Metric Names:
    mq_messages_per_consumer_queue_1
    mq_messages_per_consumer_queue_2
  Last Active Time:  2020-03-26T02:54:32Z
Events:              <none>

Specifications

  • KEDA Version: 1.3.0
  • Platform & Version: EKS 1.14
  • Kubernetes Version: 1.14
  • Scaler(s): Prometheus
@ycabrer ycabrer added the bug Something isn't working label Mar 26, 2020
@ycabrer ycabrer changed the title Multiple Prometheus Scalers return duplicate metrics Multiple Prometheus Scalers in same scaled object return same metric values Mar 27, 2020
@tomkerkhove
Member

Thanks for reporting!

@zroubalik
Member

zroubalik commented Mar 31, 2020

@ycabrer thanks for reporting. I have found the potential problem in the Metrics Server implementation. This code is returning all metrics for each ScaledObject.

We have two options for solving it:

  1. Quick fix: In the Metrics Server, in the for loop mentioned above, after getting all metrics for a ScaledObject, filter and select only those we are interested in. It would be a simple change on the Metrics Server side. I don't like this one, because we would still be querying scalers we are not interested in.
  2. Proper fix: Because we are not able to tell from the code which metric a scaler serves, we will need to add GetMetricName() to the Scaler interface, implement this method across all scalers, and then use this method in the Metrics Server to distinguish the scaler we want to query for metrics. This will result in querying only the scaler whose metrics we want to know (see the sketch below).

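As a rough illustration of option 2, here is a minimal, self-contained Go sketch. The types, signatures, and values below are simplified stand-ins, not the actual KEDA code; only GetMetricName() corresponds to the method proposed above.

package main

import "fmt"

// Hypothetical, simplified stand-in for the real Scaler interface; the actual
// one lives in pkg/scalers and uses Kubernetes API types.
type Scaler interface {
    GetMetricName() string // proposed addition: report the metric this scaler serves
    GetMetrics(metricName string) ([]float64, error)
}

// fakePrometheusScaler stands in for a configured prometheus trigger.
type fakePrometheusScaler struct {
    metricName string
    value      float64
}

func (s fakePrometheusScaler) GetMetricName() string { return s.metricName }

func (s fakePrometheusScaler) GetMetrics(metricName string) ([]float64, error) {
    return []float64{s.value}, nil
}

// getExternalMetric mimics the Metrics Server loop: instead of collecting
// values from every scaler of the ScaledObject, it queries only the scaler
// whose metric name matches the requested one.
func getExternalMetric(scalers []Scaler, requested string) ([]float64, error) {
    var matching []float64
    for _, scaler := range scalers {
        if scaler.GetMetricName() != requested {
            continue // skip scalers serving other metrics
        }
        values, err := scaler.GetMetrics(requested)
        if err != nil {
            return nil, err
        }
        matching = append(matching, values...)
    }
    if len(matching) == 0 {
        return nil, fmt.Errorf("no matching metrics found for %q", requested)
    }
    return matching, nil
}

func main() {
    scalers := []Scaler{
        fakePrometheusScaler{"mq_messages_per_consumer_queue_1", 21},
        fakePrometheusScaler{"mq_messages_per_consumer_queue_2", 8},
    }
    values, _ := getExternalMetric(scalers, "mq_messages_per_consumer_queue_1")
    fmt.Println(values) // prints [21]: only queue-1's value, not both
}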
@ycabrer
Contributor Author

ycabrer commented Apr 1, 2020

@zroubalik I see GetMetricSpecForScaling is used to construct the HPA spec. Couldn't the metric name be extracted from there?

@ycabrer
Contributor Author

ycabrer commented Apr 4, 2020

@zroubalik

This seems to be working:

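// In the Metrics Server loop (provider.go): query only the scaler whose metric
// spec name matches the requested metric, instead of appending results from
// every scaler of the ScaledObject.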
for _, scaler := range scalers {
	metricSpecs := scaler.GetMetricSpecForScaling()

	for _, metricSpec := range metricSpecs {
		// Filter only the desired metric
		if metricSpec.External.MetricName == info.Metric {
			metrics, err := scaler.GetMetrics(context.TODO(), info.Metric, metricSelector)
			if err != nil {
				logger.Error(err, "error getting metric for scaler", "ScaledObject.Namespace", scaledObject.Namespace, "ScaledObject.Name", scaledObject.Name, "Scaler", scaler)
			} else {
				matchingMetrics = append(matchingMetrics, metrics...)
			}
		}
	}

	scaler.Close()
}

if len(matchingMetrics) <= 0 {
	return nil, fmt.Errorf("Error no matching metrics found")
}

provider.go

This should be fine, I think, since all metrics seem to be external metrics.

I'm not sure if iterating through metricSpecs is needed, but the return type of GetMetricSpecForScaling seems to be an array.

Let me know what you think. I can open a pull request.

zroubalik pushed a commit that referenced this issue Apr 6, 2020
zroubalik added a commit that referenced this issue Apr 7, 2020
@zroubalik
Member

zroubalik commented Apr 7, 2020

@ycabrer unfortunately I was forced to revert your commit. There are two problems:

  1. The metrics adapter doesn't work as expected. I have tested it with the azure-queue and kafka scalers, and for these scalers the Metrics Server doesn't return any metric; the error message is: No matching metrics found. We need to double-check whether the problem is in the scalers' implementation (i.e. GetMetricSpecForScaling() is not implemented correctly) or in your change.
  2. The commit wasn't signed properly, so it breaks the DCO check for any other PRs applied after your commit was merged into master. See the message:

DCO fail

//EDIT: added Kafka scaler note

@zroubalik zroubalik reopened this Apr 7, 2020
@ycabrer
Contributor Author

ycabrer commented Apr 7, 2020

@zroubalik I was able to recreate the issue.

It looks like it is an issue with casing when comparing the metric name from the metric spec with the requested metric.

We are getting lagThreshold (camelCase) from the metric spec and lagthreshold (lowercase) from info.Metric.

It looks like the Kubernetes client changes it to lowercase, as explained in Kubernetes issue #72996.
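A small, self-contained Go illustration of the mismatch, and of one possible workaround (comparing the names case-insensitively with strings.EqualFold); whether that is how the fix should actually be implemented is an open question.

package main

import (
    "fmt"
    "strings"
)

func main() {
    // Hypothetical values illustrating the mismatch described above.
    specMetricName := "lagThreshold" // camelCase name from GetMetricSpecForScaling()
    requested := "lagthreshold"      // lowercased name arriving as info.Metric

    // An exact comparison fails because the name is lowercased on its way
    // through the Kubernetes API (see Kubernetes issue #72996).
    fmt.Println(specMetricName == requested) // false

    // A case-insensitive comparison would match both forms.
    fmt.Println(strings.EqualFold(specMetricName, requested)) // true
}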

@ycabrer
Contributor Author

ycabrer commented Apr 7, 2020

camelCase:

~/ops !1 ❯ kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/lagThreshold/?labelSelector=deploymentName%3Ddashboard" | jq 
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/lagThreshold/"
  },
  "items": [
    {
      "metricName": "lagThreshold",
      "metricLabels": null,
      "timestamp": "2020-04-07T19:26:21Z",
      "value": "5"
    }
  ]
}

lowercase:

~/ops !1 ❯ kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/qa/lagthreshold/?labelSelector=deploymentName%3Ddashboard" | jq
Error from server: No matching metrics found

I think another issue is that the metric name is hardcoded to lagThreshold. This would cause the same issue with multiple metrics being retrieved. Ideally it would have a unique name, maybe based on the Kafka topic or the broker.

@zroubalik
Member

zroubalik commented Apr 8, 2020

@ycabrer excellent investigation. Thanks!

I think another issue is that the metric name is hard coded to lagThreshold. This would cause the same issue with multiple metrics being retrieved. Ideally it would have a unique name. Maybe based on the kafka topic or the broker.

Yeah, that is true: each scaler has a hardcoded metric name, so multiple scalers with the same metric in one ScaledObject will not work; we are aware of that. We can try to find a solution for this, as it might be a good fit for the upcoming v2.0 release (a rough sketch of one possible approach is below).
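As a purely hypothetical illustration, a scaler could derive its metric name from trigger-specific metadata (here, the Kafka topic) instead of using a hardcoded constant like lagThreshold. The helper name and naming scheme below are made up for this sketch; whatever KEDA actually adopts may look different.

package main

import "fmt"

// kafkaMetricName is a hypothetical helper: it builds a metric name that is
// unique per trigger by embedding the Kafka topic, so two Kafka triggers on
// one ScaledObject no longer collide on the same name.
func kafkaMetricName(topic string) string {
    return fmt.Sprintf("kafka-%s-lag", topic)
}

func main() {
    fmt.Println(kafkaMetricName("orders"))   // kafka-orders-lag
    fmt.Println(kafkaMetricName("payments")) // kafka-payments-lag
}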
