Prometheus metric keda_scaler_active never gets initialized #4945

Open
jrauschenbusch opened this issue Sep 4, 2023 · 11 comments
Labels
bug Something isn't working help wanted Looking for support from community stale-bot-ignore All issues that should not be automatically closed by our stale bot

Comments

@jrauschenbusch

jrauschenbusch commented Sep 4, 2023

Report

The provided Grafana Dashboard does not work because the required metric keda_scaler_active is not rendered on the /metrics endpoint. Therefore, the dashboard variables for the dropdown boxes cannot be evaluated correctly. The reason is that not all metrics are correctly initialized at bootstrapping time (first recording with default values). In addition, if you use ScaledObjects exclusively with CPU/memory triggers, the following metrics are not recorded at all:
- keda_scaler_active
- keda_scaler_errors
- keda_scaler_metrics_value
- keda_scaler_metrics_latency

Expected Behavior

  1. All metrics are initialized after the bootstrapping phase with a default value so that they are rendered when requesting the /metrics endpoint so that Grafana dashboard variables can be evaluated correctly
  2. Metrics should also be recorded when only CPU/Memory triggers are set (currently skipped)
    • keda_scaler_active
    • keda_scaler_errors
    • keda_scaler_metrics_value
    • keda_scaler_metrics_latency
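
To illustrate the first point, here is a minimal sketch of why a labeled metric stays invisible until it has been recorded once. It is a toy, standard-library-only model, not the Prometheus client and not KEDA code; the `gaugeVec` type, its methods, and the label values are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// gaugeVec is a hypothetical, simplified stand-in for a labeled gauge.
// It only knows about label combinations that have been recorded at least once.
type gaugeVec struct {
	name   string
	series map[string]float64 // rendered label set -> current value
}

func newGaugeVec(name string) *gaugeVec {
	return &gaugeVec{name: name, series: map[string]float64{}}
}

// set records a value for one label combination, creating the series if needed.
func (g *gaugeVec) set(labels string, v float64) {
	g.series[labels] = v
}

// render returns exposition-style lines in sorted order.
// A vector without any recorded series renders nothing at all.
func (g *gaugeVec) render() string {
	keys := make([]string, 0, len(g.series))
	for k := range g.series {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s} %g\n", g.name, k, g.series[k])
	}
	return b.String()
}

func main() {
	active := newGaugeVec("keda_scaler_active")
	fmt.Printf("before init: %q\n", active.render())

	// Hypothetical label values, for illustration only.
	active.set(`scaledObject="demo",scaler="cpu"`, 0)
	fmt.Printf("after init: %q\n", active.render())
}
```

Recording a default value once per known label combination is what would make the series show up, which is why a dashboard variable that queries the metric finds nothing before that point.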

Actual Behavior

Metrics are missing when ScaledObjects use only CPU/Memory triggers.

Steps to Reproduce the Problem

  1. Install KEDA v2.11.2
  2. Create ScaledObject with CPU/Memory trigger only
  3. Request /metrics endpoint of Operator
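
Step 3 can be checked mechanically. The sketch below assumes the response body of the /metrics endpoint has already been fetched as text, and reports which expected metric names have no sample line. The helper name `missingMetrics` and the sample body are made up for illustration. Names are matched exactly, so suffixed series such as histogram buckets are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// missingMetrics returns the names from want that have no sample line in body,
// where body is text in the Prometheus exposition format.
func missingMetrics(body string, want []string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		// The metric name ends at the first '{' or space.
		if i := strings.IndexAny(line, "{ "); i > 0 {
			seen[line[:i]] = true
		}
	}

	var missing []string
	for _, name := range want {
		if !seen[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Made-up sample of a /metrics response body, for illustration only.
	body := "# TYPE keda_scaled_object_errors counter\n" +
		"keda_scaled_object_errors{scaledObject=\"demo\"} 0\n"

	want := []string{"keda_scaler_active", "keda_scaler_errors", "keda_scaled_object_errors"}
	fmt.Println(missingMetrics(body, want))
}
```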

Logs from KEDA operator

No response

KEDA Version

2.11.2

Kubernetes Version

1.25

Platform

Microsoft Azure

Scaler Details

No response

Anything else?

No response

@jrauschenbusch jrauschenbusch added the bug Something isn't working label Sep 4, 2023
@JorTurFer
Member

JorTurFer commented Sep 4, 2023

Hello!

In addition, if you use ScaledObjects exclusively with CPU/memory triggers, the following metrics are not recorded at all:

  • keda_scaler_active
  • keda_scaler_errors
  • keda_scaler_metrics_value
  • keda_scaler_metrics_latency

You are right but there isn't any solution for that because CPU/Memory scalers are a wrapper over the k8s metrics server, so KEDA doesn't have those values because KEDA doesn't process them, it's the k8s metrics server who does it.

Honestly, I think that KEDA shouldn't take control of those resources, because the metrics.k8s.io API is reserved for the k8s metrics server, and it exposes some metrics, like container or resource, that KEDA doesn't support. I mean, KEDA can't know the CPU metric value unless KEDA exposes it itself, but IMHO KEDA shouldn't expose it instead of the current k8s metrics server.
WDYT @zroubalik @tomkerkhove ?

All metrics are initialized after the bootstrapping phase with a default value so that they are rendered when requesting the /metrics endpoint so that Grafana dashboard variables can be evaluated correctly

I think that we could do it, but there are some metrics attached to external resources for labeling, such as scaledobject or scaledjob. IDK if we can initialize them empty, but KEDA should definitely try to do something with them.
Are you willing to contribute with this improvement?

@jrauschenbusch
Author

You are right but there isn't any solution for that because CPU/Memory scalers are a wrapper over the k8s metrics server, so KEDA doesn't have those values because KEDA doesn't process them, it's the k8s metrics server who does it.

Got it. But IMHO the dashboard should at least still work out of the box, maybe by evaluating the metric keda_scaled_object_errors instead of keda_scaler_active. Or there should be a suggestion to use a plain HPA-metrics-based dashboard when only cpu/memory/cron-based scalers are used. Maybe this should be a note on the Integrate with Prometheus page. A note about this behavior should also be present, i.e. that some metrics will not appear if no external metrics are used.

I think that we could do it, but there are some metrics attached to external resources for labeling, such as scaledobject or scaledjob. IDK if we can initialize them empty, but KEDA should definitely try to do something with them.
Are you willing to contribute with this improvement?

You mean that the labels must also be initialized for those missing metrics? An alternative could be to just initialize the keda_scaler_active metric within the skip operation. That would mean the dashboard works out of the box, but a few charts would not be rendered, as those metrics are not managed by KEDA.

Move

prommetrics.RecordScalerActive(scaledObject.Namespace, scaledObject.Name, scalerName, scalerIndex, metricName, isMetricActive)
below
isScaledObjectActive = true

@tomkerkhove tomkerkhove added the help wanted Looking for support from community label Sep 14, 2023

stale bot commented Nov 13, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Nov 13, 2023
@jonathanirvin-ast

I'm also seeing this issue and would love to get some help.

@stale stale bot removed the stale All issues that are marked as stale due to inactivity label Nov 17, 2023
@jrauschenbusch
Author

I'm also seeing this issue and would love to get some help.

Use a plain Horizontal Pod Autoscaler (HPA) dashboard for CPU/Memory based scaled objects. Or do it like me and fix the dashboard by using HPA metrics instead of KEDA metrics. I removed some KEDA specific panels and modified the rest.

@Sindvero

Sindvero commented Jan 2, 2024

Can you share an example of such a dashboard, by any chance?


stale bot commented Mar 2, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Mar 2, 2024

stale bot commented Mar 11, 2024

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Mar 11, 2024
@zroubalik zroubalik reopened this Mar 11, 2024

stale bot commented Mar 21, 2024

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Mar 21, 2024
@JorTurFer JorTurFer reopened this Mar 24, 2024
@stale stale bot removed the stale All issues that are marked as stale due to inactivity label Mar 24, 2024
@JorTurFer JorTurFer added the stale-bot-ignore All issues that should not be automatically closed by our stale bot label Mar 24, 2024
@Dentrax

Dentrax commented Aug 6, 2024

@JorTurFer Thanks for the response there. We were wondering if this is still a valid issue.


We have been trying to utilize KEDA and while working on the dashboard stuff, we encountered this issue.

If we create a ScaledObject that has CPU/MEM + other triggers, we are actually able to see the metrics, except for CPU/MEM, on the keda_scaler_active metric. If you put only one trigger, for example CPU, you won't be able to see keda_scaler_active at all.

If we dive in further, here are the function calls:

  1. r.updatePromMetrics(scaledObject, req.NamespacedName.String())
  2. for _, trigger := range scaledObject.Spec.Triggers {
    metricscollector.IncrementTriggerTotal(trigger.Type)

Happy to provide reproducible steps if requested.

Could you please clarify, in case we're missing something obvious?

/cc @yasinterol

@JorTurFer
Member

Hello,
You will see only the external metrics, as they are the metrics managed by KEDA. For CPU/Memory, KEDA registers the HPA with a pod metric (that's why you need the k8s metrics server for them). This means that KEDA doesn't have any knowledge of them, so their values and errors won't be exposed.

Maybe we should improve the docs about the metrics to clarify that CPU and memory will never be included in those metrics that imply current values or errors (active implies the value too).
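
As a rough sketch of the rule described above, the following splits a ScaledObject's trigger types into those handled through the k8s metrics server and those for which the keda_scaler_* series could be expected. This is not KEDA code. The function names are hypothetical, and the assumption that exactly `cpu` and `memory` fall into the first group is taken from this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// isResourceTrigger reports whether a trigger type is served by the k8s
// metrics server rather than by KEDA. Assumption from this thread: only
// "cpu" and "memory" behave this way.
func isResourceTrigger(triggerType string) bool {
	switch strings.ToLower(triggerType) {
	case "cpu", "memory":
		return true
	}
	return false
}

// triggersWithScalerMetrics returns the trigger types for which
// keda_scaler_* series could be expected, preserving input order.
func triggersWithScalerMetrics(triggerTypes []string) []string {
	var out []string
	for _, t := range triggerTypes {
		if !isResourceTrigger(t) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// A mixed ScaledObject: only the external trigger is reported.
	fmt.Println(triggersWithScalerMetrics([]string{"cpu", "memory", "prometheus"}))

	// A CPU-only ScaledObject: nothing is left, so no keda_scaler_* series.
	fmt.Println(len(triggersWithScalerMetrics([]string{"cpu"})))
}
```

This matches both observations reported earlier: with mixed triggers only the non-CPU/memory ones appear, and with a single CPU trigger nothing appears at all.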

Projects
Status: Proposed
Development

No branches or pull requests

7 participants