Provide operational insights on # of triggers per trigger type #3663
Comments
Would like to hear some thoughts on this from @JorTurFer @zroubalik
I think we can add these metrics about adoption. It's true that from a monitoring point of view they may not give a lot of information, but from an adoption point of view they do. I mean, as a cluster operator I could want to know the aggregated counts of trigger types to apply some improvements, e.g. if I only have 1 Azure scaler, maybe it doesn't make sense to use managed identities. In general, I don't have any problem with improving the metrics we offer because they are private inside your cluster and we don't collect them, so each admin is able to collect them or not.
I can work on implementing this.
Reposting a query from Slack - The value for this metric might change every time a …
I'd expose them on the operator instead of the metrics adapter, as it's not related to the metrics adapter at all. So I'd introduce a new endpoint there.
Well, if we're okay with exposing a new endpoint on the operator, then it shouldn't be a big task (I think?). Should the metric be named …
Don't expose another endpoint in the operator please, we are already exposing a single endpoint with the runtime metrics (operator-sdk does it). Recently we have talked about unifying them in the metrics server, so let's do it directly in this case.
Basically, instead of starting another server, you should register the metrics with the already existing server via the Prometheus global registry.
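For illustration, a minimal sketch of that approach, assuming controller-runtime's global registry (which is served on the operator's existing metrics endpoint). The package, variable, and helper names below are placeholders rather than existing KEDA code, and the final metric name/prefix is still open in this discussion:

```go
package prommetrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"sigs.k8s.io/controller-runtime/pkg/metrics"
)

// Gauge vector with one "type" label per trigger type (name is a placeholder).
var triggerTotalsGaugeVec = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Namespace: "keda",
		Name:      "trigger_totals",
		Help:      "Total number of triggers per trigger type",
	},
	[]string{"type"},
)

func init() {
	// Registering with controller-runtime's global registry makes the metric
	// appear on the operator's existing /metrics endpoint; no new server needed.
	metrics.Registry.MustRegister(triggerTotalsGaugeVec)
}

// Hypothetical helpers the ScaledObject/ScaledJob reconcile loops could call
// when triggers are added or removed, so the gauge reflects the current
// count per trigger type.
func IncrementTriggerTotal(triggerType string) {
	if triggerType != "" {
		triggerTotalsGaugeVec.WithLabelValues(triggerType).Inc()
	}
}

func DecrementTriggerTotal(triggerType string) {
	if triggerType != "" {
		triggerTotalsGaugeVec.WithLabelValues(triggerType).Dec()
	}
}
```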
The link mentions this -
But the reconcile loop is running within a separate pod, so this won't work in our case? Or am I misunderstanding something from the docs?
You asked about adding this metric to the operator; there you have reconciliation loops xD
Oh, so you're suggesting that we should reuse the Prometheus endpoint used for the runtime metrics in the operator? I just got confused and thought that you wanted to not expose the metrics in the operator and expose them in the adapter itself.
Agreed, sorry, I was more referring to exposing them on the operator rather than introducing a new endpoint.
Lost in translation xD
Each one of us misunderstood the others 😄
Proposal
Based on the current docs, we provide the following metrics today:
- `keda_metrics_adapter_scaler_error_totals` - The total number of errors encountered for all scalers.
- `keda_metrics_adapter_scaled_object_error_totals` - The number of errors that have occurred for each scaled object.
- `keda_metrics_adapter_scaler_errors` - The number of errors that have occurred for each scaler.
- `keda_metrics_adapter_scaler_metrics_value` - The current value for each scaler's metric that would be used by the HPA in computing the target average.

Based on these, I don't see a good fit to extend them with suitable labels. Another reason is that ScaledObjects/ScaledJobs with multiple triggers of the same type would not be reflected correctly.
That's why I propose to introduce a `keda_metrics_adapter_trigger_totals` metric with a `type` label that is the type of trigger being used, for example `cron`.
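For illustration, the exposed metric could look roughly like this in Prometheus exposition format (the trigger types and values shown are made up):

```
# HELP keda_metrics_adapter_trigger_totals Total number of triggers per trigger type
# TYPE keda_metrics_adapter_trigger_totals gauge
keda_metrics_adapter_trigger_totals{type="cron"} 2
keda_metrics_adapter_trigger_totals{type="kafka"} 5
```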
Use-Case
Gain insights on how many triggers are using a given trigger type.
This helps you better understand your autoscaling landscape and its dependencies.
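As a hypothetical example, once the metric is exposed, a cluster operator could answer that question with a simple PromQL query (using the metric name proposed above):

```promql
# Number of triggers per trigger type across the cluster
sum by (type) (keda_metrics_adapter_trigger_totals)
```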
Anything else?
No response