add prometheus-metrics for keda-operator #3098

Closed
bamboo12366 opened this issue Jun 1, 2022 · 3 comments
Labels
feature-request (All issues for new features that have not been committed to), needs-discussion

Comments

@bamboo12366
Contributor

Proposal

We already use KEDA in our live environment. The keda-operator is missing metrics, which makes it hard for me to debug problems.
Maybe we should consider adding some metrics for it? Currently we only use ScaledObject (SO), not TriggerAuthentication or ScaledJob. What I have implemented in my environment for the SO controller is:

  • Total number of ScaledObjects per namespace
  • The rate and error rate of HPA updates
  • The rate of queries that check whether an SO is active

Use-Case

No response

Anything else?

No response

@tomkerkhove
Member

We are already tracking some in #2638, #2637 & #2639.

We are open for contributions on those.

  • The rate and error rate of HPA updates
  • The rate of queries that check whether an SO is active

Can you elaborate on these metrics and their use-case please?

@bamboo12366
Contributor Author

The rate and error rate of HPA updates

Our API server is constantly overloaded. Updates to the HPA sometimes fail or are retried, and the HPA resource is critical for me, so I monitor these updates to check that everything is working.

The rate of queries that check whether an SO is active

I have around 1k SOs in the cluster. After creating the HPA resource, one of the main jobs of the SO controller is to keep checking whether each SO is active, which puts load on other services (in our case, Prometheus or a KV system). Monitoring the query rate and error rate of this polling would make it easy for me to adjust the polling interval and timeout settings.

@zroubalik
Member

Yeah, those are all valid cases, IMHO. Though it would be better to close this issue and track the request (together with the use cases) in the existing issue, please.
