feat: add alert rules for mso #94
@fpetkovski @sthaha PTAL
21e72e6 to 3221958
This looks good so far. Do we want to monitor additional metrics, e.g. queue depth?
Yes, I will add other rules as well. I created the initial PR with only one rule just to get an idea. Also, since we are not using jsonnet, how do we want to include other alerts, such as those monitoring prometheus-operator?
Yes, that's a good point. We should include PO alerts as well.
@fpetkovski I added a couple of alerts as well, PTAL. I can create a separate PR for Prometheus Operator.
Nice patch @slashpai !
Looks good overall. Some minor corrections
1535e2f to 33ad012
Thanks Sunil. I updated with changes, PTAL
This commit adds initial rule file for monitoring-stack-operator Signed-off-by: Jayapriya Pai janantha@redhat.com
nice!
@fpetkovski are we good to merge, or are any more changes required? :)
summary: Monitoring Stack Operator controller - {{ $labels.controller }} reconcilation takes too long to reconcile
expr: |
  rate(controller_runtime_reconcile_time_seconds_sum{job="monitoring-stack-operator"}[5m]) /
  rate(controller_runtime_reconcile_time_seconds_count{job="monitoring-stack-operator"}[5m])
What is a rate of a count in this case?
Ah ok so this is the average loop time, makes sense 👍
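For anyone else reading the thread, here is a rough sketch of why dividing the rate of `_sum` by the rate of `_count` gives an average reconcile duration. It is plain Python arithmetic on two counter samples, not how Prometheus computes `rate()` internally (the real function also handles counter resets and extrapolation). The function name and the sample values are made up for illustration.

```python
def average_reconcile_seconds(sum_start, sum_end, count_start, count_end, window_seconds):
    """Approximate rate(sum[w]) / rate(count[w]) from two counter samples.

    Hypothetical helper for illustration only; it ignores counter resets
    and the extrapolation that Prometheus applies in rate().
    """
    # Seconds spent reconciling per second of wall-clock time.
    sum_rate = (sum_end - sum_start) / window_seconds
    # Number of reconciles completed per second of wall-clock time.
    count_rate = (count_end - count_start) / window_seconds
    if count_rate == 0:
        # No reconciles finished in the window, so there is no average to report.
        return None
    # The window length cancels out: this is (total seconds) / (number of reconciles).
    return sum_rate / count_rate


# Made-up samples: over a 5m (300s) window the controller spent 30s
# reconciling across 60 reconciles, i.e. about 0.5s per reconcile.
avg = average_reconcile_seconds(120.0, 150.0, 400, 460, 300)
print(avg)
```

Since the window length cancels in the division, the result is the mean time per reconcile over the last 5 minutes, which is the value the alert threshold is compared against.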
This commit adds initial rule file for monitoring-stack-operator
Signed-off-by: Jayapriya Pai janantha@redhat.com
Description:
Add alert rules for controller, create service and servicemonitor