Filter Prometheus Alerts by regexp is limited, add feature to query by labels #385
Comments
Should we close this? If there is extra work to do, please mention it :)
Hi @evrardjp, thanks for coming back to this.
What do you think of this?
As you can guess from the delay in my answers, I am not really available to work on this. I am not really familiar with Prometheus internals, but I think it's worth investigating and documenting WHY we want to change or NOT change. For example: is the /alerts endpoint better? Why? Should we have a sidecar model instead, or should we keep this in code? I agree on making things uniform, but I'm not sure what you are proposing here.
I can imagine. I'd like to work on it, but this has to fit in between work and kids. I have to say I don't know the internals of Prometheus either. I'm learning as I go, and along the way I'm seeing things we can use, like the Alerts endpoint. In the end we'd like to filter by labels, regardless of endpoint. By the uniform way, I meant the way Prometheus is queried: Interesting point, how would you set up a sidecar model for this issue?
The blocker tools could be custom-made (and we can provide a few defaults). They would apply a label on nodes, a label we can watch.
This issue was automatically considered stale due to lack of activity. Please update it and/or join our slack channels to promote it, before it automatically closes (in 7 days).
Can this one please be reopened?
This issue was automatically considered stale due to lack of activity. Please update it and/or join our slack channels to promote it, before it automatically closes (in 7 days).
We'd like to query Prometheus for certain active alerts to block nodes from being rebooted. This functionality is already present, and it's great to have it in place. However, filtering by regex is too limited for our use case.
The situation is that we have workloads of different priorities firing the same alerts. That is:
If teams are experimenting with their setup and it fires an alert, let's say 'PodCrashing', we don't want Kured to be blocked by this alert. But we do want to block Kured when this alert fires on the workloads of the kubernetes-infra team (as this hits all teams).
Rather than using different alert names per priority, it would be great if we could query Prometheus for active alerts by 'labels'. This way we could use labels like "severity":"critical","team":"kubernetes-infra" or something along these lines.
Because we'd like to block Kured only in specific situations, I think a match on the label query should block Kured from rebooting, which is the opposite of what the regex filter does.
I'm not sure if this would be a good feature to introduce, but I've been working on implementing this in code (with a bunch of unit tests) and am now starting to test it on KiND. I'll link this to PR #386.
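To make the proposed semantics concrete, here is a minimal sketch of label-based blocking. It is not the code from the PR. The `Alert` type and the `matchesLabels` and `blockingAlerts` helpers are hypothetical names invented for illustration, and the alerts are hard-coded rather than fetched from Prometheus. The sketch also assumes that an empty label selector should block nothing.

```go
package main

import "fmt"

// Alert is a hypothetical, simplified view of an active Prometheus alert.
type Alert struct {
	Name   string
	Labels map[string]string
}

// matchesLabels reports whether the alert carries every key/value pair in want.
func matchesLabels(a Alert, want map[string]string) bool {
	for k, v := range want {
		if got, ok := a.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// blockingAlerts returns the names of alerts that match all wanted labels.
// Assumption: an empty selector blocks nothing, so it returns nil.
func blockingAlerts(alerts []Alert, want map[string]string) []string {
	if len(want) == 0 {
		return nil
	}
	var names []string
	for _, a := range alerts {
		if matchesLabels(a, want) {
			names = append(names, a.Name)
		}
	}
	return names
}

func main() {
	// The same alert name fires for two teams with different labels.
	alerts := []Alert{
		{Name: "PodCrashing", Labels: map[string]string{"severity": "critical", "team": "kubernetes-infra"}},
		{Name: "PodCrashing", Labels: map[string]string{"severity": "critical", "team": "experiments"}},
	}
	want := map[string]string{"severity": "critical", "team": "kubernetes-infra"}

	blocking := blockingAlerts(alerts, want)
	fmt.Println("reboot blocked:", len(blocking) > 0)
}
```

With this selector, only the kubernetes-infra alert blocks the reboot. The experimenting team's alert of the same name is ignored, which a filter on alert names alone cannot express.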