Silencing alerts in alertmanager should be ignored in kured #499
Comments
The main challenge is that Prometheus is not aware of silences made in Alertmanager. To make this work, kured would also have to query Alertmanager as part of its checks.
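To illustrate what such an Alertmanager check might involve, here is a minimal Go sketch. It assumes the Alertmanager v2 HTTP API (`GET /api/v2/alerts` with the `active`, `silenced` and `inhibited` query parameters) and reads only the `labels` and `status` fields of each returned alert. The names `amAlert`, `unsilencedNames` and `fetchUnsilenced` are hypothetical and are not part of kured.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// amAlert mirrors only the fields of an Alertmanager v2 alert used here.
type amAlert struct {
	Labels map[string]string `json:"labels"`
	Status struct {
		State      string   `json:"state"`
		SilencedBy []string `json:"silencedBy"`
	} `json:"status"`
}

// unsilencedNames (hypothetical helper) returns the alertname of every alert
// in the JSON payload that is active and not covered by a silence.
func unsilencedNames(body []byte) ([]string, error) {
	var alerts []amAlert
	if err := json.Unmarshal(body, &alerts); err != nil {
		return nil, err
	}
	names := []string{}
	for _, a := range alerts {
		if a.Status.State == "active" && len(a.Status.SilencedBy) == 0 {
			names = append(names, a.Labels["alertname"])
		}
	}
	return names, nil
}

// fetchUnsilenced (hypothetical helper) asks Alertmanager to leave out
// silenced and inhibited alerts, then filters the response once more locally.
func fetchUnsilenced(client *http.Client, baseURL string) ([]string, error) {
	resp, err := client.Get(baseURL + "/api/v2/alerts?active=true&silenced=false&inhibited=false")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("alertmanager returned %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return unsilencedNames(body)
}

func main() {
	// Sample payload instead of a live call, so the sketch runs offline.
	sample := []byte(`[
	  {"labels": {"alertname": "NodeDiskPressure"},
	   "status": {"state": "active", "silencedBy": []}},
	  {"labels": {"alertname": "KubeNodeNotReady"},
	   "status": {"state": "suppressed", "silencedBy": ["abc123"]}}
	]`)
	names, err := unsilencedNames(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("alerts that would still block a reboot:", names)
}
```

With a check like this, a silence created in the Alertmanager UI would take effect on the next poll, without any change to the kured configuration.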
This issue was automatically considered stale due to lack of activity. Please update it and/or join our slack channels to promote it, before it automatically closes (in 7 days).
re-opening this one - it would be helpful
@codestalkerr @justinrush Can you give some more information about what would be needed here and how this should behave?
Thinking through this more, I think we want something more like #385, but more generic. Ideally we can provide an arbitrary PromQL query: if it returns data, hold off on the reboot; if it's empty, we're good to go. I can create a new issue for this if it seems like something that would be acceptable to add.
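A rough Go sketch of that "has data means hold off" rule follows. It assumes the JSON body comes from Prometheus's instant-query endpoint (`/api/v1/query`) and that the query evaluates to an instant vector. The names `promResponse` and `shouldHoldReboot` are hypothetical and are not existing kured code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promResponse mirrors the parts of a Prometheus instant-query response
// that this sketch inspects.
type promResponse struct {
	Status string `json:"status"`
	Error  string `json:"error"`
	Data   struct {
		ResultType string          `json:"resultType"`
		Result     json.RawMessage `json:"result"`
	} `json:"data"`
}

// shouldHoldReboot (hypothetical helper) reports true when the query
// returned at least one sample. A caller would probably also want to
// hold the reboot whenever an error is returned.
func shouldHoldReboot(body []byte) (bool, error) {
	var r promResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return false, err
	}
	if r.Status != "success" {
		return false, fmt.Errorf("query failed: %s", r.Error)
	}
	if r.Data.ResultType != "vector" {
		return false, fmt.Errorf("expected an instant vector, got %q", r.Data.ResultType)
	}
	var samples []json.RawMessage
	if err := json.Unmarshal(r.Data.Result, &samples); err != nil {
		return false, err
	}
	return len(samples) > 0, nil
}

func main() {
	empty := []byte(`{"status":"success","data":{"resultType":"vector","result":[]}}`)
	busy := []byte(`{"status":"success","data":{"resultType":"vector",
	  "result":[{"metric":{"job":"backup"},"value":[1700000000,"1"]}]}}`)

	for _, body := range [][]byte{empty, busy} {
		hold, err := shouldHoldReboot(body)
		fmt.Println("hold reboot:", hold, "error:", err)
	}
}
```

The gate only looks at whether the result is empty, so the query itself carries all the policy, which is what would make the approach generic.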
Okay. Yes, please create a new issue for this 👍
@ckotzbauer My thinking behind this was to integrate with Alertmanager, because there are times when we silence a few alerts in Alertmanager, and it would be smooth if kured ignored those at the same time. Right now we create a PR to add the alert to the ignore list, and then another PR to remove it once we lift the silence. On a side note, do we have a filter to block only on specific alerts (the opposite of the ignore filter)? I'm asking because we have many alerts to ignore, and it would be nice to just block on the ones we want :)
@codestalkerr That pretty much sounds like a regexp negative lookahead. Would that be an option? Golang doesn't support them, but that would be a solvable problem.
@justinrush Would this also solve your use-case?
Maybe? But we don't always silence in Alertmanager; sometimes we'll just modify the label that routes the alert to dev/null rather than to a person. But I guess if we can get the label out of the alert in Alertmanager and then apply a negative regex to it, that would work?
I see the scenarios are different here. Getting the label out and applying a negative regex could work, but I think it will again come down to modifying the YAML file and committing changes, which is what I was trying to avoid. We have a GitOps approach: if we modify things manually, the next deploy will override them, and maintaining that would be crazy. But it feels like this may be a scenario specific to me?
Why is it not possible to consolidate the ways to remove/mark alerts? Are they too different to catch with one regex that does not have to be changed every time?
Yeah, so we have many different alerts, and we have put them in one regex, which is a huge one-liner separated by ORs. So let's say we have some temporary issue that we expect to last a few hours or a day or two: then we need to update that list, and we also silence in Alertmanager. The good thing about Alertmanager is that we can temporarily silence in the UI without making any code changes, and then we commit a change removing or adding the alert so that it applies to kured.
Hi. I can have a look.
Hi @atighineanu,
I've created a draft, but I need more input from you regarding kured itself. Is it okay to create several more flags?
@ckotzbauer, @dholbach any input?
I'll have a look in a few days or next week @atighineanu
It would be nice to have this set up so that when we silence some alerts in Alertmanager, those alerts are also ignored by kured.
It would take effect instantly and help us handle random alerts; we also wouldn't have to wait for a code change to be deployed before nodes can reboot.