Ability to throttle alert instances until action group changes #50077
Comments
Pinging @elastic/kibana-stack-services (Team:Stack Services) |
Instead of using "throttle" as a terminology, this could be a hook for firing actions only when entering an action group. This could also be used by maps for "entering" and "leaving" a geofenced area instead of notifying (w/ a throttle) when someone is "within" or "outside" a geofenced area. |
mentioned here (#81631) as well, but this would be very useful for a planned "containment"-option of the geo-threshold alert (#80749). cc @kmartastic @aaronjcaldwell |
@mdefazio Are there mockups for this? |
If I'm reading this correctly, there seem to be two different suggestions:
I think #2 makes the most sense to me, but I wanted to make sure I've read this correctly. |
I'm thinking option 2 as well. At least we can add option 1 on top of the solution in the future if we prefer both. I would be curious of what @arisonl thinks as well. |
For some reason, I thought we were just going to change to always do this. Throttle would "break" once the action group changes. If you want "throttle even if alert group changes", I think we were leaving that to the not-yet-implemented "snooze" setting (a timed mute). Probably worth putting a doc together on all the things we're thinking about here, which would include current behavior, things like "snooze", and scheduled mutes. I'm a little worried about the combinatorial explosion of all these potential throttle/mute settings ... |
Good idea @pmuellr This is how I understand the current behavior (please correct me if I'm wrong).
With #82274, we can now assign actions to different action groups, but that doesn't change any of the above behavior. With #82412, we will be able to specify conditions per action group, but it's TBD whether changes to the above behavior will be included with that issue.
|
I believe changing action groups will reset the throttle window and fire actions as soon as the action group changes. |
Gotcha. So if we add a switch or checkbox with this issue, the behavior would be:
|
That's correct based on how I recall / envisioned it.
I wonder if we can use this meta issue: #67597. You are right, there will be lots of combinations in the future. |
I do not have mocks for this yet. But the behavior you describe seems to make sense to me. |
@ymao1 The following two comments capture the user requirements on when to trigger (assuming unmuted) and the perceived state changes from a user perspective:
Your latest comment (#50077 (comment)) is consistent with these requirements, assuming we have the required set of transitions (action groups) in place. In addition:
If |
@arisonl You are correct: if a throttle is defined, action(s) will fire when (the alert instance meets the alert criteria AND the throttle time has elapsed since the last fired action) OR (the action group has changed) |
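A minimal sketch of that firing condition, for illustration only. The `InstanceState` shape and the `shouldFireActions` helper are hypothetical names, not the actual Kibana alerting code, and the sketch assumes a `null` throttle means "no throttling" as described in the issue.

```typescript
// Hypothetical shapes for illustration; not the real Kibana alerting types.
interface InstanceState {
  lastFiredAt?: number; // epoch ms when actions last fired, if ever
  lastActionGroup?: string; // action group in effect when actions last fired
}

function shouldFireActions(
  state: InstanceState,
  currentActionGroup: string,
  meetsCriteria: boolean,
  throttleMs: number | null, // null is assumed to mean "no throttle"
  now: number
): boolean {
  // The action group changed since the last time actions fired.
  const groupChanged =
    state.lastActionGroup !== undefined &&
    state.lastActionGroup !== currentActionGroup;

  // The throttle window is over (or there is no window to wait for).
  const throttleElapsed =
    throttleMs === null ||
    state.lastFiredAt === undefined ||
    now - state.lastFiredAt >= throttleMs;

  // (criteria met AND throttle elapsed) OR (action group changed)
  return (meetsCriteria && throttleElapsed) || groupChanged;
}
```

With a 5 minute throttle, an instance that stays in the same group is suppressed until the window passes, while a group change fires immediately.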
I'm wondering if the terminology should be I also wanted to clarify, "state" === "action group" so when mentioning state change, it means changing action groups. I'm ok if the technical terms change to "action group" to avoid further confusion. |
@mikecote Thank you. I think that clears up a point of possible confusion that @pmuellr brought up: what is the |
Just a note that my thoughts on the schedule-based (ie, cron) silencing of alerts is that it's all muting, not throttling. Feels like schedule-based throttling is going to become a bit of a mind-bender to customers, probably confusing if they do or don't see notifications they weren't expecting, then have to reason backwards as to why. Muting is more of a boolean, seems like a less complex thing to reason about. Select a "calendar entry" of some kind, mute it, now you know - you will NOT get notifications during that period. It's a simplification anyway :-) |
cc @arisonl it would be worth figuring out the default behaviour. Do users want to be "notified every" by default or just when the action group changes? I can see where the throttle is opt-in but there's also that weird use case of not throttling and always notify.. |
So the unchecked case in the mockup ^^^ is to throttle only on action group changes, I guess? And there's no way to get the old behavior of It seems like this is going to have to end up being a three-state selector, somehow:
Interesting that if we do it like that, you could "toggle" Alternatively, |
Correct, if it was labeled "Re-notify every", I think it would be more clear. Unchecked = don't re-notify.
Yeah, it inherits the previous problem where it's not clear null means notify all the time (not sure if others share the same). In theory null is the same as having the same value as the interval, maybe more clear to change it to that or something?
This is where I wish we had telemetry, or maybe we could compare the default behaviour of other alerting systems to see if they re-notify by default or just the first time. It's mostly to make sure we have the proper default between these 3 options and that the user understands / expects the default behaviour.
Another option could be to present 3 radio buttons and come up with clear labels for the 3 options the user has for how they want to be notified. |
I like the idea of explicitly presenting the 3 options somehow (with some additional descriptive words) instead of relying on the user to figure out what combination of checked/unchecked state and throttle input gets the behavior they want. The 3 radio buttons with clear labels would work. Or maybe a dropdown with 3 values:
and when the third option is selected, the throttle inputs show up. |
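One way to picture the three-option selector is a small model where the throttle inputs only apply to the interval option. This is a sketch under assumptions: the option names (`onActionGroupChange`, `everyCheck`, `everyInterval`) and the helpers are made up for illustration and are not the labels or values settled on in this thread.

```typescript
// Hypothetical option names; the thread does not fix the final labels or values.
type NotifyOption = "onActionGroupChange" | "everyCheck" | "everyInterval";

interface NotifySetting {
  option: NotifyOption;
  throttle: string | null; // e.g. "5m"; only meaningful for "everyInterval"
}

// The throttle inputs are shown only when the interval option is selected.
function showThrottleInputs(option: NotifyOption): boolean {
  return option === "everyInterval";
}

// Build a setting, dropping any throttle value that the option does not use.
function toSetting(option: NotifyOption, throttle?: string): NotifySetting {
  if (option === "everyInterval") {
    if (!throttle) {
      throw new Error("a throttle value is required for this option");
    }
    return { option, throttle };
  }
  return { option, throttle: null };
}
```

Modelling the choice as a single three-valued field avoids asking the user to infer behavior from a checkbox plus an empty or filled throttle input.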
I have started some designs for this, but I will try these new approaches as well. Knowing the right defaults is definitely an important piece of this. And how often users will want to modify it (Is the check interval more important and throttling can remain at our defaults most of the time? In which case, is interval exposed and throttling is behind an additional click?) |
I'm happy changing the default, probably to "notify on action group changes". Both no throttle and throttle on a different interval always seemed kinda clunky to me, probably not what you want most of the time. |
I've updated the mockups after some additional help from @ymao1 : I think we should also move this to the bottom of the create flyout since this now takes up quite a bit of space. (Hopefully the typical defaults would be good enough for most and they won't have to mess with this). |
Those views look good to me.
That's not new :-). It's always been possible to set illogical throttle values. It's not clear how much validation we should be doing on these, I sort of feel like we should have some visualization or better "explainer" for all the time-related params/config things. That would be the alert interval, throttle, the mythical "snooze", and any alert-specific values like index threshold's Perhaps what we should do in lieu of "validation" is present a warning/hint for known illogical values, like if the throttle is less than the alert interval. In any case, worthy of another issue, we don't need to "solve" this problem here.
I'm leery of separating the alert interval from the throttle - so if the suggestion is we move both, then +1. Otherwise, if interval is at the top, and throttle is at the bottom, it doesn't feel right. |
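The warning/hint idea for a throttle shorter than the alert interval could look roughly like the sketch below. It assumes durations are written like `5m` (a number followed by one of `s`, `m`, `h`, `d`); `parseDuration` and `throttleHint` are hypothetical helpers written for this example, not existing Kibana functions.

```typescript
// Assumed duration units for this sketch: seconds, minutes, hours, days.
const UNIT_MS: Record<string, number> = {
  s: 1000,
  m: 60000,
  h: 3600000,
  d: 86400000,
};

// Convert a duration such as "5m" into milliseconds.
function parseDuration(value: string): number {
  const match = /^(\d+)([smhd])$/.exec(value);
  if (!match) {
    throw new Error(`unrecognized duration: ${value}`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Return a hint for a known illogical combination, or null when there is none.
function throttleHint(interval: string, throttle: string | null): string | null {
  if (throttle === null) {
    return null; // no throttle configured, nothing to compare
  }
  if (parseDuration(throttle) < parseDuration(interval)) {
    return `Throttle (${throttle}) is shorter than the check interval (${interval}).`;
  }
  return null;
}
```

Returning a hint instead of rejecting the value matches the suggestion to warn rather than validate.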
Yes, apologies for the confusion. My thought was that the whole block shown in the screenshots would move to the bottom |
Perfect - I think we discussed that in a previous design meeting, and seemed good to me. |
…t and side car notifications (Part 1) (#109722)

## Summary

Removes the "side car" actions object and side car notification (Part 1). Part 1 makes it so that newly created rules and editing existing rules will update them to using the new side car notifications. Part 2 in a follow up PR will be the migrations to move the existing data.

The saved object side we are removing usages of is:

```
siem-detection-engine-rule-actions
```

The alerting side car notification system we are removing is:

```
siem.notifications
```

* Removes the notification files and types
* Adds transform to and from alerting concepts of `notifyWhen` and our `throttle`
* Adds unit tests for utilities and pure functions created
* Updates unit tests to have more needed jest mock
* Adds business rules and logic for the different states of `notifyWhen`, and `throttle` on each of the REST routes to determine when we should `muteAll` vs. not muting using secondary API call from client alerting
* Adds e2e tests for the throttle conditions and how they are to interact with the kibana-alerting `throttle` and `notifyWhen`

A behavioral change under the hood is that we now support the state changes of `muteAll` from the UI/UX of [stack management](https://www.elastic.co/guide/en/kibana/master/create-and-manage-rules.html#controlling-rules). Whenever the `security_solution` ["Perform no actions"](https://www.elastic.co/guide/en/security/current/rules-api-create.html) is selected we do a `muteAll`. However, we do not change the state if all individual actions are muted within the rule. Instead we only maintain the state of `muteAll`:

<img width="2299" alt="ui_state_change" src="https://user-images.githubusercontent.com/1151048/130823045-48a9f34b-db23-44e3-b9ed-cbbb57edc3d6.png">
<img width="1163" alt="no_actions_state_change" src="https://user-images.githubusercontent.com/1151048/130823056-3f8953fa-9433-4973-a2d3-6e11263b9619.png">

Ref:

* Issue and PR where notifyWhen was added to kibana-alerting
  * #82969
  * #50077

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
This feature request would be to stop being reminded of an alert after it fires. Currently users can disable throttling (`throttle: null`), which causes alerts to fire actions every time, or they can throttle alerts for a certain period of time (ex: `throttle: 5m`).

When throttling for a time window, it is considered a "remind me every X minutes" style feature. This request would be to "stop reminding me about this until the action group changes".
For example, an alert that constantly fires the `default` action group would stop firing actions until the group changes to something other than `default`. This would be the same as only firing actions when the action group changes. This would be useful for alerts that execute actions that are non-idempotent.

Could this feature instead be configured at the alert type level? This would mean that when defining an alert type, you could configure the throttling behaviour.
cc @mdefazio to work on mock ups for this.
An early thought: add a checkbox next to "Notify every" in the UI. When unchecked, alert instances are throttled until the state changes.