Addressing orphaned alerts situations #164059

Closed
1 of 2 tasks
shanisagiv1 opened this issue Aug 16, 2023 · 5 comments
Labels
Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

shanisagiv1 commented Aug 16, 2023

To address #159462, we have decided to split the effort into two phases. This approach also addresses the #157256 request.

  • 8.11, first phase:

    • Support a manual bulk “untracked” action, which lets users change “active” alerts to “untracked”, with the following consequences:
      • The alert table is cleaned of orphaned active alerts (the main gap)
      • A new table filter is supported for untracked alerts
      • Alert actions won’t be triggered when alerts are marked as “untracked”. We’ll notify the user about this behavior with a UI note, since
        target systems won’t be updated. (I think this makes sense: since the status is called “untracked”, users won’t expect recovery
        actions to be triggered.)
    • When a rule is deleted or disabled, we will automatically mark its alerts as “untracked” as well (again, without triggering actions).
  • 8.11+, 2nd phase (Trigger action or not by user configuration)

    • Support another manual bulk action for “alert recovery”.
      Unlike “untracked”, this manual action will trigger the alert actions.
    • Optional: we’ll let users define in the global settings whether actions should be sent for “auto untracked” alerts
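
The two phases above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `Alert` shape and the `bulkUntrack`, `untrackForRule`, `bulkRecover`, and `byStatus` names are hypothetical and do not correspond to Kibana's actual API. It shows the one behavioral difference the proposal hinges on: untracking never reports actions to trigger, while the proposed phase 2 recovery does.

```typescript
// Hypothetical model of the proposal; names are illustrative, not Kibana's API.
type AlertStatus = 'active' | 'recovered' | 'untracked';

interface Alert {
  uuid: string;
  ruleId: string;
  status: AlertStatus;
}

interface TransitionResult {
  alerts: Alert[];
  // UUIDs of the alerts whose actions should be triggered by this transition.
  actionsTriggeredFor: string[];
}

// Phase 1: mark the selected active alerts as untracked. No actions are triggered.
function bulkUntrack(alerts: Alert[], selectedUuids: string[]): TransitionResult {
  const updated = alerts.map((a): Alert =>
    a.status === 'active' && selectedUuids.indexOf(a.uuid) !== -1
      ? { ...a, status: 'untracked' }
      : a
  );
  return { alerts: updated, actionsTriggeredFor: [] };
}

// Phase 1: disabling or deleting a rule untracks that rule's active alerts.
function untrackForRule(alerts: Alert[], ruleId: string): TransitionResult {
  const uuids = alerts.filter((a) => a.ruleId === ruleId).map((a) => a.uuid);
  return bulkUntrack(alerts, uuids);
}

// Phase 2 (proposed): manual recovery, which does trigger actions.
function bulkRecover(alerts: Alert[], selectedUuids: string[]): TransitionResult {
  const actionsTriggeredFor: string[] = [];
  const updated = alerts.map((a): Alert => {
    if (a.status === 'active' && selectedUuids.indexOf(a.uuid) !== -1) {
      actionsTriggeredFor.push(a.uuid);
      return { ...a, status: 'recovered' };
    }
    return a;
  });
  return { alerts: updated, actionsTriggeredFor };
}

// Table filter: keep only the alerts with the given status.
function byStatus(alerts: Alert[], status: AlertStatus): Alert[] {
  return alerts.filter((a) => a.status === status);
}
```

Only alerts that are currently “active” change state in this model; alerts that have already recovered are left alone.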
@botelastic botelastic bot added the needs-team Issues missing a team label label Aug 16, 2023
@shanisagiv1 shanisagiv1 added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) and removed needs-team Issues missing a team label labels Aug 16, 2023
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

XavierM commented Aug 16, 2023

Thank you @shanisagiv1, that's exactly what the team talked about.

@XavierM XavierM moved this from Awaiting Triage to Todo in AppEx: ResponseOps - Rules & Alerts Management Aug 21, 2023
Ikuni17 pushed a commit that referenced this issue Sep 27, 2023
## Summary
Part of #164059 

Implements the `Untracked` lifecycle status, and applies it to alerts
when their corresponding rule is disabled.

<img width="1034" alt="Screenshot 2023-08-24 at 4 24 45 PM"
src="https://github.com/elastic/kibana/assets/1445834/4d31545d-9fc0-4eb3-9972-72685107184d">
<img width="904" alt="Screenshot 2023-08-24 at 4 56 32 PM"
src="https://github.com/elastic/kibana/assets/1445834/3d7cfa19-5aca-4148-a9bc-d0d0c949d84b">
<img width="820" alt="Screenshot 2023-08-24 at 4 56 17 PM"
src="https://github.com/elastic/kibana/assets/1445834/e59870c8-4140-4588-893a-f3f54170f78a">


### Checklist

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
XavierM added a commit that referenced this issue Oct 4, 2023
## Summary

Part of #164059

<img width="301" alt="Screenshot 2023-09-28 at 5 38 45 PM"
src="https://github.com/elastic/kibana/assets/1445834/1b9ae224-7dad-43d7-a930-adf9458e1613">
<img width="486" alt="Screenshot 2023-09-28 at 5 38 11 PM"
src="https://github.com/elastic/kibana/assets/1445834/82eeec3d-af2c-4257-b78e-99aea5a6b66f">

This PR:

- Moves the `setAlertStatusToUntracked` function from the `AlertsClient`
into the `AlertsService`. This function doesn't actually need any Rule
IDs to do what it's supposed to do, only indices and Alert UUIDs.
Therefore, we want to make it possible to use outside of a created
`AlertsClient`, which requires a Rule to initialize.
- Creates a versioned internal API to bulk untrack a given set of
`alertUuids` present on `indices`. Both of these pieces of information
are readily available from the ECS fields sent to the alert table
component, from where this bulk action will be called.
- Switches the `setAlertStatusToUntracked` query to look for alert UUIDs
instead of alert instance IDs.
#164788 dealt with untracking
alerts that were bound to a single rule at a time, but this PR could be
untracking alerts generated by many different rules at once. Multiple
rules may generate the same alert instance ID names with different
UUIDs, so using UUID increases the specificity and prevents us from
untracking alert instances that the user didn't intend.
- Adds a `bulkUpdateState` method to the task scheduler.
#164788 modified the `bulkDisable`
method to clear untracked alerts from task states, but this new method
allows us to untrack a given set of alert instances without disabling
the task that generated them.
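
As a rough illustration of the UUID-based query described above, the following sketch builds an update-by-query request for a set of `indices` and `alertUuids`. It is not the PR's implementation: `buildUntrackRequest` and `UntrackRequest` are hypothetical names, the field names `kibana.alert.uuid` and `kibana.alert.status` and the flat document layout are assumptions, and the function only constructs a plain object rather than calling any Elasticsearch client.

```typescript
// Assumed field names and status value, for illustration only.
const ALERT_UUID = 'kibana.alert.uuid';
const ALERT_STATUS = 'kibana.alert.status';
const ALERT_STATUS_ACTIVE = 'active';
const ALERT_STATUS_UNTRACKED = 'untracked';

// Hypothetical shape of an update-by-query request: target indices plus a
// body holding the query and the update script.
interface UntrackRequest {
  index: string[];
  body: {
    query: { bool: { must: Array<Record<string, unknown>> } };
    script: { source: string; lang: string };
  };
}

// Build a request that untracks only active alerts whose UUID was selected.
// No rule ID is needed: the indices and UUIDs identify the documents.
function buildUntrackRequest(indices: string[], alertUuids: string[]): UntrackRequest {
  return {
    index: indices,
    body: {
      query: {
        bool: {
          must: [
            { term: { [ALERT_STATUS]: ALERT_STATUS_ACTIVE } },
            { terms: { [ALERT_UUID]: alertUuids } },
          ],
        },
      },
      script: {
        // Assumes the status is stored under a flat key in _source.
        source: `ctx._source['${ALERT_STATUS}'] = '${ALERT_STATUS_UNTRACKED}'`,
        lang: 'painless',
      },
    },
  };
}
```

Because the `terms` clause matches on UUID, two rules that both produce an alert instance ID such as `host-1` cannot be confused with each other in this request.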

#### Why omit rule ID from this API?

The rule ID is technically readily available from the alert table, but
it becomes redundant when we already have immediate access to the alert
document's index. #164788 used the
rule ID to get the `ruleTypeId` and turn this into a corresponding
index, which we don't have to do anymore.

Furthermore, it helps to omit the rule ID from the `updateByQuery`
request, because the user can easily select alerts that were generated
by a wide variety of different rules, and untrack them all at once. We
could include the rule ID in a separate `should` query, but this adds
needless complexity to the query.

We do need to know the rule ID after performing `updateByQuery`, because
it corresponds to the task state we want to modify, but it's easier to
retrieve this using the same query params provided.
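
The reasoning in this section can be sketched with an in-memory stand-in for the alert indices. This is an assumption-laden illustration, not the PR's code: `AlertDoc` and `untrackByUuid` are hypothetical, and the loop plays the role of both the `updateByQuery` call and the follow-up lookup that reuses the same query params. It matches on UUID only, then groups the affected alert instance IDs by rule ID, which is the information needed to update each rule's task state.

```typescript
// Hypothetical alert document; only the fields needed for this sketch.
interface AlertDoc {
  uuid: string;
  instanceId: string;
  ruleId: string;
  status: 'active' | 'recovered' | 'untracked';
}

// Untrack active alerts matching the given UUIDs (mutating the documents in
// place, as an update would) and return, per rule ID, the alert instance IDs
// that were affected. The rule ID is an output here, never an input.
function untrackByUuid(docs: AlertDoc[], alertUuids: string[]): Record<string, string[]> {
  const affectedByRule: Record<string, string[]> = {};
  for (const doc of docs) {
    if (doc.status === 'active' && alertUuids.indexOf(doc.uuid) !== -1) {
      doc.status = 'untracked';
      const instanceIds = affectedByRule[doc.ruleId] || [];
      instanceIds.push(doc.instanceId);
      affectedByRule[doc.ruleId] = instanceIds;
    }
  }
  return affectedByRule;
}

// Two rules produce the same instance ID "host-1" with different UUIDs.
const docs: AlertDoc[] = [
  { uuid: 'uuid-1', instanceId: 'host-1', ruleId: 'rule-a', status: 'active' },
  { uuid: 'uuid-2', instanceId: 'host-1', ruleId: 'rule-b', status: 'active' },
  { uuid: 'uuid-3', instanceId: 'host-2', ruleId: 'rule-b', status: 'active' },
];

// The user selects alerts from two different rules in one bulk action.
const affected = untrackByUuid(docs, ['uuid-1', 'uuid-3']);
// affected is { 'rule-a': ['host-1'], 'rule-b': ['host-2'] };
// rule-b's "host-1" alert (uuid-2) stays active.
```

Matching on instance ID instead would also have untracked `uuid-2`, which the user did not select.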

### Checklist

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Jiawei Wu <jiawei.wu@cmd.com>
Co-authored-by: Xavier Mouligneau <xavier.mouligneau@elastic.co>
@XavierM XavierM moved this from Todo to Up for grabs in AppEx: ResponseOps - Rules & Alerts Management Oct 4, 2023
@nerophon
Contributor

@shanisagiv1 On this topic, an enterprise user has said the following:

Sometimes when we delete the rule associated with an alert, we want to be able to deal with them afterwards, but usually we've decided that the alerts that rule generates are no longer required, and we'd rather not have them in our alerts overview forever alongside current or upcoming but untracked alerts.

Given this, we should probably ensure there is both a way to recover alerts and a way to permanently remove them if that is desired.

Zacqary commented Jan 30, 2024

@nerophon Noted this in #175916

Zacqary commented Feb 6, 2024

Closing in favor of #175916; we will track Phase 2 there.
