[RAC] Alerts as Data Bulk Insert #93730

spong · 2021-03-05T02:57:30Z

This issue is for discussing the architecture/implementation for writing out Alerts as Data.

Relevant implementations from within detections include:

And a plethora of other utilities and helpers found in:
kibana/x-pack/plugins/security_solution/server/lib/detection_engine/signals/

The above implementations are spread out because they cover many complex use cases, like creating alerts for aggregations, EQL sequences, alerts on alerts, and logic for preventing the creation of duplicate alerts (among others :). Some patterns have started forming as we've been adding new rule types and features over the last year, but until now we haven't had the opportunity to abstract further and clean up the control flow.

As discussed with @mikecote, @sqren and @tsg, a library that exposes a hook for alert creation, with each rule supplying its own implementation of that hook (and calling into other library utils like exceptions, deduplication, etc), would go a long way toward extending the initial implementation within Detections.
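To make the hook idea concrete, here is a minimal sketch of what such a library boundary could look like. Every name in it (`AlertDoc`, `AlertUtils`, `CreateAlertsHook`, `writeAlerts`, `dedupeById`, `thresholdHook`) is hypothetical and chosen for illustration; none of them is an existing Kibana API, and the real design may differ.

```typescript
// Hypothetical sketch: every name here is illustrative, not an existing Kibana API.
interface AlertDoc {
  id: string;
  ruleId: string;
  fields: Record<string, unknown>;
}

// Shared utilities the library would expose to every rule type.
interface AlertUtils {
  dedupe(alerts: AlertDoc[]): AlertDoc[];
  applyExceptions(alerts: AlertDoc[]): AlertDoc[];
}

// The hook: each rule type turns its own result shape into alerts to write.
type CreateAlertsHook<TResult> = (
  result: TResult,
  ruleId: string,
  utils: AlertUtils
) => AlertDoc[];

type BulkInsert = (alerts: AlertDoc[]) => Promise<void>;

// Library entry point: runs the rule's hook, then bulk inserts what it returns.
async function writeAlerts<TResult>(
  result: TResult,
  ruleId: string,
  hook: CreateAlertsHook<TResult>,
  utils: AlertUtils,
  bulkInsert: BulkInsert
): Promise<number> {
  const alerts = hook(result, ruleId, utils);
  if (alerts.length > 0) {
    await bulkInsert(alerts);
  }
  return alerts.length;
}

// Example util: keep only the first alert for each id within a batch.
function dedupeById(alerts: AlertDoc[]): AlertDoc[] {
  const seen = new Set<string>();
  return alerts.filter((alert) => {
    if (seen.has(alert.id)) return false;
    seen.add(alert.id);
    return true;
  });
}

// Example hook for an imagined threshold rule: one alert per host.
const thresholdHook: CreateAlertsHook<{ host: string; cpu: number }[]> = (
  hits,
  ruleId,
  utils
) =>
  utils.dedupe(
    utils.applyExceptions(
      hits.map((hit) => ({ id: `${ruleId}:${hit.host}`, ruleId, fields: { ...hit } }))
    )
  );
```

In this shape the library owns the write path, while each rule type decides which utilities to call and in what order.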

Open questions

How will the existing paradigm of alert/alert instances port over to alerts as data? For example, in the o11y use case, a CPU Threshold Exceeded rule may write an alert, then later update that alert if it is triggered again. Will we follow this same paradigm, or are alerts immutable except for assignment/status, as in the security use case? If the latter, how will what is now many alerts for one incident be displayed in the triage workflow, since we don't support groupings/aggregations? Two possible options: each new trigger creates a new alert with aggregate information from its predecessors and automatically closes them, so there's only ever one active alert; or each trigger writes a building block alert, and the shell alert is mutable.
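The difference between the two paradigms can be illustrated with a small in-memory model. This is only a sketch: the `Alert` shape, its field names, and the functions `triggerMutable` and `triggerImmutable` are assumptions made for illustration, not a proposed schema or API.

```typescript
// Hypothetical model: field names are illustrative, not a proposed schema.
interface Alert {
  id: string;
  ruleId: string;
  status: 'open' | 'closed';
  triggerCount: number;
  lastTriggeredAt: string;
}

// Mutable paradigm: a repeat trigger updates the existing open alert in place.
function triggerMutable(store: Alert[], ruleId: string, at: string): Alert[] {
  const open = store.find((a) => a.ruleId === ruleId && a.status === 'open');
  if (open === undefined) {
    const first: Alert = {
      id: `${ruleId}-1`,
      ruleId,
      status: 'open',
      triggerCount: 1,
      lastTriggeredAt: at,
    };
    return [...store, first];
  }
  return store.map((a): Alert =>
    a === open ? { ...a, triggerCount: a.triggerCount + 1, lastTriggeredAt: at } : a
  );
}

// Immutable paradigm (only status changes): each trigger writes a new alert that
// carries aggregate information and closes its predecessor, so one stays open.
function triggerImmutable(store: Alert[], ruleId: string, at: string): Alert[] {
  const previous = store.filter((a) => a.ruleId === ruleId);
  const open = previous.find((a) => a.status === 'open');
  const next: Alert = {
    id: `${ruleId}-${previous.length + 1}`,
    ruleId,
    status: 'open',
    triggerCount: (open?.triggerCount ?? 0) + 1,
    lastTriggeredAt: at,
  };
  const closed = store.map((a): Alert =>
    a.ruleId === ruleId && a.status === 'open' ? { ...a, status: 'closed' } : a
  );
  return [...closed, next];
}
```

In the first function the document count stays constant per incident; in the second the history is preserved as closed documents and the open alert carries the running aggregate.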


elasticmachine · 2021-03-05T02:57:31Z

Pinging @elastic/security-detections-response (Team:Detections and Resp)

elasticmachine · 2021-03-05T02:57:31Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

elasticmachine · 2021-03-05T02:57:32Z

Pinging @elastic/security-solution (Team: SecuritySolution)

pmuellr · 2021-03-05T16:44:41Z

This may not be relevant, but thought I'd mention it.

For the current event log, we queue up events to write, in bulk, async. However, the public interface is a simple synchronous logEvent(event) style call that returns nothing. We buffer over time and space (1 sec or 100 events, whichever happens first). We also try to get these written out during a "normal" Kibana shutdown (plugin shutdown is awaited by the platform, with a timeout!), since otherwise there's a good chance we'd lose the last few records left in the buffer. As of 7.11 or so, you should start seeing BOTH the event log startup event AND the event log shutdown event written to the event log. We've always been writing the event log shutdown event, but didn't handle the actual "write the last buffer during shutdown" until then, so before that it was never written.

We've always treated the event log data as "not critical" - for the cases we query it in alerting (to show alert details), we handle cases where data is missing for some reason. Beyond bugs / timing issues / etc, that reason could be that the data got ILM'd away in a delete phase. I'm guessing we want to make the "alerts as data" a little more "critical" :-), but not quite sure what that means, because if we buffer the data, and then Kibana crashes with an OOM or SIGSEGV, you're going to lose that last set of buffered data.

All that code (there's not much) to deal with the event log buffering is here: https://github.com/elastic/kibana/blob/master/x-pack/plugins/event_log/server/es/cluster_client_adapter.ts#L52-L106
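As a rough illustration of the buffering described in this comment, here is a minimal sketch of a writer that flushes after 100 documents or 1 second, whichever comes first, and drains the buffer on shutdown. It is not the event_log code linked above: `BufferedBulkWriter`, `BulkWriter`, `log`, and `shutdown` are hypothetical names, and swallowing write errors is an assumption that matches the "not critical" treatment mentioned earlier.

```typescript
// Hypothetical sketch: names are illustrative, not the event_log plugin's API.
type BulkWriter<T> = (docs: T[]) => Promise<void>;

class BufferedBulkWriter<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined;
  private pending: Promise<void> = Promise.resolve();

  constructor(
    private readonly bulkWrite: BulkWriter<T>,
    private readonly maxDocs = 100,
    private readonly maxDelayMs = 1000
  ) {}

  // Synchronous, fire-and-forget: callers never wait on the bulk write.
  log(doc: T): void {
    this.buffer.push(doc);
    if (this.buffer.length >= this.maxDocs) {
      this.flush();
    } else if (this.timer === undefined) {
      // Start the time-based flush when the first doc enters an empty buffer.
      this.timer = setTimeout(() => this.flush(), this.maxDelayMs);
    }
  }

  // Hand the current buffer to the bulk writer and start a fresh one.
  private flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.buffer.length === 0) return;
    const docs = this.buffer;
    this.buffer = [];
    // Assumption: write failures are swallowed because the data is "not critical".
    const write = this.bulkWrite(docs).catch(() => undefined);
    this.pending = this.pending.then(() => write);
  }

  // Call from the plugin's stop hook so the last partial buffer is written.
  async shutdown(): Promise<void> {
    this.flush();
    await this.pending;
  }
}
```

Anything still in `buffer` when the process dies from an OOM or SIGSEGV is lost, which is the trade-off the comment points out for data that needs to be more "critical".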

**Needed for:** rule execution log for Security #94143

**Related to:**
- alerts-as-data: #93728, #93729, #93730
- RFC for index naming #98912

## Summary

This PR adds a mechanism for writing to / reading from / bootstrapping indices for RAC project into the `rule_registry` plugin. Particularly, indices for alerts-as-data and rule execution events.

This implementation is similar to existing implementations like `event_log` plugin (see #98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one in `rule_registry` itself.

In this PR I tried to incorporate most of the feedback received in the RFC (#98912), but if you notice I missed/forgot something, please let me know in the comments.

Done in this PR:
- [x] Schema-agnostic APIs for working with Elasticsearch.
- [x] Schema-aware log definition and bootstrapping API (creating hierarchical logs).
- [x] Schema-aware write API (logging events).
- [x] Schema-aware read API (searching logs, filtering, sorting, pagination, aggregation).
- [x] Support for Kibana spaces, space-aware index bootstrapping (either at rule creation or rule execution time).

As for reviewing this PR, perhaps it might be easier to start with:
- checking description of #98912
- checking usage examples https://github.com/elastic/kibana/pull/98353/files#diff-c049ff2198cc69bd50a69e92d29e88da7e10b9a152bdaceaf3d41826e712c12b
- checking public api https://github.com/elastic/kibana/pull/98353/files#diff-8e9ef0dbcbc60b1861d492a03865b2ae76a56ec38ada61898c991d3a74bd6268

## Next steps

Next steps towards rule execution log in Security (#94143):
- define actual schema for rule execution events
- inject instance of rule execution log into Security rule executors and route handlers
- implement actual execution logging in rule executors
- update route handlers to start fetching execution events and metrics from the log instead of custom saved objects

Next steps in the context of RAC and unified implementation:
- converge this implementation with `RuleDataService` implementation
- implement robust index bootstrapping
- reconsider using FieldMap as a generic type parameter
- implement validation for documents being indexed
- cover the final implementation with tests
- write comprehensive docs: update plugin README, add JSDoc comments to all public interfaces


peluja1012 · 2021-09-14T23:26:54Z

Closing in favor of Alerts as Data RFC doc.

This was referenced Mar 9, 2021

[RAC] Alerts as Data Schema Definition #93728

Closed

[RAC][Alerts as Data] Adds RAC plugin #94663

Closed

peluja1012 assigned spong Mar 23, 2021

spong mentioned this issue Mar 30, 2021

[RAC] Alerts as Data Meta #95736

Closed

7 tasks

banderror mentioned this issue May 11, 2021

[RAC] Rule monitoring: Event Log for Rule Registry #98353

Merged

8 tasks

peluja1012 closed this as completed Sep 14, 2021

kobelb added the needs-team Issues missing a team label label Jan 31, 2022

botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
