[Security Solution][Detections][Meta] Modularize the Detection Engine #93550

Closed
7 of 13 tasks
spong opened this issue Mar 4, 2021 · 6 comments
Labels
Feature:Detection Alerts Security Solution Detection Alerts Feature Meta refactoring Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Theme: rac label obsolete v7.13.0

Comments

@spong

spong commented Mar 4, 2021

This is the meta ticket for tracking the modularization of the Detection Engine. The items below are our first steps in supporting RAC (Rules/Alerts/Cases) everywhere, and all efforts are still open for discussion. 🙂

High level feature-sets

Exceptions

Within the main executor, exceptions logic is currently handled in one of three ways: it can be specific to certain rule types (e.g. createThreatSignals() and buildEqlSearchRequest()), added generically as an esFilter pre-query (threshold rules), or applied as a post-filter (e.g. filterEventsAgainstList() for ML rules).
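
To make the pre-query versus post-filter distinction concrete, here is a minimal sketch. The `ExceptionItem` and `SourceEvent` shapes and both function names are hypothetical simplifications for illustration, not the actual Kibana types or helpers; real exception items support more operators than a single field/value match.

```typescript
// Hypothetical, simplified shapes; not the Kibana types.
interface ExceptionItem {
  field: string;
  value: string;
}

type SourceEvent = Record<string, string>;

// Pre-query approach: translate exceptions into a bool filter that is
// attached to the search request, so excluded events are never returned.
function buildExceptionFilter(exceptions: ExceptionItem[]): object {
  return {
    bool: {
      must_not: exceptions.map((e) => ({ term: { [e.field]: e.value } })),
    },
  };
}

// Post-filter approach: run the search first, then drop any hit that
// matches at least one exception item.
function filterEventsAgainstExceptions(
  events: SourceEvent[],
  exceptions: ExceptionItem[],
): SourceEvent[] {
  return events.filter(
    (event) => !exceptions.some((e) => event[e.field] === e.value),
  );
}
```

A modular design could expose both behind one interface so each rule type only chooses which strategy applies.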

Alert De-duplication

The alert de-duplication logic currently lives in two places: within single_bulk_create, and within signal_rule_alert_type for EQL rules.
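
As an illustration of what a shared de-duplication step could look like once it is pulled out of both places, here is a minimal sketch. It assumes an alert can be identified by a key derived from the rule and the source document; the `CandidateAlert` shape, `alertKey`, and `dedupeAlerts` are hypothetical names and do not describe how single_bulk_create actually computes identity.

```typescript
// Hypothetical candidate alert; the real signal documents carry far more data.
interface CandidateAlert {
  ruleId: string;
  sourceIndex: string;
  sourceId: string;
}

// Assumption: rule id + source index + source id is enough to identify a
// duplicate. A real implementation might hash this and use it as the doc _id.
function alertKey(alert: CandidateAlert): string {
  return `${alert.ruleId}:${alert.sourceIndex}:${alert.sourceId}`;
}

// Drops candidates that were already written (existingKeys) and candidates
// that repeat within the same batch. Input order is preserved.
function dedupeAlerts(
  candidates: CandidateAlert[],
  existingKeys: Set<string>,
): CandidateAlert[] {
  const seen = new Set<string>();
  existingKeys.forEach((key) => seen.add(key));

  const fresh: CandidateAlert[] = [];
  for (const candidate of candidates) {
    const key = alertKey(candidate);
    if (!seen.has(key)) {
      seen.add(key);
      fresh.push(candidate);
    }
  }
  return fresh;
}
```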

Gap Detection Remediation

Gap detection currently lives within signal_rule_alert_type and is injected into each rule type's logic so that each can perform the desired searches over the calculated gaps.
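
The following sketch shows one simple way a gap could be calculated and turned into extra search ranges. It assumes a rule searches the window `[now - lookback, now]` on each run and that the previous run covered everything up to its start time; `findGap` and `searchRanges` are hypothetical names, and the real calculation in signal_rule_alert_type may differ.

```typescript
// All timestamps and durations are in epoch milliseconds.
interface TimeRange {
  from: number;
  to: number;
}

// Returns the uncovered range between the previous run and the start of the
// current lookback window, or null when the windows overlap or touch.
function findGap(
  previousStartedAt: number | null,
  now: number,
  lookbackMs: number,
): TimeRange | null {
  if (previousStartedAt === null) {
    return null; // first execution: nothing to remediate
  }
  const windowStart = now - lookbackMs;
  return windowStart > previousStartedAt
    ? { from: previousStartedAt, to: windowStart }
    : null;
}

// The ranges a rule type would search: the gap first (if any), then the
// regular window.
function searchRanges(
  previousStartedAt: number | null,
  now: number,
  lookbackMs: number,
): TimeRange[] {
  const current: TimeRange = { from: now - lookbackMs, to: now };
  const gap = findGap(previousStartedAt, now, lookbackMs);
  return gap ? [gap, current] : [current];
}
```

Passing ranges rather than gap-aware logic into each rule type would keep the executors unaware of how the gap was derived.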

Monitoring Efforts

Removal of side-car SO for Rule Execution monitoring in favor of leveraging the Alerting Event Log #94143


Task Breakdown

Potential additional efforts:

  • ...
  • Move eventsTelemetry logic out of searchAfterAndBulkCreate and up to top level so we get telemetry for all rule types (low priority)
Reference docs (internal):
@elasticmachine

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@marshallmain

Next refactoring steps:

  • finish work on Bulk create #99849
  • Allow buildSignalGroupFromSequence and buildSignalFromEvent to be passed in to EQL executor function in preparation for sharing with rule registry EQL implementation
    • buildSignalFromEvent and buildBulkBody are almost identical, can be merged into one function as part of the Remove near-duplicate functions (buildRule, bulkCreate) (TBD) task
  • Move gap remediation logic out of searchAfterAndBulkCreate (issue describing bug around maxSignals and gap detection with threat match rules)
  • Remove dependencies on signalRuleAlertType specific utilities
    • RuleStatusService
      • EQL, ML, Threshold have dependency
      • Warnings and errors can be returned from the executor functions and handled at the top-level signalRuleAlertType. The rule registry can use the new event log service to handle the returned warnings and errors.
    • Any other dependencies that won't exist in rule registry, buildRuleMessage maybe?
  • Move eventsTelemetry logic out of searchAfterAndBulkCreate and up to top level so we get telemetry for all rule types? I haven't worked with this before so not 100% clear on this.
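
The RuleStatusService item above suggests executors return warnings and errors instead of writing status themselves. A minimal sketch of that contract follows; `ExecutorResult`, `StatusWriter`, and `reportExecutorResult` are hypothetical names, and the sketch is synchronous for brevity where a real status write would be async.

```typescript
// Hypothetical result shape returned by every rule type executor.
interface ExecutorResult {
  createdSignalsCount: number;
  warnings: string[];
  errors: string[];
}

// Hypothetical status sink. The legacy path could back this with
// RuleStatusService, the rule registry path with an event log writer.
interface StatusWriter {
  warning(message: string): void;
  error(message: string): void;
  success(message: string): void;
}

// Top-level handling: errors take precedence over warnings, and success is
// reported only when the executor returned neither.
function reportExecutorResult(
  result: ExecutorResult,
  status: StatusWriter,
): void {
  if (result.errors.length > 0) {
    status.error(result.errors.join('; '));
    return;
  }
  if (result.warnings.length > 0) {
    status.warning(result.warnings.join('; '));
    return;
  }
  status.success(`created ${result.createdSignalsCount} signals`);
}
```

With this split the executors have no dependency on a status service, which is the property the rule registry implementation needs.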

@marshallmain

Additional step:

  • Threshold rules depend on querying existing threshold signals to do duplicate mitigation, and they use ruleParams.outputIndex as the index to query. This index will no longer exist in the RAC implementation, so we'll need to conditionally switch to querying the .alerts index and processing alerts in that index appropriately.
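
A minimal sketch of the conditional switch described above. The `racEnabled` flag, the `racAlertsIndex` argument, and the function name are assumptions for illustration; the issue does not say how the RAC path would be detected or what the concrete .alerts index name would be.

```typescript
// Hypothetical subset of the threshold rule params.
interface ThresholdRuleParams {
  outputIndex?: string;
}

// Picks the index to query for previously created threshold alerts.
// racEnabled and racAlertsIndex are assumed to be supplied by the caller.
function resolveThresholdHistoryIndex(
  params: ThresholdRuleParams,
  racEnabled: boolean,
  racAlertsIndex: string,
): string {
  if (racEnabled) {
    return racAlertsIndex;
  }
  if (!params.outputIndex) {
    throw new Error('outputIndex is required when RAC is disabled');
  }
  return params.outputIndex;
}
```

Choosing the index is only half of the change: alerts read from the new index would also need to be mapped from the new schema before the duplicate check runs.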

@madirey

madirey commented Jun 9, 2021

madirey commented Jun 26, 2021

Next steps:

  • Component template index mapping upgrades (PR)
  • Rule data client changes / consolidation @banderror @xcrzx
    • rule_registry bootstrapping issues / race conditions
      • enforce documents aren't written until component templates are applied
      • error handling? if index bootstrapping fails, we should not be able to write to indices (otherwise we end up with dynamic mappings)
    • allow template to be passed in to service to receive scoped client instance (issue)
    • Converge RuleDataClient and AlertDataClient to fetch and update alerts? / sync with RBAC
  • Complete a reference rule implementation end-to-end (Custom Query) (PR) @madirey
    • Write BaseSecurity rule type that will do
      • gap detection
      • privilege checks
      • fetch exceptions
      • fetch list client
      • type safety
      • RAC implementations of wrapHits (new schema), wrapSequenceHits (new schema), and bulkCreate (ruleDataClient)
    • Map existing fields onto RAC Rule schema / handle eventCategoryOverride and timestampOverride @madirey
    • RAC exceptions? BaseRuleType fetches exceptions and passes them into executors @madirey
    • Migrations
      • rule types
      • rule statuses? (TODO: maybe acceptable to drop this data)
    • Modify the Threshold Rule type to keep existing alerts state in alerts instance instead of querying alerts index. @madirey
    • Fix threshold cardinality bug (#95258: "[Security Solution][Detections][Threshold Rules] Filtering by Cardinality may miss alerts when bucket count is high") @madirey
    • Update Gap detection logic in Indicator Match rule (Deprioritized in 7.15 CTI planning)
    • Handle eventsTelemetry for Custom Query rule type
  • Research backwards compatibility with .siem-signals index (#100103: "[Security Solution][Detections] Migrating Detections to the new .alerts indices") @marshallmain (PR)
    • Do we rely on _source in any critical areas in the app? ((1) signals on signals / dupes, (2) Cases - mustache templating, (3) rule actions? -- edge case, but what's the impact?)
  • Rule execution log implementation (only use system indices, users can't build visualizations/dashboards) @xcrzx
    • Integrate with BaseSecurity rule type. Pass in generic status writer. Executors call generic status writer to write warning and errors during execution.
  • RBAC: Observability alerts on alerts concerns
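
The rule_registry bootstrapping bullets above ask that documents are not written until component templates are applied, and that writes are refused if bootstrapping fails. A minimal sketch of such a guard is below. `GuardedWriter` and its methods are hypothetical, and the in-memory array stands in for the real index; an actual client would more likely await a bootstrap promise than throw while pending.

```typescript
type BootstrapState = 'pending' | 'ready' | 'failed';

// Hypothetical write guard: documents are accepted only after the index
// resources (component templates, index template) were installed.
class GuardedWriter<T> {
  private state: BootstrapState = 'pending';
  private readonly written: T[] = [];

  // Called once templates are confirmed to be applied.
  markReady(): void {
    this.state = 'ready';
  }

  // Called when bootstrapping fails; later writes are refused so documents
  // never land in an index that would fall back to dynamic mappings.
  markFailed(): void {
    this.state = 'failed';
  }

  write(doc: T): void {
    if (this.state !== 'ready') {
      throw new Error(`refusing to write: index bootstrap is ${this.state}`);
    }
    this.written.push(doc);
  }

  count(): number {
    return this.written.length;
  }
}
```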
