Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detection Engine] adds alert suppression to Indicator Match rule #174241

Merged
merged 56 commits into from
Feb 5, 2024

Conversation

vitaliidm
Copy link
Contributor

@vitaliidm vitaliidm commented Jan 4, 2024

Summary

Alert suppression for these rule types is hidden behind feature branch

In this PR implemented:

  • schema changes: allowing alert_suppression object in Indicator match rule type. alert_suppression is identical to existing one for query rule
  • UI changes
  • Cypress tests
  • BE implementation
  • FTR tests

Enabling feature flags

  • alertSuppressionForIndicatorMatchRuleEnabled

Tech implementation details

Alert candidates for IM rule deduplicated first, by searching in existing alerts matched ids.
Once retrieved, alert candidates filtered further, to determine whether they been already suppressed.
It's done by checking each alert candidate suppression time boundaries. If suppression ends earlier than existing alert suppression with the same instance id, alert candidate is removed.

The rest of alert candidates are getting suppressed in memory and either new alerts created or existing updated.
The max limit of created and suppressed alerts is set to 5 * max_signals, which would allow to capture additional threats, should rule execution's alerts number reach max_signals

UI changes

Suppression components in IM rule are identical to Custom Query's

localhost_5601_kbn_app_security_rules_create (2)

UI changes

Checklist

  • Functional changes are hidden behind a feature flag

    Feature flag alertSuppressionForThresholdRuleEnabled

  • Functional changes are covered with a test plan and automated tests.

Test plan PR

    alert_suppression:
      $ref: './common_attributes.schema.yaml#/components/schemas/AlertSuppression'

where

    AlertSuppression:
      type: object
      properties:
        group_by:
          $ref: '#/components/schemas/AlertSuppressionGroupBy'
        duration:
          $ref: '#/components/schemas/AlertSuppressionDuration'
        missing_fields_strategy:
          $ref: '#/components/schemas/AlertSuppressionMissingFieldsStrategy'
      required:
        - group_by

elastic/security-docs#4715

## Summary

Summarize your PR. If it involves visual changes include a screenshot or
gif.


### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space.
| Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions
when multiple Kibana nodes are polling for the same tasks. | High | Low
| Tasks are idempotent, so executing them multiple times will not result
in logical error, but will degrade performance. To test for this case we
add plenty of unit tests around this logic and document manual testing
procedure. |
| Code should gracefully handle cases when feature X or plugin Y are
disabled. | Medium | High | Unit tests will verify that any feature flag
or plugin combination still results in our service operational. |
| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
## Summary

Summarize your PR. If it involves visual changes include a screenshot or
gif.


### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space.
| Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions
when multiple Kibana nodes are polling for the same tasks. | High | Low
| Tasks are idempotent, so executing them multiple times will not result
in logical error, but will degrade performance. To test for this case we
add plenty of unit tests around this logic and document manual testing
procedure. |
| Code should gracefully handle cases when feature X or plugin Y are
disabled. | Medium | High | Unit tests will verify that any feature flag
or plugin combination still results in our service operational. |
| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
@vitaliidm vitaliidm self-assigned this Jan 4, 2024
## Summary

Summarize your PR. If it involves visual changes include a screenshot or
gif.


### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space.
| Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions
when multiple Kibana nodes are polling for the same tasks. | High | Low
| Tasks are idempotent, so executing them multiple times will not result
in logical error, but will degrade performance. To test for this case we
add plenty of unit tests around this logic and document manual testing
procedure. |
| Code should gracefully handle cases when feature X or plugin Y are
disabled. | Medium | High | Unit tests will verify that any feature flag
or plugin combination still results in our service operational. |
| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
…sts fro IM suppression (#174344)

## Summary

Adds tests by test plan

**Scenario: Create rule with per rule execution suppression**

Given rule create page
When user select rule type
And user adds suppress fields
And user selects on rule execution suppression only
And user saves rule
Then on rule details page suppress by fields should be displayed
And suppression on rule execution should be rendered

**Scenario: Create rule with time interval suppression**

Given rule create page
When user select rule type
And user adds suppress fields
And user selects time interval suppression option
And user selects do not suppress missing fields
And user saves rule
Then on rule details page suppress by fields should be displayed
And time interval suppression should be displayed

**Scenario: Edit rule with suppression**

Given rule configured with suppression on rule interval
When user edits rule
And changes suppression from time interval to rule execution suppression
Then on rule details page suppress by fields should be displayed
And per rule execution suppression should be displayed

**Scenario: Edit rule without suppression**

Given rule without suppression
When user edits rule
And user adds suppress fields
And user selects time interval suppression option
And user saves rule
Then on rule details page suppress by fields should be displayed
And time interval suppression should be displayed

**Scenario: Rule details with suppression**

Given rule configured with suppression time interval
When user views rule details page
Then on rule details page suppress by fields should be displayed
And time interval suppression should be displayed

### **License tests**

**Scenario: Create rule with rule execution suppression on basic license
(ESS only)**

Given rule create page
When user select rule type
Then user sees suppression options disabled
And upselling message displayed

**Scenario: Create rule with rule execution suppression on Essentials
tier (Serverless only)**

Given rule create page
When user select rule type
And user adds suppress fields
And user selects on rule execution suppression only
And user saves rule
Then on rule details page suppress by fields should be displayed
And suppression on rule execution should be rendered

**Scenario: Rule details with suppression on basic license  (ESS only)**

Given rule configured with suppression time interval
When user views rule details page
Then on rule details page suppress by fields should be displayed
And time interval suppression should be displayed
And upselling warning that suppression is not applied should be rendered
on details labels
## Summary

Summarize your PR. If it involves visual changes include a screenshot or
gif.


### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space.
| Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions
when multiple Kibana nodes are polling for the same tasks. | High | Low
| Tasks are idempotent, so executing them multiple times will not result
in logical error, but will degrade performance. To test for this case we
add plenty of unit tests around this logic and document manual testing
procedure. |
| Code should gracefully handle cases when feature X or plugin Y are
disabled. | Medium | High | Unit tests will verify that any feature flag
or plugin combination still results in our service operational. |
| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
@vitaliidm vitaliidm added Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Engine Security Solution Detection Engine Area 8.13 candidate release_note:feature Makes this part of the condensed release notes labels Jan 12, 2024
vitaliidm and others added 3 commits January 16, 2024 17:19
## Summary

Summarize your PR. If it involves visual changes include a screenshot or
gif.


### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space.
| Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions
when multiple Kibana nodes are polling for the same tasks. | High | Low
| Tasks are idempotent, so executing them multiple times will not result
in logical error, but will degrade performance. To test for this case we
add plenty of unit tests around this logic and document manual testing
procedure. |
| Code should gracefully handle cases when feature X or plugin Y are
disabled. | Medium | High | Unit tests will verify that any feature flag
or plugin combination still results in our service operational. |
| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
@vitaliidm vitaliidm changed the title [Security Solution][Detection Engine] adds alert suppression to IM and EQL rules [Security Solution][Detection Engine] adds alert suppression to Indicator Match rule Jan 17, 2024
Copy link
Member

@MadameSheema MadameSheema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-engineering-productivity changes LGTM. Thanks!! :)

Copy link
Contributor

@e40pud e40pud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I left a few minor comments and questions. I will need a bit more time to do local testing, but in general code LGTM

suppressedAlerts.push(alert);
return false;
} else {
idsMap[instanceId] = { count: suppressionDocsCount, suppressionEnd };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just trying to understand what instanceId == null means and in what case that can happen? Also, what does that mean in terms of updating count and suppressionEnd values?

Based on lines 177-184, we would leave such alerts as it is, meaning we are not going to do suppression for them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the list of alerts get sorted first by supression start https://github.com/elastic/kibana/pull/174241/files#diff-fa438b70a9910ed86557d3e5bc51ed29dfefbcf746d86d8fd08c203ab7251f2eR155

Afterwards, we are traversing array and suppressing alerts based on instance_id.
when particular instance_id is not yet in map(instanceId == null), that is the first alert, with the smallest suppression_end(it is equals to suppression_start at this moment). We take it as a basis, value of suppression_end and docs_count.
In next iteration, if we encounter alert with the same instanceId, we increase docs_count(means we count that alert as suppressed) and changing suppression_end, so the correct interval of suppression is recorded

@@ -257,7 +406,7 @@ export const createPersistenceRuleTypeWrapper: CreatePersistenceRuleTypeWrapper
},
{
terms: {
[ALERT_INSTANCE_ID]: alerts.map(
[ALERT_INSTANCE_ID]: filteredDuplicates.map(
(alert) => alert._source['kibana.alert.instance.id']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use ALERT_INSTANCE_ID here as well?

* alerts returned from BE have date type coerced to ISO strings
*/
export type BackendAlertWithSuppressionFields870<T> = Omit<
AlertWithSuppressionFields870<T>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we use 8.13. version here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@e40pud , thanks for review. Have addressed the feedback

there is an explanation that was there from the initial implementation.
I moved it to the redefined type, so it is more visible
e4239b6#diff-fa438b70a9910ed86557d3e5bc51ed29dfefbcf746d86d8fd08c203ab7251f2eL439-L441

enrichedEvents,
toReturn,
}) => {
// max signals for suppression includes suppressed and created alerts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function could be more reusable if it just deals with the suppression part and lets the caller handle alert creation. The utility currently takes many parameters for searchAfterAndBulkCreateFactory, which is also called in the search_after_bulk_create.ts.

By including only the suppression-specific parameters, we can use this utility in various executors. Additionally, certain parts of this utility's code can be shared with the search_after_bulk_create.

Copy link
Contributor Author

@vitaliidm vitaliidm Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppression logic is tightly interconnected with alert creation.
In this executor bulkCreate is called to create non suppressed alerts with missing fields, and alerts are suppressed and created in bulkCreateWithSuppression. Then there is a logic specific to alert truncation, that can't be handled out of scope of alert creation.
So, code in this executor is specific for alert suppression only. All common logic between searchAfterAndBulkCreateSuppressedAlerts and searchAfterAndBulkCreate is implemented within searchAfterAndBulkCreateFactory. While, suppression or non-suppression alert creation handled in corresponding executor.

In future, executor might be extracted as a separate utility, when it will be needed only for suppression/creation, without exhaustive search. For example, for EQL case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WafaaNasr it sounds like this could be very helpful to allow us to better unit test and resuse the functionality. Could we create a ticket to track this improvement and perhaps include it with the EQL alert suppression work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Vitalii and Yara!
We talked about this matter and agreed that Vitalli will focus on extracting the logic here to make it reusable for other rule types in a different PR

vitaliidm and others added 3 commits February 2, 2024 15:19
Co-authored-by: Ievgen Sorokopud <ievgen.sorokopud@elastic.co>
Co-authored-by: Ievgen Sorokopud <ievgen.sorokopud@elastic.co>
@vitaliidm vitaliidm requested a review from e40pud February 2, 2024 15:39
@vitaliidm vitaliidm requested a review from WafaaNasr February 2, 2024 16:41
Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResponseOps changes LGTM

Copy link
Contributor

@e40pud e40pud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and did not find any issues. LGTM!

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
securitySolution 4951 4952 +1

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
ruleRegistry 239 243 +4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 11.4MB 11.4MB +1.7KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
securitySolution 69.7KB 69.9KB +135.0B
Unknown metric groups

API count

id before after diff
ruleRegistry 268 272 +4

References to deprecated APIs

id before after diff
securitySolution 519 521 +2

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @vitaliidm

Copy link
Contributor

@WafaaNasr WafaaNasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
Thanks Vitalii for addressing the comments!

@vitaliidm vitaliidm merged commit 6d5a485 into main Feb 5, 2024
37 checks passed
@vitaliidm vitaliidm deleted the security/alert-suppression-im-eql branch February 5, 2024 15:54
@kibanamachine kibanamachine added v8.13.0 backport:skip This commit does not require backporting labels Feb 5, 2024
fkanout pushed a commit to fkanout/kibana that referenced this pull request Feb 7, 2024
…ator Match rule (elastic#174241)

## Summary

- addresses elastic/security-team#7773 Epic
- addresses elastic/security-team#8360

Alert suppression for these rule types is hidden behind feature branch

In this PR implemented:

- schema changes: allowing `alert_suppression` object in Indicator match
rule type. `alert_suppression` is identical to existing one for query
rule
- UI changes
- Cypress tests
- BE implementation
- FTR tests

Enabling feature flags

 - `alertSuppressionForIndicatorMatchRuleEnabled`


### Tech implementation details

Alert candidates for IM rule deduplicated first, by searching in
existing alerts matched ids.
Once retrieved, alert candidates filtered further, to determine whether
they been already suppressed.
It's done by checking each alert candidate suppression time boundaries.
If suppression ends earlier than existing alert suppression with the
same instance id, alert candidate is removed.

The rest of alert candidates are getting suppressed in memory and either
new alerts created or existing updated.
The max limit of created and suppressed alerts is set to `5 *
max_signals`, which would allow to capture additional threats, should
rule execution's alerts number reach max_signals

### UI changes

Suppression components in IM rule are identical to Custom Query's

![localhost_5601_kbn_app_security_rules_create
(2)](https://github.com/elastic/kibana/assets/92328789/b79db59a-9369-4ef0-af4a-c0ca0bb3d25d)


### UI changes

### Checklist
- [x] Functional changes are hidden behind a feature flag 

  Feature flag `alertSuppressionForThresholdRuleEnabled`

- [x] Functional changes are covered with a test plan and automated
tests.

[Test plan PR](elastic/security-team#8390)

- [x] Stability of new and changed tests is verified using the [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner).

[FTR ESS & Serverless tests]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4972
[Cypress ESS]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4970
[Cypress Serverless]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4971


- [ ] Comprehensive manual testing is done by two engineers: the PR
author and one of the PR reviewers. Changes are tested in both ESS and
Serverless.

- [x] Mapping changes are accompanied by a technical design document. It
can be a GitHub issue or an RFC explaining the changes. The design
document is shared with and approved by the appropriate teams and
individual stakeholders.

Existing AlertSuppression schema field is used for IM rule, the one that
used for Query rule.

```yml
    alert_suppression:
      $ref: './common_attributes.schema.yaml#/components/schemas/AlertSuppression'
```
where

```yml
    AlertSuppression:
      type: object
      properties:
        group_by:
          $ref: '#/components/schemas/AlertSuppressionGroupBy'
        duration:
          $ref: '#/components/schemas/AlertSuppressionDuration'
        missing_fields_strategy:
          $ref: '#/components/schemas/AlertSuppressionMissingFieldsStrategy'
      required:
        - group_by
   ```

- [x]  Functional changes are communicated to the Docs team. A ticket or PR is opened in https://github.com/elastic/security-docs. The following information is included: any feature flags used, affected environments (Serverless, ESS, or both).

elastic/security-docs#4715

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Wafaa Nasr <wafaa.nasr@elastic.co>
Co-authored-by: Ievgen Sorokopud <ievgen.sorokopud@elastic.co>
vitaliidm added a commit that referenced this pull request Feb 8, 2024
…order of search to asc (#176321)

## Summary

Sets search of documents for IM rule type from `desc` to `asc` when
suppression is enabled.
Also would allow to fix corner cases around [alert
suppression](#174241).
Alert suppression in IM rule relies on correct suppression time
boundaries to correctly deduplicate earlier suppressed alerts. I.e, if
document start suppression time(document timestamp) falls within
suppression boundaries, it means, alert was already suppressed. So, we
can exclude it from suppression as already suppressed and not to count
it twice.

But because documents for IM rule are searched in reverse order, it is
possible, while processing a second page of results, to falsely count
alert as already suppressed and discard it from suppressed count. That's
because its timestamp is older than document's timestamp from the first
page.
Newly added test failed only for code execution path, when number of
events is greater than number of threats.
It is because, events are split in chunks by 9,000 first. So if reverse
order in that case would cause alert from next batches to be dropped as
already suppressed

Setting `asc` can potentially affect IM rule performance, when events
need to be searched first and rule is configured with the large
look-back time. That's why new order is set to tech preview alert
suppression feature only

---------

Co-authored-by: Ryland Herrick <ryalnd@gmail.com>
CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this pull request Feb 15, 2024
…ator Match rule (elastic#174241)

## Summary

- addresses elastic/security-team#7773 Epic
- addresses elastic/security-team#8360

Alert suppression for these rule types is hidden behind feature branch

In this PR implemented:

- schema changes: allowing `alert_suppression` object in Indicator match
rule type. `alert_suppression` is identical to existing one for query
rule
- UI changes
- Cypress tests
- BE implementation
- FTR tests

Enabling feature flags

 - `alertSuppressionForIndicatorMatchRuleEnabled`


### Tech implementation details

Alert candidates for IM rule deduplicated first, by searching in
existing alerts matched ids.
Once retrieved, alert candidates filtered further, to determine whether
they been already suppressed.
It's done by checking each alert candidate suppression time boundaries.
If suppression ends earlier than existing alert suppression with the
same instance id, alert candidate is removed.

The rest of alert candidates are getting suppressed in memory and either
new alerts created or existing updated.
The max limit of created and suppressed alerts is set to `5 *
max_signals`, which would allow to capture additional threats, should
rule execution's alerts number reach max_signals

### UI changes

Suppression components in IM rule are identical to Custom Query's

![localhost_5601_kbn_app_security_rules_create
(2)](https://github.com/elastic/kibana/assets/92328789/b79db59a-9369-4ef0-af4a-c0ca0bb3d25d)


### UI changes

### Checklist
- [x] Functional changes are hidden behind a feature flag 

  Feature flag `alertSuppressionForThresholdRuleEnabled`

- [x] Functional changes are covered with a test plan and automated
tests.

[Test plan PR](elastic/security-team#8390)

- [x] Stability of new and changed tests is verified using the [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner).

[FTR ESS & Serverless tests]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4972
[Cypress ESS]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4970
[Cypress Serverless]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4971


- [ ] Comprehensive manual testing is done by two engineers: the PR
author and one of the PR reviewers. Changes are tested in both ESS and
Serverless.

- [x] Mapping changes are accompanied by a technical design document. It
can be a GitHub issue or an RFC explaining the changes. The design
document is shared with and approved by the appropriate teams and
individual stakeholders.

Existing AlertSuppression schema field is used for IM rule, the one that
used for Query rule.

```yml
    alert_suppression:
      $ref: './common_attributes.schema.yaml#/components/schemas/AlertSuppression'
```
where

```yml
    AlertSuppression:
      type: object
      properties:
        group_by:
          $ref: '#/components/schemas/AlertSuppressionGroupBy'
        duration:
          $ref: '#/components/schemas/AlertSuppressionDuration'
        missing_fields_strategy:
          $ref: '#/components/schemas/AlertSuppressionMissingFieldsStrategy'
      required:
        - group_by
   ```

- [x]  Functional changes are communicated to the Docs team. A ticket or PR is opened in https://github.com/elastic/security-docs. The following information is included: any feature flags used, affected environments (Serverless, ESS, or both).

elastic/security-docs#4715

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Wafaa Nasr <wafaa.nasr@elastic.co>
Co-authored-by: Ievgen Sorokopud <ievgen.sorokopud@elastic.co>
CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this pull request Feb 15, 2024
…order of search to asc (elastic#176321)

## Summary

Sets search of documents for IM rule type from `desc` to `asc` when
suppression is enabled.
Also would allow to fix corner cases around [alert
suppression](elastic#174241).
Alert suppression in IM rule relies on correct suppression time
boundaries to correctly deduplicate earlier suppressed alerts. I.e, if
document start suppression time(document timestamp) falls within
suppression boundaries, it means, alert was already suppressed. So, we
can exclude it from suppression as already suppressed and not to count
it twice.

But because documents for IM rule are searched in reverse order, it is
possible, while processing a second page of results, to falsely count
alert as already suppressed and discard it from suppressed count. That's
because its timestamp is older than document's timestamp from the first
page.
Newly added test failed only for code execution path, when number of
events is greater than number of threats.
It is because, events are split in chunks by 9,000 first. So if reverse
order in that case would cause alert from next batches to be dropped as
already suppressed

Setting `asc` can potentially affect IM rule performance, when events
need to be searched first and rule is configured with the large
look-back time. That's why new order is set to tech preview alert
suppression feature only

---------

Co-authored-by: Ryland Herrick <ryalnd@gmail.com>
fkanout pushed a commit to fkanout/kibana that referenced this pull request Mar 4, 2024
…ator Match rule (elastic#174241)

## Summary

- addresses elastic/security-team#7773 Epic
- addresses elastic/security-team#8360

Alert suppression for these rule types is hidden behind feature branch

In this PR implemented:

- schema changes: allowing `alert_suppression` object in Indicator match
rule type. `alert_suppression` is identical to existing one for query
rule
- UI changes
- Cypress tests
- BE implementation
- FTR tests

Enabling feature flags

 - `alertSuppressionForIndicatorMatchRuleEnabled`


### Tech implementation details

Alert candidates for IM rule deduplicated first, by searching in
existing alerts matched ids.
Once retrieved, alert candidates filtered further, to determine whether
they been already suppressed.
It's done by checking each alert candidate suppression time boundaries.
If suppression ends earlier than existing alert suppression with the
same instance id, alert candidate is removed.

The rest of alert candidates are getting suppressed in memory and either
new alerts created or existing updated.
The max limit of created and suppressed alerts is set to `5 *
max_signals`, which would allow to capture additional threats, should
rule execution's alerts number reach max_signals

### UI changes

Suppression components in IM rule are identical to Custom Query's

![localhost_5601_kbn_app_security_rules_create
(2)](https://github.com/elastic/kibana/assets/92328789/b79db59a-9369-4ef0-af4a-c0ca0bb3d25d)


### UI changes

### Checklist
- [x] Functional changes are hidden behind a feature flag 

  Feature flag `alertSuppressionForThresholdRuleEnabled`

- [x] Functional changes are covered with a test plan and automated
tests.

[Test plan PR](elastic/security-team#8390)

- [x] Stability of new and changed tests is verified using the [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner).

[FTR ESS & Serverless tests]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4972
[Cypress ESS]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4970
[Cypress Serverless]
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/4971


- [ ] Comprehensive manual testing is done by two engineers: the PR
author and one of the PR reviewers. Changes are tested in both ESS and
Serverless.

- [x] Mapping changes are accompanied by a technical design document. It
can be a GitHub issue or an RFC explaining the changes. The design
document is shared with and approved by the appropriate teams and
individual stakeholders.

Existing AlertSuppression schema field is used for IM rule, the one that
used for Query rule.

```yml
    alert_suppression:
      $ref: './common_attributes.schema.yaml#/components/schemas/AlertSuppression'
```
where

```yml
    AlertSuppression:
      type: object
      properties:
        group_by:
          $ref: '#/components/schemas/AlertSuppressionGroupBy'
        duration:
          $ref: '#/components/schemas/AlertSuppressionDuration'
        missing_fields_strategy:
          $ref: '#/components/schemas/AlertSuppressionMissingFieldsStrategy'
      required:
        - group_by
   ```

- [x]  Functional changes are communicated to the Docs team. A ticket or PR is opened in https://github.com/elastic/security-docs. The following information is included: any feature flags used, affected environments (Serverless, ESS, or both).

elastic/security-docs#4715

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Wafaa Nasr <wafaa.nasr@elastic.co>
Co-authored-by: Ievgen Sorokopud <ievgen.sorokopud@elastic.co>
fkanout pushed a commit to fkanout/kibana that referenced this pull request Mar 4, 2024
…order of search to asc (elastic#176321)

## Summary

Sets search of documents for IM rule type from `desc` to `asc` when
suppression is enabled.
Also would allow to fix corner cases around [alert
suppression](elastic#174241).
Alert suppression in IM rule relies on correct suppression time
boundaries to correctly deduplicate earlier suppressed alerts. I.e, if
document start suppression time(document timestamp) falls within
suppression boundaries, it means, alert was already suppressed. So, we
can exclude it from suppression as already suppressed and not to count
it twice.

But because documents for IM rule are searched in reverse order, it is
possible, while processing a second page of results, to falsely count
alert as already suppressed and discard it from suppressed count. That's
because its timestamp is older than document's timestamp from the first
page.
Newly added test failed only for code execution path, when number of
events is greater than number of threats.
It is because, events are split in chunks by 9,000 first. So if reverse
order in that case would cause alert from next batches to be dropped as
already suppressed

Setting `asc` can potentially affect IM rule performance, when events
need to be searched first and rule is configured with the large
look-back time. That's why new order is set to tech preview alert
suppression feature only

---------

Co-authored-by: Ryland Herrick <ryalnd@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8.13 candidate backport:skip This commit does not require backporting release_note:feature Makes this part of the condensed release notes Team:Detection Engine Security Solution Detection Engine Area Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants