Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Detection Engine][Rule Suppression] Add Suppression to EQL Non-sequence based queries #176422

Merged
merged 133 commits into from
Apr 17, 2024

Conversation

WafaaNasr
Copy link
Contributor

@WafaaNasr WafaaNasr commented Feb 7, 2024

Summary

Checklist

Related Issues

Screenshots/recordings

Non-Sequence Suppression

  1. Rule creation, Suppression based on a single value
EQL-NonseqRulecreation.mov
  1. Rule creation, Suppression based on an array of values
eqlsuppressionarray.mov
  1. Investigate In Timeline
Eql-investigateInTimeline.mov

Disabled Sequence Suppression

  1. UI
DisableSequenceEqlsuppression.mov

WafaaNasr and others added 5 commits February 7, 2024 14:39
…hema (#176391)

## Summary
- Address schema changes
- Add Tests in rule_converters to validate converting from/to snake case
to/from camel case
@WafaaNasr WafaaNasr added Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. release_note:feature Makes this part of the condensed release notes Team:Detection Engine Security Solution Detection Engine Area labels Feb 7, 2024
@WafaaNasr WafaaNasr self-assigned this Feb 7, 2024
@WafaaNasr
Copy link
Contributor Author

/ci

@WafaaNasr
Copy link
Contributor Author

/ci

@WafaaNasr
Copy link
Contributor Author

/ci

@vitaliidm
Copy link
Contributor

@rylnd

#176422 (comment)
I believe this issue is resolved, now that we've gone back to using a single, non-nested EuiToolTip. Can you please verify this, or provide steps to reproduce?

Can't reproduce it anymore. Thanks for fixing it.

  1. This code would be temporary, and removed once EQL sequence suppression is added
  2. Between the error being rendered on the form, the inability to save/proceed with the form, and the warning presented on the rule preview (as shown [Detection Engine][Rule Suppression] Add Suppression to EQL Non-sequence based queries #176422 (comment)), there's plenty of feedback about the invalid state
  3. There's no negative impact to previewing the rule in that state (since we already added the guards to the executor, where the warning is shown)
  4. The preview/warning is arguably better feedback/UX than a disabled rule preview.

I agree on point 1, it's temporary, so we might just leave it there as it is.
As for the rest, I don't think it gives any new information to user, since warning is essentially the same as error on form.
That rule preview results would be completely irrelevant to the actual rule run, since user won't be able to save that invalid configuration. If they need sequence EQL, they should then remove suppression options.

I'll leave it with you to decide how do you want to proceed here.

Copy link
Contributor

@banderror banderror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rule Management codeowners changes LGTM 👍

Reviewed the corresponding diffs, but didn't test the PR -- relying on the owners of the feature here.

Really appreciate all the details in the PR description -- it was helpful to get some context before reviewing the change. Thank you to whoever spent their time on this!

@rylnd Just curious what's the strategy for releasing this feature. Since there's a dedicated feature flag, does that mean that after merging this PR it's not going to go live in Serverless next week, and acceptance/exploratory testing will be done + writing the docs, all after the FF?

@@ -46,6 +46,7 @@ export default async function ({ readConfigFile }: FtrConfigProviderContext) {
'--xpack.ruleRegistry.unsafe.legacyMultiTenancy.enabled=true',
`--xpack.securitySolution.enableExperimental=${JSON.stringify([
'alertSuppressionForNewTermsRuleEnabled',
'alertSuppressionForNonSequenceEqlRuleEnabled',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the FF has been activated here, all the tests are going to be affected by it, not only the ones where the FF has specifically added at the top of the spec file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I'm aware, this is still the only way to run a cypress test locally with a feature flag enabled (https://github.com/elastic/security-team/issues/7423). Please correct me if that's changed.

The global scope is unfortunate, but we don't really have an alternative. If the flag is not enabled globally, then a developer attempting to run a FF test would see it pass on CI and then fail locally; how would they know to set the global config in order to see the test pass?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are completely right.

If I'm not mistaken that behaviour should be documented so everybody should be aware about it. I understand your point, if you prefer to have it globally then you can remove the FF from the spec files.

My comment was more a reminder about what the behaviour is going to be :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you prefer to have it globally

I would certainly prefer not to set the global configuration, as it's both onerous and broad. But until https://github.com/elastic/security-team/issues/7423 is addressed, it's the only way to develop cypress tests requiring a feature flag.

describe(
'Detection Rule Creation - EQL Rules - With Alert Suppression',
{
tags: ['@ess', '@serverless'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should be marked as @serverless. Currently, there is no mechanism to enable through configuration FF on MKI environments, so this test is going to start failing. cc @yctercero

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean @skipInServerless, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can also leave a note or issue reference when we use the skip tags to denote if there's a bug or issue that needs follow up or if it's an intentional skip that does not require any follow up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, our squashing strategy has unfortunately severely reduced the utility of commit messages; the only way to pass along context is through declarative code or code comments :(

describe(
'Detection Rule Creation - EQL Rules - With Alert Suppression - Serverless Essentials License',
{
tags: ['@serverless'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -35,6 +35,7 @@ export default async function ({ readConfigFile }: FtrConfigProviderContext) {
{ product_line: 'cloud', product_tier: 'complete' },
])}`,
`--xpack.securitySolution.enableExperimental=${JSON.stringify([
'alertSuppressionForNonSequenceEqlRuleEnabled',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@logeekal
Copy link
Contributor

I did the desk testing. Functionality seems to be working great. Although, I had one question.

In alert table, we can see one icon for suppressed alerts. But I cannot see that icon in timeline. Is this expected? Please see below video for the demo.

eql_supressed_alerts_timeline_icon.mov

@rylnd
Copy link
Contributor

rylnd commented Apr 16, 2024

@logeekal the behavior you've described looks consistent across all rule types supporting suppression (and was not introduced specifically in this PR). I've created an issue to look into this: #180976.

Copy link
Contributor

@PhilippeOberti PhilippeOberti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving for the Threat Hunting Investigations team because @logeekal did the review and his question was answered!

The name/description indicate that this should contain an index pattern,
but it did not.
@rylnd
Copy link
Contributor

rylnd commented Apr 16, 2024

Just curious what's the strategy for releasing this feature. Since there's a dedicated feature flag, does that mean that after merging this PR it's not going to go live in Serverless next week, and acceptance/exploratory testing will be done + writing the docs, all after the FF?

@banderror that's mostly correct. We plan merge with the FF disabled, and use the next week or so for testing. Docs are already in review so we'll just need to make minor amendments depending on the outcome of testing/release.

At a high level, this PR is relatively low risk as it mainly reuses existing suppression code, and applies it to candidate EQL alerts in a manner nearly identical to several other rule types.

Without a mechanism to enable this feature flag in those environments,
these tests will fail and block operations.
@rylnd
Copy link
Contributor

rylnd commented Apr 16, 2024

@MadameSheema I added the @skipInServerless tag for MKI in 6600836; however it occurred to me that a more declarative @reliesOnFeatureFlag tag might be a better approach here, since that would both a) explicitly tag the test(s) with that dependency, and then b) allow the relevant CI/Ops pipelines to skip those tests (or not) as they see fit. What do you think?

@rylnd rylnd enabled auto-merge (squash) April 16, 2024 22:35
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 17.5MB 17.5MB +1.9KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
securitySolution 80.2KB 80.3KB +54.0B
Unknown metric groups

ESLint disabled line counts

id before after diff
securitySolution 514 515 +1

References to deprecated APIs

id before after diff
securitySolution 534 537 +3

Total ESLint disabled count

id before after diff
securitySolution 591 592 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @rylnd @WafaaNasr

@rylnd rylnd requested a review from MadameSheema April 16, 2024 23:24
@MadameSheema
Copy link
Member

@MadameSheema I added the @skipInServerless tag for MKI in 6600836; however it occurred to me that a more declarative @reliesOnFeatureFlag tag might be a better approach here, since that would both a) explicitly tag the test(s) with that dependency, and then b) allow the relevant CI/Ops pipelines to skip those tests (or not) as they see fit. What do you think

I'm open to add more tags and I personally believe that is a pretty good idea, but there are some objections regarding that.

@yctercero @maximpn thoughts around @rylnd proposal?

Copy link
Member

@MadameSheema MadameSheema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sec-eng-prod changes LGTM! Thanks! :)

@rylnd rylnd merged commit d01a5c4 into main Apr 17, 2024
37 checks passed
@rylnd rylnd deleted the security/eql-suppression branch April 17, 2024 08:38
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Apr 17, 2024
@maximpn
Copy link
Contributor

maximpn commented Apr 17, 2024

@MadameSheema I added the @skipInServerless tag for MKI in 6600836; however it occurred to me that a more declarative @reliesOnFeatureFlag tag might be a better approach here, since that would both a) explicitly tag the test(s) with that dependency, and then b) allow the relevant CI/Ops pipelines to skip those tests (or not) as they see fit. What do you think

I'm open to add more tags and I personally believe that is a pretty good idea, but there are some objections regarding that.

@MadameSheema AFAIK the majority of feature flags are transient though it's matter of months or years until a feature flag is permanently enabled and removed. If we decide to add @reliesOnFeatureFlag flag then someone will need to remove this flag for all related tests.

Additionally @skipInServerless and @reliesOnFeatureFlag have different meaning. @skipInServerless is much broader and it conveys a message in a clear way. The test will be skipped in Serverless (what it exactly means depends on different details but as minimum it's easier to reason about his tag). A test can be skipped when it's broken, flaky or intentionally skipped because it's only a ESS feature or we don't want to run such a test in Serverless. @reliesOnFeatureFlag on the other hand doesn't say anything what will happed to the test.

I'd rather leave it to the test to properly set up the environment to enable proper feature flags than adding a new tag. In a case when a test must be skipped it's should be enough to have @skipInServerless with a comment.

@yctercero WDYT?

@rylnd
Copy link
Contributor

rylnd commented Apr 22, 2024

@maximpn thanks for your feedback!

If we decide to add @reliesOnFeatureFlag flag then someone will need to remove this flag for all related tests.

This is true, and I personally see this behavior as a feature. Consider the alternative: if a test is skipped with skipInServerless due to relying on a feature flag, and that feature flag is then removed, how would you go about finding and unskipping that test? At some point, we need to associate "relies on feature flag" and "skipped in serverless," or else we lose that context until someone takes the time to understand the intention/requirements of that test: that it relied on a feature flag, and that it can now be unskipped.

We can do that with a comment, or a tag; I don't really care how we do it. But seeing skipInServerless does not in itself tell me why a test is skipped, and I need that information if the test is ever going to be unskipped.

@skipInServerless is much broader and it conveys a message in a clear way.

No disagreement here, the tag makes it very clear that the test is skipped. But it tells us nothing about why it was skipped, which is my iissue.

I'd rather leave it to the test to properly set up the environment to enable proper feature flags

I agree, that's the correct solution. What we're trying to address are symptoms caused by inadequate tooling.

@maximpn
Copy link
Contributor

maximpn commented Apr 24, 2024

Hi @rylnd,

I see your point. Do you mind creating a discussion ticket? I think we could assemble all the information there and have more opinions in a centralised way.

@rylnd
Copy link
Contributor

rylnd commented Apr 26, 2024

@maximpn done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting release_note:feature Makes this part of the condensed release notes Team:Detection Engine Security Solution Detection Engine Area Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.14.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.