[Elastic Agent] Add support for EQL based conditions #20994

blakerouse · 2020-09-04T20:20:23Z

What does this PR do?

This adds the ability to define a condition anywhere in the inputs configuration, making the dictionary that contains it conditional on the result of the EQL evaluation. If the evaluation is false, the dictionary is removed from its parent; if it's true, the dictionary remains.

This implements EQL with the same variable syntax used in input variable substitution ${ .. }. The following is implemented for EQL.

Full PEMDAS math support for the operators +, -, *, /, and %.
Comparisons: <, <=, >=, >, ==, !=
Booleans: true, false
Logical operators: and, or
Array functions: arrayContains
Dict functions: hasKey (not in EQL)
Length functions: length
Math functions: add, subtract, multiply, divide, modulo
String functions: concat, endsWith, indexOf, match, number, startsWith, string, stringContains
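
As a reading aid, a few of the functions listed above can be mirrored in plain Python. This is only a sketch of the semantics the names suggest: the snake_case helper names and the sample values are hypothetical, and case-sensitive matching is assumed here rather than taken from the list itself.

```python
# Hypothetical Python mirrors of some condition functions.
# This is not the agent's Go implementation.

def array_contains(array, value):
    """arrayContains: true when value is an element of array."""
    return value in array


def has_key(mapping, key):
    """hasKey: true when the dictionary has the given key."""
    return key in mapping


def length(value):
    """length: number of items in an array or dictionary."""
    return len(value)


def starts_with(text, prefix):
    """startsWith: case-sensitive prefix test (assumed)."""
    return text.startswith(prefix)


def ends_with(text, suffix):
    """endsWith: case-sensitive suffix test (assumed)."""
    return text.endswith(suffix)


def string_contains(text, fragment):
    """stringContains: case-sensitive substring test (assumed)."""
    return fragment in text


# Made-up sample values for illustration only.
labels = ["co.elastic.logs/enabled", "team=ingest"]
print(array_contains(labels, "co.elastic.logs/enabled"))
print(starts_with("nginx:1.19", "nginx:"))
```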

Why is it important?

To support conditionally enabling inputs, or any part of the input configuration. The same conditions can be applied to processors, streams, or anything else inside the inputs configuration.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
~~[ ] I have made corresponding changes to the documentation~~
~~[ ] I have made corresponding change to the default configuration files~~
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Add a condition to your inputs and see that they are not rendered when the condition fails.

Condition on input

inputs:
  - type: logfile
    streams:
      - paths:
         - /var/log/syslog
    condition: ${host.platform} == 'linux'

Condition on stream

inputs:
  - type: system/metrics
    streams:
      - metricset: load
        data_stream.dataset: system.cpu
        condition: ${host.platform} != 'windows'

Condition on processor

inputs:
  - type: system/metrics
    streams:
      - metricset: load
        data_stream.dataset: system.cpu
        condition: ${host.platform} != 'windows'
    processors:
      - add_fields:
          fields:
            platform: ${host.platform}
          to: host
        condition: ${host.platform} != 'windows'
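
The three examples above share one rule: a dictionary whose condition evaluates to false is dropped from its parent, at any depth. A minimal Python sketch of that rule follows. The `evaluate` helper is a toy stand-in that only understands `${name} == 'text'` and `${name} != 'text'`; it is not the ANTLR-based parser from this PR. Leaving the `condition` key inside surviving dictionaries is also an assumption.

```python
import re

_DROP = object()  # sentinel marking a dictionary to remove from its parent

# Toy grammar: ${dotted.name} followed by == or != and a single-quoted literal.
_SIMPLE = re.compile(r"^\$\{([\w.]+)\}\s*(==|!=)\s*'([^']*)'$")


def evaluate(condition, variables):
    """Evaluate the tiny subset of conditions this sketch supports."""
    match = _SIMPLE.match(condition.strip())
    if match is None:
        raise ValueError(f"unsupported condition in this sketch: {condition!r}")
    name, operator, literal = match.groups()
    equal = variables.get(name) == literal
    return equal if operator == "==" else not equal


def prune(node, variables):
    """Return a copy of node without dictionaries whose condition is false."""
    if isinstance(node, dict):
        condition = node.get("condition")
        if isinstance(condition, str) and not evaluate(condition, variables):
            return _DROP
        result = {}
        for key, value in node.items():
            kept = prune(value, variables)
            if kept is not _DROP:
                result[key] = kept
        return result
    if isinstance(node, list):
        pruned = [prune(item, variables) for item in node]
        return [item for item in pruned if item is not _DROP]
    return node


config = {
    "inputs": [
        {
            "type": "logfile",
            "streams": [{"paths": ["/var/log/syslog"]}],
            "condition": "${host.platform} == 'linux'",
        },
        {
            "type": "system/metrics",
            "streams": [
                {"metricset": "load", "condition": "${host.platform} != 'windows'"}
            ],
        },
    ]
}

# On a (hypothetical) Windows host the logfile input and the load stream vanish.
print(prune(config, {"host.platform": "windows"}))
```

Passing the variables as a flat dictionary keyed by dotted name keeps the sketch short; a real provider context would more likely be nested.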

Related issues

Closes [Elastic Agent] Support EQL condition on an input #20784

elasticmachine · 2020-09-04T20:20:26Z

Pinging @elastic/ingest-management (Team:Ingest Management)

elasticmachine · 2020-09-04T20:45:53Z

💚 Build Succeeded


Build stats

Build Cause: [Pull request #20994 reopened]
Start Time: 2020-09-09T12:59:15.826+0000
Duration: 78 min 42 sec

Test stats 🧪

| Test | Results |
|---------|---------|
| Failed | 0 |
| Passed | 20174 |
| Skipped | 1837 |
| Total | 22011 |

blakerouse · 2020-09-04T23:36:51Z

This is a very large branch, but I didn't want to break master by landing it in chunks. Refactoring Boolexp to be based on EQL caused a lot of changes all across the code base.

I might have had a little too much fun as well, adding as many features to EQL as possible, like support for arrays and dictionaries inside the condition logic.

ruflin · 2020-09-07T09:44:08Z

@blakerouse I wonder if we could start with a very small subset of EQL conditions and if this would simplify things? I think we could get started with just the compares, booleans and and/or?

blakerouse · 2020-09-07T12:56:12Z

@ruflin Is the reason to start with a subset the size of the PR? That will not actually make it much smaller, as the diff is misleading in size.

All the code in pkg/eql/parser is auto-generated by antlr4 (same as Boolexp was). We also use the same EQL logic in the spec files, so it needs a good amount of support to work correctly.

The diff is also large due to the large number of unit tests I have created to cover all parts of the code. I took the approach in this PR (because of the nature of adding EQL) that if it's not unit tested, then it doesn't work.

ruflin · 2020-09-08T07:30:13Z

I'm not really concerned about the PR size but about documenting and having to support all these options. Our users will come up with very creative ways to combine all of these, which on one hand is exciting but in case of bugs can also become "interesting". The part I'm most concerned about is the functions, as I would rather add them one by one on specific feature requests. I definitely can see a use case for startsWith in the docker image case.

Perhaps there is another way here: Get it in as is but define that we only support the ones that are documented?

blakerouse · 2020-09-08T12:41:05Z

@ruflin That does make sense, I think we can remove some from this PR. But not too many, as some are already used/required in the spec files and/or I think they will be critical for the docker/kubernetes use case.

math (keep all) add/subtract/multiply/divide/modulo - These all match their operators (the operators were added in EQL 0.8; I kept the functions as well, just to be like EQL). They share the same code path as the language operators, so we would not gain anything by removing them.
arrayContains (keep) - This is currently not used, but I think it will be used by docker/kubernetes. The providers for those will push the labels/annotations as an array, so this will allow the package/user to have a conditional like so:

inputs:
  - type: logfile
    streams:
      - paths:
         - /var/lib/docker/containers/${docker.container.id}/*-json.log
    condition: arrayContains(${docker.container.labels}, 'co.elastic.logs/enabled')

hasKey (must keep) - Already in use by the spec files to ensure that the output for that specification is correct and that the program should be turned on. I also think this will be useful.
length (must keep) - Already in use by the spec files to ensure that an input exists for that program. Used like so: length(${inputs}) > 0.
concat (removable) - Don't really have a use case in mind at the moment. I think it could be removed. It's also rather simple, just joining strings together.
endsWith (removable) - Currently don't have a use case for it, but it seems useful. If we keep startsWith we should keep endsWith.
indexOf (removable) - No use case at the moment; I don't really know when you would need the index of the substring.
match (keep) - I think this will be useful with docker/kubernetes and very powerful in general for performing regex matching on strings.
number (removable) - No use case at the moment.
startsWith (removable) - Same as endsWith.
string (keep) - Converts any type into a string. I think this is useful and we should keep it.
stringContains (removable) - Really the same as startsWith or endsWith. I think it could be useful.

@ruflin Let me know which ones you think I should remove.

ruflin · 2020-09-09T12:16:36Z

@blakerouse The list you proposed above SGTM. I would add stringContains/startsWith especially for the docker image name:

condition: startsWith(${docker.image.name}, 'nginx:')

This way the exact version of the image does not matter. This is often used in Beats conditions today, AFAIK.

One function we probably need to add on our end is about comparing versions. But I think @michalpristas has already implemented this.

ruflin · 2020-09-09T12:24:32Z

x-pack/elastic-agent/pkg/eql/Eql.g4

@@ -0,0 +1,97 @@
+// eql.g4


@blakerouse Would be nice to have a link to where this is copied from?

@ruflin I didn't copy it, I wrote it. I added on to the Boolexp.g4 that was already present, I assume originally implemented by @ph

I did the original implementation before we knew about eql.

I've done a bit more digging: endpoint did create a grammar file for antlr4, in https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/eql/src/main/antlr/EqlBase.g4, but it is both more and less at the same time.

I don't think we should use that directly.

Good to have this reference. I'm good with keeping them separate, as there are small variations like the variable syntax, but that is expected.

ruflin · 2020-09-09T12:28:12Z

x-pack/elastic-agent/spec/endpoint.yml

@@ -57,4 +57,4 @@ rules:
    - output
    - revision

-when: HasAny('fleet') && HasItems(%{[inputs]}) && HasNamespace('output', 'elasticsearch')
+when: length(${fleet}) > 0 and length(${inputs}) > 0 and hasKey(${output}, 'elasticsearch')


I really like that our spec files now also use EQL conditions. Makes it much easier to just have 1 implementation.
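
To make the rewritten `when` line concrete, here is a sketch of the same predicate in Python. The sample configurations are made up, and two points are assumptions about the semantics rather than facts from the diff: that `length` behaves like `len` and `hasKey` like a membership test, and that a missing variable behaves like an empty value.

```python
def should_run(config):
    """Mirror of: length(${fleet}) > 0 and length(${inputs}) > 0
    and hasKey(${output}, 'elasticsearch')."""
    fleet = config.get("fleet", {})    # assumption: missing means empty
    inputs = config.get("inputs", [])  # assumption: missing means empty
    output = config.get("output", {})
    return len(fleet) > 0 and len(inputs) > 0 and "elasticsearch" in output


# Hypothetical sample configurations.
enrolled = {
    "fleet": {"enabled": True},
    "inputs": [{"type": "endpoint"}],
    "output": {"elasticsearch": {"hosts": ["localhost:9200"]}},
}
standalone = {
    "inputs": [{"type": "endpoint"}],
    "output": {"elasticsearch": {"hosts": ["localhost:9200"]}},
}

print(should_run(enrolled))    # True
print(should_run(standalone))  # False
```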

mukeshelastic · 2020-09-09T12:56:02Z

Very exciting to see us leveraging EQL for condition definitions rather than creating something bespoke. Looking forward to testing it when it's available.

ruflin

Change LGTM.

I really like that we can reuse the same implementation / logic for processors, inputs, streams and even the internal spec and get rid of the existing boolexp.
We need to follow up with docs. Could you file an issue for it so we don't forget?
The reading of the conditions is pretty strict and has good error messages. So it should be pretty easy for users to figure out what they did wrong. In combination with the "simulate" commands for the providers, this will make testing of conditions really simple.

ruflin · 2020-09-09T13:46:08Z

x-pack/elastic-agent/pkg/agent/program/program.go

@@ -199,21 +207,24 @@ func groupByOutputs(single *transpiler.AST) (map[string]*transpiler.AST, error)
 			return nil, fmt.Errorf("unknown configuration output with name %s", targetName)
 		}

-		streams := config[inputsKey].([]map[string]interface{})
+		streams := config.config[inputsKey].([]map[string]interface{})


config.config? :-D

costin · 2020-09-09T14:25:35Z

Folks, I've just learned about this effort through the Observability update email.

I don't want to hijack this PR, rather mention some relevant topics since EQL has been mentioned.

EQL reference

Moving forward EQL on Elasticsearch is the EQL reference not EQL Python. In practice this means certain things that were/are possible in EQL Python are not (and may never be) supported in EQL in ES.
A prime example is arrayContains which does not make sense in ES.
The functionality of EQL on ES is documented here (if something is not in the docs, it does not exist):
https://www.elastic.co/guide/en/elasticsearch/reference/7.x/eql-search.html

string case insensitivity

EQL expects certain things to be case-insensitive which affects both operators (==) and (functions). To what degree we can implement that in Elasticsearch and whether it makes sense is still up for debate.
See elastic/elasticsearch#61883

upcoming grammar changes

EQL currently supports 4 ways to declare strings, something under investigation:
elastic/elasticsearch#61659
Moving forward support for ' is going to be dropped in favor of ". It's still undecided what happens to raw string declarations.

I'd like to sync up and learn more about this effort. Tagging @philkra and @sethpayne

blakerouse · 2020-09-09T14:50:26Z

> Folks, I've just learned about this effort through the Observability update email.
>
> I don't want to hijack this PR, rather mention some relevant topics since EQL has been mentioned.
>
> EQL reference
>
> Moving forward EQL on Elasticsearch is the EQL reference not EQL Python. In practice this means certain things that were/are possible in EQL Python are not (and may never be) supported in EQL in ES.
> A prime example is arrayContains which does not make sense in ES.
> The functionality of EQL on ES is documented here (if something is not in the docs, it does not exist):
> https://www.elastic.co/guide/en/elasticsearch/reference/7.x/eql-search.html

Thank you for pointing this out. Following that documentation is rather confusing; is there an overview page that breaks down the entire language?

I think there will be cases where we need to add some functions that EQL doesn't support. Another one we added because it was needed is hasKey.

I also think the goal is to be similar to EQL, but not exact, as we are also using variable substitution with ${ .. } to match the variable substitution in input strings.

> string case insensitivity
>
> EQL expects certain things to be case-insensitive which affects both operators (==) and (functions). To what degree we can implement that in Elasticsearch and whether it makes sense is still up for debate.
> See elastic/elasticsearch#61883

Interesting point about string sensitivity. At the moment this code is case-sensitive.

> upcoming grammar changes
>
> EQL currently supports 4 ways to declare strings, something under investigation:
> elastic/elasticsearch#61659
> Moving forward support for ' is going to be dropped in favor of ". It's still undecided what happens to raw string declarations.
>
> I'd like to sync up and learn more about this effort. Tagging @philkra and @sethpayne

At the moment this code only supports ' and ".

* Refactor Boolexp to Eql. * Connect new Eql to specs and input emitter. * Fix compare with null. * Fix notice and go.mod. (cherry picked from commit af91b5e)

…conditions (#21039) * [Elastic Agent] Add support for EQL based conditions (#20994) * Refactor Boolexp to Eql. * Connect new Eql to specs and input emitter. * Fix compare with null. * Fix notice and go.mod. (cherry picked from commit af91b5e) * Fix go.mod from cherry-pick resolve. * Add changelog.

rw-access · 2020-09-10T13:42:14Z

We also have the EQL implementation in Endpoint, which is under ongoing development and will have the full feature set of EQL, unlike Elasticsearch EQL.

RE: hasKey -- I don't understand the new need for this function. Currently we just do some.field != null, and I think we should aim to be consistent.

I think a little more collaboration could help and make sure we're all doing the same thing. There's also a test suite for EQL, and I don't think that's being used here to test against the EQL specification.

I think we all have unique scenarios we are addressing: Elasticsearch is focused on historical search, Endpoint on realtime detection/prevention, and Beats is using expressions to describe filters. For that reason I think it might be a mistake to have one of those teams fully own EQL, but instead collaborate on what is in the specification, and what is an implementation detail.
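
One place where the two spellings in the hasKey question above could diverge is a key that is present but holds null. The sketch below models that case in Python, with `None` standing in for null. Whether the agent's `hasKey` or EQL's `some.field != null` actually behave this way is an assumption here, not something the thread confirms.

```python
def has_key(mapping, key):
    """Membership test: true even when the stored value is None."""
    return key in mapping


def field_not_null(mapping, key):
    """Null comparison: a missing key and a None value both count as null."""
    return mapping.get(key) is not None


# Hypothetical sample: key present, value null.
output = {"elasticsearch": None}

print(has_key(output, "elasticsearch"))         # True
print(field_not_null(output, "elasticsearch"))  # False
```

If the two forms are meant to be interchangeable, this is the edge case a shared test suite would need to pin down.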

ph · 2020-09-10T14:47:39Z

@rw-access Totally agree on that.

Concerning hasKey vs some.field, I think we would have to think about it a bit more; I doubt it will make it for 7.10.

Concerning collaboration, would you be open to having a "specification" repository where representatives of Endpoint, Elasticsearch, and Ingest Management could collaborate? This would be similar to the dissect effort.

rw-access · 2020-09-10T14:58:28Z

Yeah, I think that would be perfect. And our test suite could live there too, instead of in TOML here: https://github.com/endgameinc/eql/tree/master/eql/etc

we pull those files into Elasticsearch here:
https://github.com/elastic/elasticsearch/tree/master/x-pack/plugin/eql/qa/common/src/main/resources
https://github.com/elastic/elasticsearch/tree/master/x-pack/plugin/eql/src/test/resources

ph · 2020-09-10T15:11:08Z

@rw-access Can you take the lead on creating that repository? We could revisit this on our end for 7.11.

* upstream/master: (362 commits) Add vendoring to Google Cloud Functions again (elastic#21070) [Elastic Agent] Add fleet.host.id for sending to endpoint. (elastic#21042) Do not need Google credentials before using it (elastic#21072) [Filebeat][New Module] Zoom webhook module (elastic#20414) Add support for GMT timezone offset in decode_cef (elastic#20993) Filebeat: Fix random error on harvester close (elastic#21048) Add ingress controller dashboards (elastic#21052) Fix loggers in composable module. (elastic#21047) [Ingest Manager] Increase kibana client timeout to 5 minutes (elastic#21037) Add changelog. (elastic#21041) [Elastic Agent] Add support for EQL based conditions (elastic#20994) Disable Kafka metricsets based on Jolokia (elastic#20989) Update apm agent (elastic#21031) Add container ECS fields in kubernetes metadata (elastic#20984) Sanitize event.host in Metricbeat (elastic#21022) Update api-keys.asciidoc - API key prerequisites (elastic#21026) [Filebeat][suricata] Map x509 for suricata/eve fileset (elastic#20973) [Filebeat][santa] Map x509 fields in santa module (elastic#20976) [Filebeat][fortinet] Map x509 ecs fields for fortinet fw fileset (elastic#20983) Bump zeek kerberos/ssl/x509 ecs version (elastic#21003) ...

* upstream/master: (364 commits) Add vendoring to Google Cloud Functions again (elastic#21070) [Elastic Agent] Add fleet.host.id for sending to endpoint. (elastic#21042) Do not need Google credentials before using it (elastic#21072) [Filebeat][New Module] Zoom webhook module (elastic#20414) Add support for GMT timezone offset in decode_cef (elastic#20993) Filebeat: Fix random error on harvester close (elastic#21048) Add ingress controller dashboards (elastic#21052) Fix loggers in composable module. (elastic#21047) [Ingest Manager] Increase kibana client timeout to 5 minutes (elastic#21037) Add changelog. (elastic#21041) [Elastic Agent] Add support for EQL based conditions (elastic#20994) Disable Kafka metricsets based on Jolokia (elastic#20989) Update apm agent (elastic#21031) Add container ECS fields in kubernetes metadata (elastic#20984) Sanitize event.host in Metricbeat (elastic#21022) Update api-keys.asciidoc - API key prerequisites (elastic#21026) [Filebeat][suricata] Map x509 for suricata/eve fileset (elastic#20973) [Filebeat][santa] Map x509 fields in santa module (elastic#20976) [Filebeat][fortinet] Map x509 ecs fields for fortinet fw fileset (elastic#20983) Bump zeek kerberos/ssl/x509 ecs version (elastic#21003) ...

blakerouse added 4 commits September 3, 2020 23:47

Refactor Boolexp to Eql.

d50ac33

Connect new Eql to specs and input emitter.

80bc558

Merge branch 'master' into agent-input-condition

7ede2d2

Fix compare with null.

0ce0b24

blakerouse added the Team:Ingest Management label Sep 4, 2020

blakerouse self-assigned this Sep 4, 2020

botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Sep 4, 2020

Fix notice and go.mod.

566651f

ph requested a review from ruflin September 8, 2020 19:42

ph added needs_backport PR is waiting to be backported to other branches. review labels Sep 8, 2020

ruflin reviewed Sep 9, 2020


blakerouse closed this Sep 9, 2020

blakerouse reopened this Sep 9, 2020

ruflin approved these changes Sep 9, 2020


blakerouse merged commit af91b5e into elastic:master Sep 9, 2020

blakerouse deleted the agent-input-condition branch September 9, 2020 14:51

blakerouse mentioned this pull request Sep 9, 2020

Cherry-pick #20994 to 7.x: [Elastic Agent] Add support for EQL based conditions #21039

Merged

4 tasks

blakerouse added v7.10.0 and removed needs_backport PR is waiting to be backported to other branches. labels Sep 9, 2020

This was referenced Sep 9, 2020

[Elastic Agent] Add documentation for new input substitution and conditions #21040

Closed

[Elastic Agent] Add changelog entry for EQL based condition #21041

Merged

costin mentioned this pull request Sep 10, 2020

EQL: Revisit case insensitivity elastic/elasticsearch#61883

Closed

ph mentioned this pull request Feb 4, 2021

"error: Metricbeat FAILED" errors under Logs tab on enrolling 7.11 Windows agent with 7.10 default policy on upgraded kibana(7.10.2-7.11.0). #23812

Closed

ph mentioned this pull request Mar 7, 2022

[Elastic-Agent] Replace Agent's hasKey with EQL's some.field != null elastic/elastic-agent#126

Open
