-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Elastic Agent] Add support for EQL based conditions #20994
Conversation
Pinging @elastic/ingest-management (Team:Ingest Management) |
This is a very large branch, but I didn't want to break master by landing it in chunks. By refactoring Boolexp to be based on EQL, meant that it caused a lot of changes all across the code base. I might have a little too much fun as well adding as many features to EQL as possible. Like support arrays and dictionaries inside of the condition logic. |
@blakerouse I wonder if we could start with a very small subset of EQL conditions and if this would simplify things? I think we could get started with just the compares, booleans and |
@ruflin Is the reason to start with a subset is because of the size of the PR? That will not actually make it much smaller, as the diff is misleading in size. All the code in The diff size is also large do to the large about of unit tests I have created, to cover all parts of the code. I took the approach in this PR (because of the nature of add EQL), if its not unit tests then it doesn't work. |
I'm not really concerned about the PR size but about documenting and having to support all these options. Our users will come up with very creative way on how to combine all these together which one one hand is exciting but in case of bugs can also become "interesting". The part I'm most concerned are the functions as I would rather add one by one on specific feature requests. I definitively can see a use case for startsWith on the docker image case. Perhaps there is an other way here: Get it in as is but define that we only support the ones that are documented? |
@ruflin That does make sense, I think we can remove some from this PR. But not too many as some are already used/required in the spec files and/or I think they will be critical for docker/kubernetes use case.
@ruflin Let me know which ones you think I should remove. |
@blakerouse The list you proposed above SGTM. I would add
Like this the exact version of the image does not matter. This is often use in the beats conditions today AFAIK. One function we probably need to add on our end is about comparing versions. But I think @michalpristas has already implemented this. |
@@ -0,0 +1,97 @@ | |||
// eql.g4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blakerouse Would be nice to have a link to where this is copied from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the original implementation before we knew about eql.
I've done a bit more digging, endpoint did create a grammar file for antl4r, in https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/eql/src/main/antlr/EqlBase.g4 but this its more and less at the same time.
I don't think we should use that directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to have this reference. I good in keeping them separate as there are small variations like the variable syntax but it is expected.
@@ -57,4 +57,4 @@ rules: | |||
- output | |||
- revision | |||
|
|||
when: HasAny('fleet') && HasItems(%{[inputs]}) && HasNamespace('output', 'elasticsearch') | |||
when: length(${fleet}) > 0 and length(${inputs}) > 0 and hasKey(${output}, 'elasticsearch') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like that our spec files now also use EQL conditions. Makes it much easier to just have 1 implementation.
Very exciting to see we leveraging EQL for condition definitions than creating something bespoke.. looking forward to test it when it's available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change LGTM.
- I really like that we can reuse the same implementation / logic for processors, inputs, streams and even the internal spec and get rid of the existing boolexp
- We need to follow up with docs. Could you file an issue for it so we don't forget.
- The reading of the conditions is pretty strict and has good error messages. So it should be pretty easy for users to figure out what they did wrong. In combination with the "simulate" commands for the providers, this will make testing of conditions really simple.
@@ -199,21 +207,24 @@ func groupByOutputs(single *transpiler.AST) (map[string]*transpiler.AST, error) | |||
return nil, fmt.Errorf("unknown configuration output with name %s", targetName) | |||
} | |||
|
|||
streams := config[inputsKey].([]map[string]interface{}) | |||
streams := config.config[inputsKey].([]map[string]interface{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config.config? :-D
Folks, I've just learned about this effort through the Observability update email. I don't want to hijack this PR, rather mention some relevant topics since EQL has been mentioned.
Moving forward EQL on Elasticsearch is the EQL reference not EQL Python. In practice this means certain things that were/are possible in EQL Python are not (and may never be) supported in EQL in ES.
EQL expects certain things to be case-insensitive which affects both operators (
EQL currently supports 4 ways to declare strings, something under investigation: I'd like to sync up and learn more about this effort. Tagging @philkra and @sethpayne |
Thank you for point this out. Following that documentation is rather confusing, is there an overview page that breaks down the entire language? I think there will be cases where we need to add some functions that EQL doesn't support. Another one we added, because it was needed was I also think the goal is to be similar to EQL, but not exact. As we also are using variable substitution with
Interesting point about string sensitivity. At the moment this code is case-sensitive.
At the moment this code only supports |
* Refactor Boolexp to Eql. * Connect new Eql to specs and input emitter. * Fix compare with null. * Fix notice and go.mod. (cherry picked from commit af91b5e)
We also have the EQL in the Endpoint implementation which is ongoing development and will have the full feature set of EQL, unlike Elasticsearch EQL. RE: hasKey -- I don't understand the new need for this function. Currently we just do I think a little more collaboration could help and make sure we're all doing the same thing. There's also a test suite for EQL, and I don't think that's being used here to test against the EQL specification. I think we all have unique scenarios we are addressing: Elasticsearch is focused on historical search, Endpoint on realtime detection/prevention, and Beats is using expressions to describe filters. For that reason I think it might be a mistake to have one of those teams fully own EQL, but instead collaborate on what is in the specification, and what is an implementation detail. |
@rw-access Totally agree on that. Concerning Concerning collaboration, would you be open having a "specification" repository where a representative of Endpoint, Elasticsearch, and Ingest management could collaborate, this would be similar to the dissect effort? |
Yeah, I think that would be perfect. And our test suite could live there too, instead of in TOML here: https://github.com/endgameinc/eql/tree/master/eql/etc we pull those files into Elasticsearch here: |
@rw-access can you take the lead on create theses repository, we could revisit that on our end for 7.11. |
* upstream/master: (362 commits) Add vendoring to Google Cloud Functions again (elastic#21070) [Elastic Agent] Add fleet.host.id for sending to endpoint. (elastic#21042) Do not need Google credentials before using it (elastic#21072) [Filebeat][New Module] Zoom webhook module (elastic#20414) Add support for GMT timezone offset in decode_cef (elastic#20993) Filebeat: Fix random error on harvester close (elastic#21048) Add ingress controller dashboards (elastic#21052) Fix loggers in composable module. (elastic#21047) [Ingest Manager] Increase kibana client timeout to 5 minutes (elastic#21037) Add changelog. (elastic#21041) [Elastic Agent] Add support for EQL based conditions (elastic#20994) Disable Kafka metricsets based on Jolokia (elastic#20989) Update apm agent (elastic#21031) Add container ECS fields in kubernetes metadata (elastic#20984) Sanitize event.host in Metricbeat (elastic#21022) Update api-keys.asciidoc - API key prerequisites (elastic#21026) [Filebeat][suricata] Map x509 for suricata/eve fileset (elastic#20973) [Filebeat][santa] Map x509 fields in santa module (elastic#20976) [Filebeat][fortinet] Map x509 ecs fields for fortinet fw fileset (elastic#20983) Bump zeek kerberos/ssl/x509 ecs version (elastic#21003) ...
* upstream/master: (364 commits) Add vendoring to Google Cloud Functions again (elastic#21070) [Elastic Agent] Add fleet.host.id for sending to endpoint. (elastic#21042) Do not need Google credentials before using it (elastic#21072) [Filebeat][New Module] Zoom webhook module (elastic#20414) Add support for GMT timezone offset in decode_cef (elastic#20993) Filebeat: Fix random error on harvester close (elastic#21048) Add ingress controller dashboards (elastic#21052) Fix loggers in composable module. (elastic#21047) [Ingest Manager] Increase kibana client timeout to 5 minutes (elastic#21037) Add changelog. (elastic#21041) [Elastic Agent] Add support for EQL based conditions (elastic#20994) Disable Kafka metricsets based on Jolokia (elastic#20989) Update apm agent (elastic#21031) Add container ECS fields in kubernetes metadata (elastic#20984) Sanitize event.host in Metricbeat (elastic#21022) Update api-keys.asciidoc - API key prerequisites (elastic#21026) [Filebeat][suricata] Map x509 for suricata/eve fileset (elastic#20973) [Filebeat][santa] Map x509 fields in santa module (elastic#20976) [Filebeat][fortinet] Map x509 ecs fields for fortinet fw fileset (elastic#20983) Bump zeek kerberos/ssl/x509 ecs version (elastic#21003) ...
What does this PR do?
This adds the ability for a condition to be defined any where in the inputs configuration to make that dictionary conditional on the resulting EQL evaluation. If the evaluation is false the dictionary is removed from the parent type and if its true the dictionary remains.
This implements EQL with the same variable syntax used in input variable substitution
${ .. }
. The following is implemented for EQL.+ - * / %
.< <= >= > == !=
true
false
and
andor
arrayContains
hasKey
(not in EQL)length
add
,subtract
,multiply
,divide
,modulo
.concat
,endsWith
,indexOf
,match
,number
,startsWith
,string
,stringContains
.Why is it important?
To support condition enablement on inputs or even any part of the input configuration. The same conditions can be applied to processors or streams or anything inside of the inputs configuration.
Checklist
[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration filesCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.How to test this PR locally
Add a
condition
to your inputs and see that they are not rendered when the condition fails.Condition on input
Condition on stream
Condition on processor
Related issues