Disable host fields for "cloud", panw, cef modules #18223
Force-pushed from 54aae77 to 8b659eb
Force-pushed from 8b659eb to 88556a4
Pinging @elastic/siem (Team:SIEM)
This changes the default configuration of Filebeat to not add `host` fields to events that originated in other places. The `host` field is defined in ECS as "host on which the event happened", but for data pulled from cloud APIs or forwarded to Filebeat from other sources (PANW, CEF) this `host` field is inaccurate.

The affected "cloud" modules are azure, aws, googlecloud, o365, and okta. By default they will tag events with `cloud`. This causes the module to not add `host.name` at the input state. The default configuration for Filebeat was also updated to add a `when` condition to the `add_host_metadata` processor so that it skips events containing the `cloud` tag.

For PANW and CEF, when data is forwarded to Filebeat from another host/device (which is most of the time), you don't want Filebeat to add `host`. So by default these modules add a `forwarded` tag to events, which behaves the same as the `cloud` tag. If you configure the module to not include the `forwarded` tag (e.g. `var.tags: [my_tag]`), then Filebeat will add the `host.*` fields.

For PANW I also added some additional static `observer.*` fields.

Relates: elastic#13920
Force-pushed from 88556a4 to f8fa355
@@ -4,6 +4,9 @@ paths:
- {{$path}}
{{ end }}
exclude_files: [".gz$"]
tags: {{.tags | tojson}}
publisher_pipeline.disable_host: {{ inList .tags "cloud" }}
Overall makes sense to me, but just wondering: shouldn't this follow the same logic as the `add_host_metadata` processor (check `cloud || forwarded`)? I ask because it would mean that in 8.x we could collapse this down and just standardize on "tag with `forwarded` to avoid the host enrichment".
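The `cloud || forwarded` check that this comment refers to could be expressed roughly as follows in `filebeat.yml`. This is only a sketch built from the standard Beats `not`, `or`, and `contains` conditions; the exact condition used at this stage of the PR is not shown in the conversation.

```yaml
# Sketch only: skip host enrichment when an event carries either tag.
processors:
  - add_host_metadata:
      when:
        not:
          or:
            - contains:
                tags: cloud
            - contains:
                tags: forwarded
```

With a single shared tag, the `or` branch would no longer be needed, which is the simplification the reviewer is pointing at.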
@@ -13,6 +13,8 @@ var:
- name: secret_access_key
- name: session_token
- name: role_arn
- name: tags
  default: [cloud]
Also, same here: I'm thinking we should probably just tag all the resources for `cloud` with `forwarded` too, no?
I'm wondering if we need "cloud" and "forwarded", it feels odd to have 2 flags that will do the same thing. Could we have one tag "disable_host_metadata"? And then the processor template would be:
Thanks for the feedback @leehinman @andrewstucki. It sounds like there is some consensus that it would be clearer to have a single tag indicating that the data did not originate on "this Filebeat host". That leaves the question of what that one tag should be. I like "forwarded" more than "disable_host_metadata" because it has some meaning w.r.t. the event, whereas "disable_host_metadata" is more about how Filebeat should handle the data. If the tag were not going to be indexed in Elasticsearch then "disable_host_metadata" would be great, but I think in most cases it will end up in ES. There's no good way to guarantee that it gets dropped from the event. We could drop it in our default config, but the module user might not have our default config. Besides that, we don't have a "drop_tag" processor. So should we use
LGTM
…able-host-cloud-panw-cef
…w-oss
* upstream/master: (27 commits)
  Disable host fields for "cloud", panw, cef modules (elastic#18223)
  [docs] Rename monitoring collection from legacy internal collection to legacy collection (elastic#18504)
  Introduce auto detection of format (elastic#18095)
  Add additional fields to address issue elastic#18465 for googlecloud audit log (elastic#18472)
  Fix libbeat import path in seccomp policy template (elastic#18418)
  Address Okta input issue elastic#18530 (elastic#18534)
  [Ingest Manager] Avoid Chown on windows (elastic#18512)
  Fix Cisco ASA/FTD msgs that use a host name as NAT address (elastic#18376)
  [CI] Optimise stash/unstash performance (elastic#18473)
  Libbeat: Remove global loggers from libbeat/metric and libbeat/cloudid (elastic#18500)
  Fix PANW bad mapping of client/source and server/dest packets and bytes (elastic#18525)
  Add a file lock to the data directory on startup to prevent multiple agents. (elastic#18483)
  Followup to 12606 (elastic#18316)
  changed input from syslog to tcp/udp due to unsupported RFC (elastic#18447)
  Improve ECS field mappings in Sysmon module. (elastic#18381)
  [Elastic Agent] Cleaner output of inspect command (elastic#18405)
  [Elastic Agent] Pick up version from libbeat (elastic#18350)
  Update communitybeats.asciidoc (elastic#18470)
  [Metricbeat] Change visualization interval from 15m to >=15m (elastic#18466)
  docs: Fix typo in kerberos docs (elastic#18503)
  ...
This changes the default configuration of Filebeat to not add `host` fields to events that originated in other places. The `host` field is defined in ECS as "host on which the event happened", but for data pulled from cloud APIs or forwarded to Filebeat from other sources (PANW, CEF) this `host` field is inaccurate.

The affected "cloud" modules are azure, aws, googlecloud, o365, and okta. By default they will tag events with `forwarded`. This will cause the module to not add `host.name` at the input state. The default configuration for Filebeat was also updated to add a `when` condition to the `add_host_metadata` processor so that it skips events containing the `forwarded` tag.

For PANW and CEF, when data is forwarded to Filebeat from another host/device (which is most of the time), you don't want Filebeat to add `host`. So by default these modules add a `forwarded` tag to events. If you configure the module to not include the `forwarded` tag (e.g. `var.tags: [my_tag]`), then Filebeat will add the `host.*` fields.

For PANW I also added some additional static `observer.*` fields.

Relates: elastic#13920

(cherry picked from commit e990740)
… modules (#19074)

This changes the default configuration of Filebeat to not add `host` fields to events that originated in other places. The `host` field is defined in ECS as "host on which the event happened", but for data pulled from cloud APIs or forwarded to Filebeat from other sources (PANW, CEF) this `host` field is inaccurate.

The affected "cloud" modules are azure, aws, googlecloud, o365, and okta. By default they will tag events with `forwarded`. This will cause the module to not add `host.name` at the input state. The default configuration for Filebeat was also updated to add a `when` condition to the `add_host_metadata` processor so that it skips events containing the `forwarded` tag.

For PANW and CEF, when data is forwarded to Filebeat from another host/device (which is most of the time), you don't want Filebeat to add `host`. So by default these modules add a `forwarded` tag to events. If you configure the module to not include the `forwarded` tag (e.g. `var.tags: [my_tag]`), then Filebeat will add the `host.*` fields.

For PANW I also added some additional static `observer.*` fields.

Relates: #13920

(cherry picked from commit e990740)
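The merged behavior described in these commit messages, where everything keys off the single `forwarded` tag, can be sketched as two configuration fragments. The first shows what a module input might render to once the template expressions from the diff are evaluated with the default tags (the rendered values are an assumption for illustration); the second is the `when` condition on `add_host_metadata` in `filebeat.yml`.

```yaml
# Fragment 1 (sketch): a module input after template rendering,
# assuming the module's tags variable is left at [forwarded].
tags: ["forwarded"]
publisher_pipeline.disable_host: true
---
# Fragment 2 (sketch): filebeat.yml processor that skips events
# carrying the forwarded tag.
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
```

The two settings cover different things: the input option stops `host.name` from being added when the event is published, and the processor condition stops the remaining `host.*` enrichment.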
What does this PR do?
This changes the default configuration of Filebeat to not add `host` fields to events that originated in other places. The `host` field is defined in ECS as "host on which the event happened", but for data pulled from cloud APIs or forwarded to Filebeat from other sources (PANW, CEF) this `host` field is inaccurate.

The affected "cloud" modules are azure, aws, googlecloud, o365, and okta. By default they will tag events with `cloud`. This causes the module to not add `host.name` at the input state. The default configuration for Filebeat was also updated to add a `when` condition to the `add_host_metadata` processor so that it skips events containing the `cloud` tag.

For PANW and CEF, when data is forwarded to Filebeat from another host/device (which is most of the time), you don't want Filebeat to add `host`. So by default these modules add a `forwarded` tag to events, which behaves the same as the `cloud` tag. If you configure the module to not include the `forwarded` tag (e.g. `var.tags: [my_tag]`), then Filebeat will add the `host.*` fields.

For PANW I also added some additional static `observer.*` fields.

Why is it important?
We want Filebeat to follow the Elastic Common Schema, and setting `host` to the correct value is part of that. By setting (or not setting) `host` we can better interpret events. Without this change, the Filebeat host is attributed as the source of many cloud-based audit/login events.

Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.

How to test this PR locally
Run this config from the x-pack/filebeat dir and verify that events do not contain `host`.

Related issues
Relates: #13920
Requires: #18159
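
The opt-out mentioned in the description (`var.tags: [my_tag]`) might look like the following module configuration. This is a hypothetical sketch, not the test config referenced under "How to test this PR locally": the `panos` fileset name and the `my_tag` value are illustrative assumptions.

```yaml
# Sketch only: overriding the default tags removes `forwarded`,
# so Filebeat adds the host.* fields to these events again.
filebeat.modules:
  - module: panw
    panos:
      enabled: true
      var.tags: [my_tag]

# Print events to the console to check whether host.* is present.
output.console:
  pretty: true
```

Leaving `var.tags` unset keeps the default tags, in which case the events should not contain `host`.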