[Filebeat] user_agent parsing error while ingesting web logs with filebeat 6.7.0 into elasticsearch 7.0.0 #10650
Comments
I think the problem here is that the Options:
Introducing an My current suggestion is that we go with option 3 and make users aware that when upgrading Elasticsearch, the structure of the data will change slightly. We must ensure on our side that it does not conflict with previous data. This could also have an effect on some dashboards (needs verification). Option 1 would be the most seamless one from a user perspective, but it would require Elasticsearch to keep the old ingest processor around for all of 7.x.
Trying to implement Option 3 I stumbled over more issues:
I will now play around with option 4. We also have the problem the other way around, with Filebeat 7.x sending data to 6.x: #10655
This is an attempt to solve elastic#10650. The conflicting fields are:
* os.*
* device.*
These fields are renamed to the old fields if they exist, and then the generated key is removed. Right now I check whether any of the os_* fields exist. It would be nicer to check whether `os` and `device` are objects, but I haven't yet found how to do this in Painless.
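To illustrate the renaming idea in the commit message above, here is a minimal Python sketch rather than the actual Painless script from the PR. It assumes the new-style output nests values under `os` and `device` objects and that the old-style keys are `os`, `os_name` and `device` as plain strings. The exact key mapping and the function name `to_legacy_user_agent` are assumptions made for illustration.

```python
def to_legacy_user_agent(ua: dict) -> dict:
    """Return a copy of `ua` with nested `os`/`device` objects flattened.

    The key mapping below is an assumption for illustration, not the
    mapping used by the real pipeline.
    """
    legacy = dict(ua)

    os_value = legacy.get("os")
    if isinstance(os_value, dict):  # the "is it an object?" check
        del legacy["os"]  # remove the generated object key
        if "name" in os_value:
            legacy["os_name"] = os_value["name"]
        if "full" in os_value:
            legacy["os"] = os_value["full"]

    device_value = legacy.get("device")
    if isinstance(device_value, dict):
        del legacy["device"]
        if "name" in device_value:
            legacy["device"] = device_value["name"]

    return legacy
```

Documents that already have the old flat shape pass through unchanged, because the type check only triggers on nested objects.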
Here is a first attempt to solve this with renaming of fields for apache.access logs: #10661 @jakelandis We should also discuss to keep the |
@ruflin, if I understand correctly, the user agent processor in 7.0 currently only supports the JSON format that Beats 7.0 ships and breaks on 6.7 data. That's a no-go from our upgrade perspective (upgrade ES first, then Kibana, then the data shippers), so we do need to fix this. Also, this means that the 6.7 structure is really different, so we can detect it and do something else in the user agent ingest processor to support the 6.7 format. This is basically what you mean by option 3, right? (Apologies, but I'm not familiar with the details of the specific field you mentioned.) If so, I'm +1 on that direction. Also, ideally, the 7.0 ingest processor will produce ECS-compatible documents, even if it starts with the 6.7 format.
As an example the 6.7 The Filebeat indices are versioned per Beat version. Upgrading Elasticsearch to 7.0 will mean the Beat still ingests to the same index and the type cannot change. Proposal 3 has become obsolete and is now the same as option 4 because there are more fields than just
The outcome on the data structure side is very similar. I would also like to discuss option 2. I could see this helping also other users upgrading. The main downside is that we have leftover code in ES 7. |
Is it possible for Filebeat 6.x to detect that it's running against a 7.x cluster and use a slightly different index name to avoid mapping errors? My concern with option 2 is that there is no motivation to start using the
@jakelandis when we have 8.0 come out we could remove the flag as it has been deprecated and all the beats that are eligible to speak to 8.0 will not be setting it at all. |
@bleskes Do all 6.x versions of Beats need to talk with 7.0 ES (or just Beats 6.7)?
Spoke with @jasontedor and cleared up a few things in person. We will add the flag and functionality to 7.0.0 (deprecated), and leave it removed for 8.0.0. |
@ruflin - If I change the default value to So it would be: |
…8115)" This reverts commit 5b008a3. Related: elastic/beats#10650 Will replace this commit with the 6.7 version
elastic#37984)" This reverts commit cac6b8e. Related: elastic/beats#10650 Will replace this commit with the 6.7 version
@jakelandis Perfect, this is exactly as expected. |
To make sure the same data structure is ingested in Elasticsearch 6.7 and 7.0 when running Filebeat 6.7, the user_agent processor flag `ecs: false` must be set. Otherwise the data structure would change and data structure conflicts would happen (see #10650). This change requires Elasticsearch to support the `ecs: false` flag in 7.x. Adding the `ecs: false` flag will mean Filebeat 6.7 stops working with Elasticsearch 6.5 or older, as the flag is not supported there.
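As a rough illustration of where the `ecs: false` flag sits, the following Python sketch builds an ingest pipeline definition with a single `user_agent` processor and prints it as JSON. The field names `nginx.access.agent` and `nginx.access.user_agent`, the description text and the helper name `build_user_agent_pipeline` are illustrative assumptions and are not copied from the Filebeat module.

```python
import json


def build_user_agent_pipeline(source_field: str, target_field: str, ecs: bool = False) -> dict:
    """Build a pipeline body with one user_agent processor (illustrative only)."""
    return {
        "description": "Example pipeline with a user_agent processor",
        "processors": [
            {
                "user_agent": {
                    "field": source_field,
                    "target_field": target_field,
                    # Ask for the pre-ECS output structure.
                    "ecs": ecs,
                }
            }
        ],
    }


pipeline = build_user_agent_pipeline("nginx.access.agent", "nginx.access.user_agent")
print(json.dumps(pipeline, indent=2))
```

An Elasticsearch version that does not know the `ecs` option would be expected to reject such a pipeline as having an unsupported parameter, which matches the compatibility caveat stated above.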
#10688 was merged and elastic/elasticsearch#38757 seems to be almost ready. I tested the two together and they seem to work as expected. Will keep this issue open until elastic/elasticsearch#38757 is also merged.
elastic/elasticsearch#38757 has been merged into 7.0 branch and will make the 7.0.0-rc1 release. It will soon be merged to the 7.x branch for inclusion in 7.1. |
@jakelandis many 🙏 |
Closing as all related PRs were merged.
Versions:
Operating System:
Linux 4.20.6-arch1-1-ARCH #1 SMP PREEMPT Thu Jan 31 08:22:01 UTC 2019 x86_64 GNU/Linux
Description:
When indexing the Filebeat test data from the Beats 6.7 branch into a 7.0.0-SNAPSHOT Elasticsearch cluster, the access logs for the web servers (at least nginx, iis and traefik) fail to be indexed with error messages akin to the following:
I would suspect that the `user_agent.original` field, which is already populated by the `user_agent` ingest processor in Elasticsearch 7.0.0, causes the `rename` operation in the version 6.7.0 pipeline to fail.
I haven't tested all of them, but this probably happens for all Filebeat web server modules that use the `user_agent` processor in the pipeline.
Steps to Reproduce:
`nginx` or `iis`
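The suspected failure mode described above can be modelled with a toy Python simulation. This is not Elasticsearch code. It assumes that the rename step refuses to overwrite an existing target field, and that the 7.0.0 `user_agent` processor writes an `original` key while the 6.x one does not. The field names and the placeholder `name` value are hypothetical.

```python
class FieldExistsError(Exception):
    """Raised when a rename target is already present in the document."""


def user_agent_6x(doc: dict, field: str, target: str) -> None:
    # Toy stand-in: the old processor only adds parsed keys.
    doc[f"{target}.name"] = "Other"


def user_agent_7x(doc: dict, field: str, target: str) -> None:
    # Toy stand-in: the new processor also stores the raw string.
    doc[f"{target}.name"] = "Other"
    doc[f"{target}.original"] = doc[field]


def rename(doc: dict, field: str, target: str) -> None:
    # Assumed behaviour: fail instead of overwriting an existing field.
    if target in doc:
        raise FieldExistsError(f"field [{target}] already exists")
    doc[target] = doc.pop(field)


def run_67_pipeline(doc: dict, user_agent) -> dict:
    """Apply a 6.7-style sequence: parse the agent, then move the raw string."""
    user_agent(doc, "nginx.access.agent", "nginx.access.user_agent")
    rename(doc, "nginx.access.agent", "nginx.access.user_agent.original")
    return doc
```

Running `run_67_pipeline` with `user_agent_6x` completes, while running it with `user_agent_7x` raises `FieldExistsError` at the rename step, which mirrors the conflict suspected in the description.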