[Filebeat] user_agent parsing error while ingesting web logs with filebeat 6.7.0 into elasticsearch 7.0.0 #10650

Closed
weltenwort opened this issue Feb 8, 2019 · 15 comments
Labels: bug, Filebeat, Team:Integrations

Comments

@weltenwort
Member

weltenwort commented Feb 8, 2019

Versions:

  • filebeat 6.7.0-SNAPSHOT (build hash 9e0ed82)
  • elasticsearch 7.0.0-SNAPSHOT (build hash 5e798c1)

Operating System: Linux 4.20.6-arch1-1-ARCH #1 SMP PREEMPT Thu Jan 31 08:22:01 UTC 2019 x86_64 GNU/Linux

Description:

When indexing the filebeat test data from the beats 6.7 branch into a 7.0.0-SNAPSHOT elasticsearch cluster, the access logs for the web servers (at least nginx, iis and traefik) fail to be indexed with error messages akin to the following:

info [o.e.a.b.TransportShardBulkAction] [${HOSTNAME}] [filebeat-6.7.0-2019.02.08][1] failed to execute bulk item (index) index {[filebeat-6.7.0-2019.02.08][_doc][-v9vzWgBSKfxSV4q4CHr], source[{"offset":1204,"log":{"file":{"path":"${SOMEDIR}/beats/filebeat/module/iis/access/test/test.log"}},"prospector":{"type":"log"},"read_timestamp":"2019-02-08T14:08:07.032Z","source":"${SOMEDIR}/beats/filebeat/module/iis/access/test/test.log","fileset":{"module":"iis","name":"access"},"error":{"message":"field [iis.access.user_agent.original] already exists"},"input":{"type":"log"},"iis":{"access":{"server_name":"MACHINE-NAME","agent":"Mozilla/5.0+(Windows+NT+6.1;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0","response_code":"200","cookie":"-","method":"GET","sub_status":"0","user_name":"-","http_version":"1.1","url":"/","site_name":"W3SVC1","referrer":"-","body_received":{"bytes":"456"},"hostname":"example.com","remote_ip":"85.181.35.98","port":"80","server_ip":"127.0.0.1","body_sent":{"bytes":"123"},"win32_status":"0","request_time_ms":"789","query_string":"-","user_agent":{"original":"Mozilla/5.0+(Windows+NT+6.1;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0","os":{"name":"Windows"},"name":"Firefox","device":{"name":"Other"},"version":"57.0"}}},"@timestamp":"2018-01-01T10:11:12.000Z","beat":{"hostname":"${HOSTNAME}","name":"${HOSTNAME}","version":"6.7.0"},"host":{"os":{"build":"rolling","name":"Arch Linux","family":"","version":"","platform":"arch"},"containerized":false,"name":"${HOSTNAME}","id":"${HOSTID}","architecture":"x86_64"},"event":{"dataset":"iis.access"}}]}
   │      org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [iis.access.user_agent.os] of type [keyword] in document with id '-v9vzWgBSKfxSV4q4CHr'
   |      ...SNIP...
   │      Caused by: java.lang.IllegalStateException: Can't get text on a START_OBJECT at 1:419

I suspect that the user_agent.original field, which is already populated by the user_agent ingest processor in elasticsearch 7.0.0, causes the rename operation in the version 6.7.0 pipeline to fail.

I haven't tested all of them, but this probably happens for all filebeat web server modules that use the user_agent processor in the pipeline.
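To make the suspected failure concrete, here is a minimal Python sketch of a rename step that refuses to overwrite an existing target. It is not the Elasticsearch implementation: `rename`, `get_path`, `make_doc` and `IngestError` are hypothetical names, and only the error text is taken from the log above.

```python
class IngestError(Exception):
    """Stands in for the failure an ingest processor reports (hypothetical)."""


def get_path(doc, dotted):
    """Return the value at a dotted path, or None if any segment is missing."""
    node = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def rename(doc, source, target):
    """Move `source` to `target`, refusing to overwrite an existing target."""
    if get_path(doc, source) is None:
        raise IngestError(f"field [{source}] doesn't exist")
    if get_path(doc, target) is not None:
        raise IngestError(f"field [{target}] already exists")
    src_parent, _, src_leaf = source.rpartition(".")
    holder = get_path(doc, src_parent) if src_parent else doc
    value = holder.pop(src_leaf)
    node = doc
    *parents, leaf = target.split(".")
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def make_doc(processor_output):
    """Build an event as it looks right after the user_agent step has run."""
    return {"iis": {"access": {"agent": "Mozilla/5.0",
                               "user_agent": dict(processor_output)}}}


# 6.x-style output: no `original` key yet, so the rename succeeds.
doc_6x = make_doc({"name": "Firefox", "os": "Windows", "device": "Other"})
rename(doc_6x, "iis.access.agent", "iis.access.user_agent.original")

# 7.x-style output: `original` is already populated, so the rename fails.
doc_7x = make_doc({"name": "Firefox", "original": "Mozilla/5.0"})
try:
    rename(doc_7x, "iis.access.agent", "iis.access.user_agent.original")
    error_7x = None
except IngestError as exc:
    error_7x = str(exc)
```

With the 7.x-style input, `error_7x` holds the same "already exists" message that shows up in the `error.message` field of the failed document.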

Steps to Reproduce:

  1. Start an elasticsearch 7.0.0 SNAPSHOT
  2. Configure filebeat to connect to the elasticsearch 7.0.0 cluster
  3. Enable the web server modules such as nginx or iis
  4. Change the module configuration to point to the corresponding filebeat test log samples from the 6.7 branch of the beats repo
  5. Start filebeat
  6. Observe the filebeat and elasticsearch logs
@ruflin ruflin added the Team:Integrations label Feb 8, 2019
@weltenwort weltenwort changed the title [Filebeat] user_argent parsing error while ingesting web logs with filebeat 6.7.0 into elasticsearch 7.0.0 [Filebeat] user_agent parsing error while ingesting web logs with filebeat 6.7.0 into elasticsearch 7.0.0 Feb 8, 2019
@ruflin
Member

ruflin commented Feb 11, 2019

I think the problem here is that the user_agent processor changed the format between 6.x and 7.x to align with ECS. The problem we have now is that the data created by Filebeat 6.7 with Elasticsearch 6.7 or 7.0 is not identical and even conflicts.

Options:

  1. Introduce ecs: false in the ingest processor in 6.x. This would require that ecs: false is still supported by Elasticsearch, which is not the case.
  2. Use ecs: true in 6.x to already generate ECS data. This would be a breaking change in 6.7.
  3. Fix the issue with user_agent.original by checking whether the field already exists. Document the fact that when Elasticsearch is upgraded to 7.0, Filebeat will start to generate a different data structure for the user_agent.
  4. Have Painless scripts in place so that Filebeat 6.x running against Elasticsearch 7.x still generates the same data structure. This would mean quite a bit of complexity in the ingest processor, if it is possible at all.

Introducing an ecs: * config in Options 1 and 2 means breaking compatibility with Elasticsearch versions older than 6.6, as the ingest processor checks for config options that should not be there and rejects ecs: * configs.

My current suggestion is that we go with option 3 and make users aware that when upgrading Elasticsearch, the structure of the data will change slightly. We must ensure on our side that it does not conflict with previous data. This could also have an effect on some dashboards (needs verification).

Option 1 would be the most seamless one from a user perspective but it would require Elasticsearch to keep the old ingest processor around for all of 7.x.

@ruflin
Member

ruflin commented Feb 11, 2019

Trying to implement Option 3 I stumbled over more issues:

  • device is a keyword in 6.x but an object in the 7.x processor. Meaning the user_agent processor outcome will conflict with the template.
  • os has the exact same problem.
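As a rough illustration of why these two fields conflict with the template, the following Python sketch checks a document against a simplified mapping. The mapping format, `check_against_mapping` and `MapperParsingError` are assumptions made for this example and are not Elasticsearch APIs; only the shape of the keyword error message follows the log in the issue description.

```python
class MapperParsingError(Exception):
    """Stands in for the mapping failure Elasticsearch reports (hypothetical)."""


def check_against_mapping(doc, mapping, prefix=""):
    """Walk a simplified mapping: a leaf is a type name, a dict is an object."""
    for field, expected in mapping.items():
        if field not in doc:
            continue
        value = doc[field]
        path = prefix + field
        if isinstance(expected, dict):
            if not isinstance(value, dict):
                raise MapperParsingError(
                    f"field [{path}] is mapped as an object but got a scalar")
            check_against_mapping(value, expected, path + ".")
        elif isinstance(value, dict):
            # A keyword field cannot hold an object.
            raise MapperParsingError(
                f"failed to parse field [{path}] of type [{expected}]")


# Simplified 6.x template: os and device are plain keywords.
template_6x = {"user_agent": {"name": "keyword",
                              "os": "keyword",
                              "device": "keyword"}}

out_6x = {"user_agent": {"name": "Firefox", "os": "Windows", "device": "Other"}}
out_7x = {"user_agent": {"name": "Firefox",
                         "os": {"name": "Windows"},
                         "device": {"name": "Other"}}}

check_against_mapping(out_6x, template_6x)  # passes silently

try:
    check_against_mapping(out_7x, template_6x)
    conflict = None
except MapperParsingError as exc:
    conflict = str(exc)
```

The 7.x-style document trips on `user_agent.os` first, which matches the field named in the original MapperParsingException.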

I will now play around with option 4.

We also have the problem the other way around, with Filebeat 7.x sending data to 6.x: #10655

ruflin added a commit to ruflin/beats that referenced this issue Feb 11, 2019
This is an attempt to solve elastic#10650

The conflicting fields are

* os.*
* device.*

These fields are renamed to the old fields if they exist and then the generated key is removed. Right now I check if any of the os_* fields exists. It would be nicer to check whether `os` and `device` are objects, but I haven't yet found how to do this in Painless.
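
The renaming described in this commit message can be sketched outside of Painless. The Python below is only an illustration, not the code from the commit: `to_6x_shape` is a hypothetical helper, and the exact field correspondence (os.name to os_name, os.full to os, device.name to device) is an assumption about the two processor formats. It uses the "is it an object" check that the message says would be preferable.

```python
def to_6x_shape(user_agent):
    """Flatten 7.x-style `os` and `device` objects into 6.x-style keywords.

    The field correspondence used here is an assumption for illustration.
    """
    ua = dict(user_agent)  # shallow copy so the caller's dict is untouched

    os_value = ua.get("os")
    if isinstance(os_value, dict):
        ua.pop("os")  # drop the generated object key
        if "name" in os_value:
            ua["os_name"] = os_value["name"]
        if "full" in os_value:
            ua["os"] = os_value["full"]

    device = ua.get("device")
    if isinstance(device, dict):
        ua.pop("device")  # drop the generated object key
        if "name" in device:
            ua["device"] = device["name"]

    return ua


ua_7x = {"name": "Firefox",
         "os": {"name": "Windows", "full": "Windows 7"},
         "device": {"name": "Other"}}
ua_6x = {"name": "Firefox", "os_name": "Windows",
         "os": "Windows 7", "device": "Other"}

converted = to_6x_shape(ua_7x)
unchanged = to_6x_shape(ua_6x)
```

Data that is already in the 6.x shape passes through untouched, so the same step could run against either Elasticsearch version.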
@ruflin
Member

ruflin commented Feb 11, 2019

Here is a first attempt to solve this with renaming of fields for apache.access logs: #10661

@jakelandis We should also discuss keeping the ecs flag around in 7.x, as the approach above seems pretty unstable.

@bleskes

bleskes commented Feb 11, 2019

@ruflin, if I understand correctly, the user agent processor in 7.0 currently only supports the JSON format that Beats 7.0 ships and breaks on 6.7 data. That's a no-go from our upgrade perspective (upgrade ES first, then Kibana, then the data shippers), so we indeed need to fix this. Also, this means that the 6.7 structure is really different and that we can detect it and do something else in the user agent ingest processor to support the 6.7 formats. This is basically what you mean with option 3, right? (Apologies, but I'm not familiar with the details of the specific field you mentioned.) If so, I'm +1 on that direction. Also, ideally, the 7.0 ingest processor will produce ECS compatible documents, even if it starts with the 6.7 format.

@ruflin
Member

ruflin commented Feb 11, 2019

As an example, the 6.7 user_agent processor creates the field device. The 7.0 processor creates the field device.name. This means we have a keyword field conflicting with an object field. More fields have the same problem, for example those under os.*.

The Filebeat indices are versioned per Beat version. Upgrading Elasticsearch to 7.0 will mean the Beat still ingests to the same index and the type cannot change.

Proposal 3 has become obsolete and is now the same as option 4 because more fields are affected than just .original (see notes above). This leaves us with 2 options:

  1. Filebeat ingest processor "detects" that 7.x user_agent processor was used and converts the data to be compatible with 6.x
  2. Elasticsearch 7.x has the same user_agent processor as in 6.7 but ecs is set to true instead of false by default. It could still be deprecated.

The outcome on the data structure side is very similar. I would also like to discuss option 2. I could see this helping also other users upgrading. The main downside is that we have leftover code in ES 7.

@ruflin ruflin self-assigned this Feb 11, 2019
@jakelandis

> The Filebeat indices are versioned per Beat version

Is it possible for Filebeat 6.x to detect that it's running against a 7.x cluster and use a slightly different index name to avoid mapping errors?

My concern with option 2 is that there is no motivation to start using the ECS version other than a mildly annoying deprecation warning. If we went with option 2, when 8.0.0 comes out, would we have this same conversation w.r.t. removing the deprecated flag?

@bleskes

bleskes commented Feb 11, 2019

@jakelandis when we have 8.0 come out we could remove the flag as it has been deprecated and all the beats that are eligible to speak to 8.0 will not be setting it at all.

@jakelandis

@bleskes Do all 6.x versions of Beats need to talk with 7.0 ES? (Or just Beats 6.7?)

@jakelandis

Spoke with @jasontedor and cleared up a few things in person. We will add the flag and functionality to 7.0.0 (deprecated), and leave it removed for 8.0.0.

@jakelandis

@ruflin - If I change the default value to true in 7.0, will that still work for you?

So it would be:
6.7 : default false
7.x : default true
8.0 : gone (with true behavior)
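
The version table above can be read as a small decision function. The Python below is a sketch of that reading only: `effective_ecs` is a hypothetical helper and not part of Elasticsearch or Beats, and versions older than 6.7 are deliberately left out.

```python
def effective_ecs(es_version, configured=None):
    """Return whether user_agent output is ECS-shaped for a given ES version.

    `es_version` is a (major, minor) tuple; `configured` is the value of the
    `ecs` option in the pipeline, or None when the option is not set.
    """
    major, _minor = es_version
    if major >= 8:
        # The option is gone; ECS output is the only behavior.
        if configured is not None:
            raise ValueError("ecs option is no longer supported")
        return True
    if configured is not None:
        # An explicit setting wins over the version default.
        return configured
    # 6.7 defaults to false, 7.x defaults to true.
    return major >= 7


default_67 = effective_ecs((6, 7))
default_70 = effective_ecs((7, 0))
pinned_70 = effective_ecs((7, 0), configured=False)
default_80 = effective_ecs((8, 0))
```

The `pinned_70` case is the one that matters for Filebeat 6.7: an explicit false keeps the 6.x data structure even after Elasticsearch is upgraded to 7.x.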

jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Feb 12, 2019
…8115)"

This reverts commit 5b008a3.

Related: elastic/beats#10650

Will replace this commit with the 6.7 version
jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Feb 12, 2019
elastic#37984)"

This reverts commit cac6b8e.

Related: elastic/beats#10650

Will replace this commit with the 6.7 version
@ruflin
Member

ruflin commented Feb 12, 2019

@jakelandis Perfect, this is exactly as expected.

ruflin added a commit to ruflin/beats that referenced this issue Feb 12, 2019
To make sure the same data structure is ingested in Elasticsearch 6.7 and 7.0 when running Filebeat 6.7, the user_agent processor flag `ecs: false` must be set. Otherwise the data structure would change and data structure conflicts would happen (see elastic#10650).

This change requires Elasticsearch to support the `ecs: false` flag in 7.x.

Adding the `ecs` flag will mean Filebeat 6.7 stops working with Elasticsearch 6.5 or older, as the flag is not supported there.
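
To show why the flag breaks older clusters, here is a Python sketch of a pipeline definition with `ecs: false` and a strict option check of the kind described earlier in the thread. The pipeline is written as a Python dict for illustration, the iis field names are taken from the log in the issue description, and `validate_user_agent`, `accepted_options` and the error wording are hypothetical.

```python
# Options assumed for the user_agent processor, apart from `ecs`.
BASE_OPTIONS = {"field", "target_field", "regex_file",
                "properties", "ignore_missing"}


def accepted_options(supports_ecs):
    """Return the option names a cluster accepts for user_agent."""
    return (BASE_OPTIONS | {"ecs"}) if supports_ecs else set(BASE_OPTIONS)


def validate_user_agent(config, supports_ecs):
    """Reject unknown options, as a strict processor factory would."""
    unknown = sorted(set(config) - accepted_options(supports_ecs))
    if unknown:
        raise ValueError(
            "unsupported configuration parameters: " + ", ".join(unknown))


# Sketch of the relevant part of a Filebeat 6.7 pipeline with the flag set.
pipeline = {
    "description": "Pipeline for parsing IIS access logs (sketch)",
    "processors": [
        {"user_agent": {"field": "iis.access.agent",
                        "target_field": "iis.access.user_agent",
                        "ecs": False}},
    ],
}

ua_config = pipeline["processors"][0]["user_agent"]

# A cluster that knows the flag accepts the pipeline.
validate_user_agent(ua_config, supports_ecs=True)

# A cluster that predates the flag rejects it.
try:
    validate_user_agent(ua_config, supports_ecs=False)
    old_cluster_error = None
except ValueError as exc:
    old_cluster_error = str(exc)
```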
ruflin added a commit that referenced this issue Feb 12, 2019
To make sure the same data structure is ingested in Elasticsearch 6.7 and 7.0 when running Filebeat 6.7, the user_agent processor flag `ecs: false` must be set. Otherwise the data structure would change and data structure conflicts would happen (see #10650).

This change requires Elasticsearch to support the `ecs: false` flag in 7.x.

Adding the `ecs` flag will mean Filebeat 6.7 stops working with Elasticsearch 6.5 or older, as the flag is not supported there.
@ruflin
Member

ruflin commented Feb 12, 2019

#10688 was merged and elastic/elasticsearch#38757 seems to be almost ready. I tested the two together and it seems to work as expected. I will keep this issue open until elastic/elasticsearch#38757 is also merged.

@jakelandis

elastic/elasticsearch#38757 has been merged into 7.0 branch and will make the 7.0.0-rc1 release. It will soon be merged to the 7.x branch for inclusion in 7.1.

@bleskes

bleskes commented Feb 13, 2019

@jakelandis many 🙏

@ruflin
Member

ruflin commented Feb 15, 2019

Closing as all related PRs were merged.

@ruflin ruflin closed this as completed Feb 15, 2019