[Filebeat] Duplicating events in log rotation when output is down #17963
Labels
bug
Filebeat
Stalled
Team:Services
Version: 7.5.0
Description:
When the Filebeat output is unavailable for a long period of time and the logs rotate during that time, duplicate events appear.
All the rotated log files (my-server.log.1, my-server.log.2, ...) are read again.
Configuration:
Filebeat input: reading from a rotating log file.
Filebeat output: for example, Logstash.
Steps 2.1.3, 2.2.2, and 2.3.2 block until the output is up again.
When the output is down and the logs are rotated, the input is blocked in the "State update phase" (e.g. 2.1.3 or 2.2.2). Once the output is up again, the input continues with the "Cleanup phase", which detects that the current on-disk state no longer matches the internal state, so the states are removed.
Normally the cleanup phase is expected to run right after the state collection phase. But because the input was blocked, it continued the state cleanup with very old, outdated state.
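The effect described above can be reproduced with a toy model. This is only a sketch and not Filebeat's actual code: the `State` class, the three helper functions, the integer file ids, and the offsets are all made up for illustration. It assumes a state remembers the path a file had when it was last updated, and that cleanup drops any state whose recorded path no longer holds that file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    path: str    # path the file had when the state was last updated
    offset: int  # bytes already shipped


def update_paths(states, disk):
    """State update phase: follow each file identity to its current path."""
    path_of = {file_id: path for path, file_id in disk.items()}
    return {
        fid: State(path_of.get(fid, st.path), st.offset)
        for fid, st in states.items()
    }


def cleanup(states, disk):
    """Cleanup phase: drop states whose recorded path no longer holds that file."""
    return {fid: st for fid, st in states.items() if disk.get(st.path) == fid}


def start_offsets(states, disk):
    """Next scan: a file without a state is treated as new and read from 0."""
    return {
        path: states[fid].offset if fid in states else 0
        for path, fid in disk.items()
    }


# Hypothetical registry before the outage: file ids 1 and 2 are partly shipped.
states = {1: State("my-server.log", 500), 2: State("my-server.log.1", 1000)}

# Disk after one rotation: the old files moved down one slot, id 3 is new.
disk = {"my-server.log": 3, "my-server.log.1": 1, "my-server.log.2": 2}

# Expected order: update, then cleanup. The offsets survive the rotation.
healthy = start_offsets(cleanup(update_paths(states, disk), disk), disk)

# Blocked update: cleanup compares outdated paths with the current disk,
# so both states are dropped and the rotated files restart at offset 0.
blocked = start_offsets(cleanup(states, disk), disk)

print(healthy)
print(blocked)
```

In the `blocked` case every file starts again at offset 0, which corresponds to the rotated files being read again.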
Workaround:
If the output has to be stopped for a long period of time and duplicates must be avoided, see:
https://www.elastic.co/blog/logstash-lessons-handling-duplicates
https://www.elastic.co/blog/efficient-duplicate-prevention-for-event-based-data-in-elasticsearch
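A common way to make re-sent events harmless is to derive a deterministic ID from each event, so that a store which overwrites documents with the same ID keeps only one copy. The sketch below illustrates that general idea under stated assumptions: `event_id`, the choice of fields, the sample events, and the plain dict standing in for the index are all hypothetical. The path is deliberately left out of the key, because rotation renames the file.

```python
import hashlib


def event_id(host, offset, message):
    """Build a deterministic ID from fields that survive log rotation."""
    # The unit separator keeps adjacent fields from running together.
    key = "\x1f".join([host, str(offset), message])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# Made-up sample events: (host, byte offset in the file, log line).
events = [
    ("web-1", 0, "started"),
    ("web-1", 8, "ready"),
]

# A dict stands in for a store that overwrites documents sharing an ID.
index = {}

# The doubled list models the rotated file being read a second time.
for host, offset, message in events + events:
    index[event_id(host, offset, message)] = {"host": host, "message": message}

print(len(index))
```

This key would also merge two different files that contain the same line at the same offset, so a real setup would likely add a field that is unique per file and stable across renames.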
Related GitHub issue:
Filebeat input v2 API #15324