Migrate uses of the logfile input to filestream #2518

Open
80 tasks
kvch opened this issue Jan 12, 2022 · 4 comments
Labels
draft Draft Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team [elastic/elastic-agent-data-plane] technical-debt

Comments

@kvch
Contributor

kvch commented Jan 12, 2022

Goal

The goal of this issue is to migrate existing packages that rely on the log (logfile) input to filestream. Updating a package must be backwards compatible, and the change in the integration package should be hidden from users.

The only user-visible change should be the value of input.type in the event from log to filestream.

Why migrate?

The new filestream input has replaced the good old log (a.k.a. logfile) input in Beats. The filestream input has been GA since 7.16, and logfile was deprecated at the same time. Over the last few releases we have added numerous bug fixes to the new input, and we are now working on enhancements. It is stable enough for adoption in Integrations.

It comes with several improvements over the old input: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html#filebeat-input-filestream

Differences

There are several differences in the configuration of the inputs:

How to migrate integrations?

Some of the changes might be automated, for example renaming close_removed to close.on_state_change.removed. Other options require manual checking and adjustment, e.g. the parsing of lines. There are also new options, such as include_files, the counterpart of exclude_files. These should be evaluated to see whether existing configurations could be improved.

General steps for migrating a package

  • Does it include options that can be replaced?
    • exclude_files -> prospector.scanner.exclude_files
    • recursive_glob -> prospector.scanner.recursive_glob
    • symlinks -> prospector.scanner.symlinks
    • scan_interval -> prospector.scanner.interval
    • close_inactive -> close.on_state_change.inactive
    • close_renamed -> close.on_state_change.renamed
    • close_removed -> close.on_state_change.removed
    • close_eof -> close.reader.eof
    • close_timeout -> close.reader.after_interval
    • harvester_buffer_size -> buffer_size
    • max_bytes -> message_max_bytes
  • Rewrite parsing configuration
    • all parsers must go under parsers option
    • order of parsing can be changed
    • json is renamed to ndjson
    • new parser called container is preferred over container input
  • Can any of the new options be adopted?
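
The automatable part of the steps above could be sketched roughly as follows. This is only an illustration, not existing tooling: the function name `migrate_options` and the flat, dotted-key dict representation of an input config are assumptions, and the rename table is copied as-is from the list in this issue, so each name should be checked against the Filebeat reference for the target version. Parser settings (multiline, json) are deliberately left untouched because they need manual review.

```python
# Renames copied from the list in this issue (log option -> filestream option).
# Assumption: verify every name against the Filebeat reference before use.
OPTION_RENAMES = {
    "exclude_files": "prospector.scanner.exclude_files",
    "recursive_glob": "prospector.scanner.recursive_glob",
    "symlinks": "prospector.scanner.symlinks",
    "scan_interval": "prospector.scanner.interval",
    "close_inactive": "close.on_state_change.inactive",
    "close_renamed": "close.on_state_change.renamed",
    "close_removed": "close.on_state_change.removed",
    "close_eof": "close.reader.eof",
    "close_timeout": "close.reader.after_interval",
    "harvester_buffer_size": "buffer_size",
    "max_bytes": "message_max_bytes",
}


def migrate_options(log_config):
    """Return a copy of a flat log-input config using filestream option names.

    Hypothetical helper: keys without a known rename are copied unchanged,
    so parser-related settings still have to be reviewed by hand.
    """
    migrated = {}
    for key, value in log_config.items():
        if key == "type" and value == "log":
            # The only intended user-visible change: the input type.
            migrated[key] = "filestream"
        else:
            migrated[OPTION_RENAMES.get(key, key)] = value
    return migrated


if __name__ == "__main__":
    old = {
        "type": "log",
        "paths": ["/var/log/apache2/access.log*"],
        "exclude_files": [r"\.gz$"],
        "close_removed": True,
    }
    for key, value in migrate_options(old).items():
        print(f"{key}: {value}")
```

Keeping the renames in a plain table makes it easy to correct or extend individual entries without touching the migration logic.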

How to migrate on Filebeat side?

If someone has been using, for example, the Apache integration and updates to the new version, the input change must not be visible to them. Upgrading the package must not cause the monitored files to be re-read from the beginning. State information from the log input has to be passed to the filestream input so that it can continue where the log input left off. Given that position tracking is similar in both inputs, changing the state ID from log::{id}::{device}-{inode} to filestream::{id}::{device}-{inode} should work.
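
A minimal sketch of the state ID rewrite described above, assuming state IDs are plain strings in exactly the log::{id}::{device}-{inode} form given in this issue. The function name `to_filestream_state_id` is hypothetical, and how the IDs are actually stored in the registry is not covered here.

```python
LOG_PREFIX = "log::"
FILESTREAM_PREFIX = "filestream::"


def to_filestream_state_id(state_id):
    """Rewrite a log-input state ID to the filestream form.

    Assumes the ID looks like log::{id}::{device}-{inode}, as described
    in this issue; only the leading input name is replaced.
    """
    if not state_id.startswith(LOG_PREFIX):
        raise ValueError(f"not a log input state ID: {state_id!r}")
    return FILESTREAM_PREFIX + state_id[len(LOG_PREFIX):]


if __name__ == "__main__":
    print(to_filestream_state_id("log::apache-access::2049-131073"))
```

Only the prefix is replaced, so an {id} that happens to contain the text "log::" is left untouched.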

Tasks

Beats side migration

  • support for reading state information of log input from filestream input

Packages

This is the list of packages that use logfile input to collect data.

  • activemq
  • apache
  • atlassian_bitbucket
  • atlassian_confluence
  • atlassian_jira
  • auditd
  • barracuda
  • bbot
  • bluecoat
  • carbonblack_edr
  • cassandra
  • cef
  • checkpoint
  • cisco_aironet
  • cisco_asa
  • cisco_ftd
  • cisco_ios
  • cisco_meraki
  • cisco_secure_email_gateway
  • citrix_adc
  • citrix_waf
  • crowdstrike
  • cyberarkpas
  • cylance
  • elasticsearch
  • f5
  • falco
  • fireeye
  • forcepoint_web
  • fortinet_forticlient
  • fortinet_fortiedr
  • fortinet_fortigate
  • haproxy
  • hashicorp_vault
  • ibmmq
  • iis
  • infoblox_nios
  • iptables
  • kafka
  • kibana
  • logstash
  • mattermost
  • microsoft_defender_endpoint
  • microsoft_dhcp
  • microsoft_exchange_online_message_trace
  • microsoft_sqlserver
  • modsecurity
  • mongodb
  • mysql
  • nats
  • netscout
  • nginx
  • opencanary
  • oracle_weblogic
  • osquery
  • panw
  • platform_observability
  • postgresql
  • pps
  • rabbitmq
  • radware
  • redis
  • santa
  • snort
  • sonicwall_firewall
  • sophos
  • squid
  • stan
  • suricata
  • symantec_endpoint
  • system
  • thycotic_ss
  • ti_recordedfuture
  • tomcat
  • traefik
  • zeek
kvch changed the title from "Use filestream input in packages" to "Drop logfile input" on Jan 12, 2022
kvch added the draft label on Jan 14, 2022
cmacknz added the Team:Elastic-Agent-Data-Plane label on Jun 16, 2022
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

cmacknz added the bug label on Jun 16, 2022
@cmacknz
Member

cmacknz commented Jun 16, 2022

I've updated the list of integrations using the logfile input type based on usage as of today.

@botelastic

botelastic bot commented Jun 16, 2023

Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

botelastic bot added the Stalled label on Jun 16, 2023
@mbudge

mbudge commented Oct 30, 2023

Please make sure the following options are available in the custom logs integration:

scan_frequency
ignore_older
close_inactive
harvester_limit
prospector.scanner.include_files

botelastic bot removed the Stalled label on Oct 30, 2023
cmacknz changed the title from "Drop logfile input" to "Migrate uses of the logfile input to filestream" on Aug 15, 2024
flexitrev self-assigned this on Feb 4, 2025