`encoding`

The file encoding to use for reading data that contains international characters. See the encoding names recommended by the W3C for use in HTML5.

Valid encodings:

plain: plain ASCII encoding
utf-8 or utf8: UTF-8 encoding
gbk: simplified Chinese characters
iso8859-6e: ISO8859-6E, Latin/Arabic
iso8859-6i: ISO8859-6I, Latin/Arabic
iso8859-8e: ISO8859-8E, Latin/Hebrew
iso8859-8i: ISO8859-8I, Latin/Hebrew
iso8859-1: ISO8859-1, Latin-1
iso8859-2: ISO8859-2, Latin-2
iso8859-3: ISO8859-3, Latin-3
iso8859-4: ISO8859-4, Latin-4
iso8859-5: ISO8859-5, Latin/Cyrillic
iso8859-6: ISO8859-6, Latin/Arabic
iso8859-7: ISO8859-7, Latin/Greek
iso8859-8: ISO8859-8, Latin/Hebrew
iso8859-9: ISO8859-9, Latin-5
iso8859-10: ISO8859-10, Latin-6
iso8859-13: ISO8859-13, Latin-7
iso8859-14: ISO8859-14, Latin-8
iso8859-15: ISO8859-15, Latin-9
iso8859-16: ISO8859-16, Latin-10
cp437: IBM CodePage 437
cp850: IBM CodePage 850
cp852: IBM CodePage 852
cp855: IBM CodePage 855
cp858: IBM CodePage 858
cp860: IBM CodePage 860
cp862: IBM CodePage 862
cp863: IBM CodePage 863
cp865: IBM CodePage 865
cp866: IBM CodePage 866
ebcdic-037: IBM CodePage 037
ebcdic-1040: IBM CodePage 1140
ebcdic-1047: IBM CodePage 1047
koi8r: KOI8-R, Russian (Cyrillic)
koi8u: KOI8-U, Ukrainian (Cyrillic)
macintosh: Macintosh encoding
macintosh-cyrillic: Macintosh Cyrillic encoding
windows1250: Windows1250, Central and Eastern European
windows1251: Windows1251, Russian, Serbian (Cyrillic)
windows1252: Windows1252, Legacy
windows1253: Windows1253, Modern Greek
windows1254: Windows1254, Turkish
windows1255: Windows1255, Hebrew
windows1256: Windows1256, Arabic
windows1257: Windows1257, Estonian, Latvian, Lithuanian
windows1258: Windows1258, Vietnamese
windows874: Windows874, ISO/IEC 8859-11, Latin/Thai
utf-16-bom: UTF-16 with required BOM
utf-16be-bom: big endian UTF-16 with required BOM
utf-16le-bom: little endian UTF-16 with required BOM

The plain encoding is special, because it does not validate or transform any input.
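
As a minimal sketch, the following sets the encoding for an input that reads little-endian UTF-16 files with a byte order mark. The choice of utf-16le-bom is only illustrative; any name from the list above can be used in its place.

```yaml
{beatname_lc}.inputs:
- type: {type}
  ...
  encoding: utf-16le-bom
```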

`exclude_lines`

A list of regular expressions to match the lines that you want {beatname_uc} to exclude. {beatname_uc} drops any lines that match a regular expression in the list. By default, no lines are dropped. Empty lines are ignored.

The following example configures {beatname_uc} to drop any lines that start with DBG:

{beatname_lc}.inputs:
- type: {type}
  ...
  exclude_lines: ['^DBG']

See [regexp-support] for a list of supported regexp patterns.

`include_lines`

A list of regular expressions to match the lines that you want {beatname_uc} to include. {beatname_uc} exports only the lines that match a regular expression in the list. By default, all lines are exported. Empty lines are ignored.

The following example configures {beatname_uc} to export any lines that start with ERR or WARN:

{beatname_lc}.inputs:
- type: {type}
  ...
  include_lines: ['^ERR', '^WARN']

Note

If both include_lines and exclude_lines are defined, {beatname_uc} executes include_lines first and then executes exclude_lines. The order in which the two options appear in the config file doesn't matter: include_lines always runs before exclude_lines.

The following example exports all log lines that contain sometext, except for lines that begin with DBG (debug messages):

{beatname_lc}.inputs:
- type: {type}
  ...
  include_lines: ['sometext']
  exclude_lines: ['^DBG']

See [regexp-support] for a list of supported regexp patterns.

`buffer_size`

The size in bytes of the buffer that each harvester uses when fetching a file. The default is 16384.

`message_max_bytes`

The maximum number of bytes that a single log message can have. All bytes after message_max_bytes are discarded and not sent. The default is 10MB (10485760).
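
The buffer_size and message_max_bytes settings can be tuned together. The sketch below doubles both defaults; the values are illustrative assumptions, not recommendations.

```yaml
{beatname_lc}.inputs:
- type: {type}
  ...
  buffer_size: 32768           # 32 KiB read buffer per harvester (default is 16384)
  message_max_bytes: 20971520  # 20 MiB per message (default is 10485760)
```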

`parsers`

This option expects a list of parsers that the log line has to go through.

Available parsers:

multiline
ndjson
container
syslog

In this example, {beatname_uc} is reading multiline messages that consist of 3 lines and are encapsulated in single-line JSON objects. The multiline message is stored under the key msg.

{beatname_lc}.inputs:
- type: {type}
  ...
  parsers:
    - ndjson:
        target: ""
        message_key: msg
    - multiline:
        type: count
        count_lines: 3

See the available parser settings in detail below.

`multiline`

Options that control how {beatname_uc} deals with log messages that span multiple lines. See [multiline-examples] for more information about configuring multiline options.

`ndjson`

These options make it possible for {beatname_uc} to decode logs structured as JSON messages. {beatname_uc} processes the logs line by line, so the JSON decoding only works if there is one JSON object per message.

The decoding happens before line filtering. You can combine JSON decoding with filtering if you set the message_key option. This can be helpful in situations where the application logs are wrapped in JSON objects, like when using Docker.

Example configuration:

- ndjson:
    target: ""
    add_error_key: true
    message_key: log

target: The name of the new JSON object that should contain the parsed key value pairs. If you leave it empty, the new keys will go under root.
overwrite_keys: Values from the decoded JSON object overwrite the fields that {beatname_uc} normally adds (type, source, offset, etc.) in case of conflicts. Disable it if you want to keep previously added values.
expand_keys: If this setting is enabled, {beatname_uc} will recursively de-dot keys in the decoded JSON, and expand them into a hierarchical object structure. For example, {"a.b.c": 123} would be expanded into {"a":{"b":{"c":123}}}. This setting should be enabled when the input is produced by an ECS logger.
add_error_key: If this setting is enabled, {beatname_uc} adds an "error.message" and "error.type: json" key in case of JSON unmarshalling errors or when a message_key is defined in the configuration but cannot be used.
message_key: An optional configuration setting that specifies a JSON key on which to apply the line filtering and multiline settings. If specified, the key must be at the top level in the JSON object and the value associated with the key must be a string. Otherwise, no filtering or multiline aggregation will occur.
document_id: An optional configuration setting that specifies the JSON key to use as the document ID. If configured, the field is removed from the original JSON document and stored in @metadata._id.
ignore_decoding_error: An optional configuration setting that specifies if JSON decoding errors should be logged or not. If set to true, errors will not be logged. The default is false.
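
As a sketch of how these options combine, the following configuration assumes the input is written by an ECS logger that emits one JSON object per line with dotted keys. That scenario is an assumption made for illustration.

```yaml
- ndjson:
    target: ""            # place the decoded keys under root
    expand_keys: true     # {"a.b.c": 123} becomes {"a":{"b":{"c":123}}}
    overwrite_keys: true  # decoded values replace conflicting fields
    add_error_key: true   # add error.message and error.type on decoding errors
```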

`container`

Use the container parser to extract information from container log files. It parses lines into common message lines and also extracts timestamps.

stream: Reads from the specified streams only: all, stdout or stderr. The default is all.
format: Use the given format when parsing logs: auto, docker or cri. The default is auto, which detects the format automatically. To disable autodetection, set one of the other options.

The following snippet configures {beatname_uc} to read the stdout stream from all containers under the default Kubernetes logs path:

  paths:
    - "/var/log/containers/*.log"
  parsers:
    - container:
        stream: stdout
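
To skip format autodetection, set format explicitly. The sketch below assumes a container runtime that writes CRI-formatted logs; the runtime is an assumption, and the path is the same one used in the snippet above.

```yaml
  paths:
    - "/var/log/containers/*.log"
  parsers:
    - container:
        stream: all   # read both stdout and stderr
        format: cri   # parse as CRI, no autodetection
```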

`syslog`

The syslog parser parses RFC 3164 and/or RFC 5424 formatted syslog messages.

The supported configuration options are:

format: (Optional) The syslog format to use: rfc3164 or rfc5424. To automatically detect the format from the log entries, set this option to auto. The default is auto.
timezone: (Optional) IANA time zone name (e.g. America/New_York) or a fixed time offset (e.g. +0200) to use when parsing syslog timestamps that do not contain a time zone. Local may be specified to use the machine’s local time zone. Defaults to Local.
log_errors: (Optional) If true, the parser will log syslog parsing errors. Defaults to false.
add_error_key: (Optional) If this setting is enabled, the parser adds or appends to an error.message key with the parsing error that was encountered. Defaults to true.

Example configuration:

- syslog:
    format: rfc3164
    timezone: America/Chicago
    log_errors: true
    add_error_key: true
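
The timezone option also accepts a fixed offset instead of an IANA name. A minimal sketch, assuming messages from hosts that log in UTC+2 without a time zone in the timestamp. The offset is quoted so that a YAML parser reads it as a string rather than a number.

```yaml
- syslog:
    format: auto        # detect rfc3164 or rfc5424 per log entry
    timezone: "+0200"   # fixed offset applied when the timestamp has no zone
    add_error_key: true
```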

Timestamps

The RFC 3164 format accepts the following forms of timestamps:

Local timestamp (Mmm dd hh:mm:ss):
- Jan 23 14:09:01
RFC-3339*:
- 2003-10-11T22:14:15Z
- 2003-10-11T22:14:15.123456Z
- 2003-10-11T22:14:15-06:00
- 2003-10-11T22:14:15.123456-06:00

Note: The local timestamp (for example, Jan 23 14:09:01) that accompanies an RFC 3164 message lacks year and time zone information. The time zone is filled in from the timezone configuration option, and the year is filled in from the local time of the system running {beatname_uc} (accounting for time zones). Because of this, messages can appear to come from the future. For example, logs generated on December 31, 2021 and ingested on January 1, 2022 are enriched with the year 2022 instead of 2021.

The RFC 5424 format accepts the following forms of timestamps:

RFC-3339:
- 2003-10-11T22:14:15Z
- 2003-10-11T22:14:15.123456Z
- 2003-10-11T22:14:15-06:00
- 2003-10-11T22:14:15.123456-06:00

Formats with an asterisk (*) are a non-standard allowance.

`include_message`

Use the include_message parser to filter messages in the parsers pipeline. Messages that match the provided pattern are passed to the next parser, the others are dropped.

Use include_message instead of include_lines if you want to control when the filtering happens: include_lines runs after the parsers, whereas include_message runs inside the parsers pipeline.

patterns: List of regexp patterns to match.

This example shows you how to include messages that start with the string ERR or WARN:

  paths:
    - "/var/log/containers/*.log"
  parsers:
    - include_message.patterns: ["^ERR", "^WARN"]
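
Because include_message is itself a parser, its position in the list should determine which form of the message it is matched against. The sketch below reuses the count-based multiline settings from the parsers example earlier in this section: it first aggregates three lines into one message and then keeps only aggregated messages that start with ERR. The combination is an illustrative assumption, not a recommended setup.

```yaml
  paths:
    - "/var/log/containers/*.log"
  parsers:
    - multiline:
        type: count
        count_lines: 3
    # runs after multiline, so the pattern is applied to the aggregated message
    - include_message.patterns: ["^ERR"]
```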
