From aef1ef073c8beb05018456f2c345eeef38fdb41e Mon Sep 17 00:00:00 2001 From: Brandon Morelli Date: Mon, 31 Aug 2020 17:07:47 -0700 Subject: [PATCH] docs: Add `processor.event` info to Logstash output (#20721) --- .../outputs/logstash/docs/logstash.asciidoc | 113 ++++++++++-------- 1 file changed, 62 insertions(+), 51 deletions(-) diff --git a/libbeat/outputs/logstash/docs/logstash.asciidoc b/libbeat/outputs/logstash/docs/logstash.asciidoc index 122f1178b2e..910551f9252 100644 --- a/libbeat/outputs/logstash/docs/logstash.asciidoc +++ b/libbeat/outputs/logstash/docs/logstash.asciidoc @@ -5,8 +5,8 @@ Logstash ++++ -The Logstash output sends events directly to Logstash by using the lumberjack -protocol, which runs over TCP. Logstash allows for additional processing and routing of +The {ls} output sends events directly to {ls} by using the lumberjack +protocol, which runs over TCP. {ls} allows for additional processing and routing of generated events. // tag::shared-logstash-config[] @@ -26,11 +26,10 @@ If you want to use {ls} to perform additional processing on the data collected b To do this, edit the {beatname_uc} configuration file to disable the {es} output by commenting it out and enable the {ls} output by uncommenting the -logstash section: +{ls} section: [source,yaml] ------------------------------------------------------------------------------ -#----------------------------- Logstash output -------------------------------- output.logstash: hosts: ["127.0.0.1:5044"] ------------------------------------------------------------------------------ @@ -51,8 +50,8 @@ endif::[] ==== Accessing metadata fields -Every event sent to Logstash contains the following metadata fields that you can -use in Logstash for indexing and filtering: +Every event sent to {ls} contains the following metadata fields that you can +use in {ls} for indexing and filtering: ifndef::apm-server[] ["source","json",subs="attributes"] @@ -65,12 +64,15 @@ ifndef::apm-server[] } } 
------------------------------------------------------------------------------ -<1> {beatname_uc} uses the `@metadata` field to send metadata to Logstash. See the -{logstash-ref}/event-dependent-configuration.html#metadata[Logstash documentation] +<1> {beatname_uc} uses the `@metadata` field to send metadata to {ls}. See the +{logstash-ref}/event-dependent-configuration.html#metadata[{ls} documentation] for more about the `@metadata` field. <2> The default is {beat_default_index_prefix}. To change this value, set the <> option in the {beatname_uc} config file. <3> The current version of {beatname_uc}. + +You can access this metadata from within the {ls} config file to set values +dynamically based on the contents of the metadata. endif::[] ifdef::apm-server[] @@ -85,24 +87,24 @@ ifdef::apm-server[] } } ------------------------------------------------------------------------------ -<1> {beatname_uc} uses the `@metadata` field to send metadata to Logstash. See the -{logstash-ref}/event-dependent-configuration.html#metadata[Logstash documentation] +<1> {beatname_uc} uses the `@metadata` field to send metadata to {ls}. See the +{logstash-ref}/event-dependent-configuration.html#metadata[{ls} documentation] for more about the `@metadata` field. <2> The default is {beat_default_index_prefix}. To change this value, set the <> option in the {beatname_uc} config file. <3> The default pipeline configuration: `apm`. Additional pipelines can be enabled -with a {logstash-ref}/use-ingest-pipelines.html[Logstash pipeline config]. +with a {logstash-ref}/use-ingest-pipelines.html[{ls} pipeline config]. <4> The current version of {beatname_uc}. -endif::[] -You can access this metadata from within the Logstash config file to set values -dynamically based on the contents of the metadata. 
- -For example, the following Logstash configuration file tells -Logstash to use the index reported by {beatname_uc} for indexing events -into Elasticsearch: +In addition to metadata, {beatname_uc} provides the `processor.event` field, which +can be used to separate {apm-overview-ref-v}/apm-data-model.html[event types] into different indices. +endif::[] ifndef::apm-server[] +For example, the following {ls} configuration file tells +{ls} to use the index reported by {beatname_uc} for indexing events +into {es}: + [source,logstash] ------------------------------------------------------------------------------ @@ -126,6 +128,10 @@ the Beat's version. For example: endif::[] ifdef::apm-server[] +For example, the following {ls} configuration file tells +{ls} to use the index and event types reported by {beatname_uc} for indexing events +into {es}: + [source,logstash] ------ input { @@ -156,26 +162,26 @@ output { } ------ <1> Creates a new field named `@metadata.index`. -`%{[@metadata][beat]}` sets the first part of the index name to the value of the `beat` metadata field. +`%{[@metadata][beat]}` sets the first part of the index name to the value of the `metadata.beat` field. `%{[@metadata][version]}` sets the second part to {beatname_uc}'s version. `%{[processor][event]}` sets the final part based on the APM event type. For example: +{beat_default_index_prefix}-{version}-sourcemap+. -<2> In addition to the above rules, this pattern appends a date to the `index` name so Logstash creates a new index each day. +<2> In addition to the above rules, this pattern appends a date to the `index` name so {ls} creates a new index each day. For example: +{beat_default_index_prefix}-{version}-transaction-{sample_date_0}+. endif::[] -Events indexed into Elasticsearch with the Logstash configuration shown here -will be similar to events directly indexed by {beatname_uc} into Elasticsearch. 
+Events indexed into {es} with the {ls} configuration shown here +will be similar to events directly indexed by {beatname_uc} into {es}. ifndef::apm-server[] -NOTE: If ILM is not being used, set `index` to `%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}` instead so Logstash creates an index per day, based on the `@timestamp` value of the events coming from Beats. +NOTE: If ILM is not being used, set `index` to `%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}` instead so {ls} creates an index per day, based on the `@timestamp` value of the events coming from Beats. endif::[] ifdef::apm-server[] -==== Logstash and ILM +==== {ls} and ILM -When used with {apm-server-ref}/ilm.html[Index lifecycle management], Logstash does not need to create a new index each day. -Here's a sample Logstash configuration file that would accomplish this: +When used with {apm-server-ref}/ilm.html[Index lifecycle management], {ls} does not need to create a new index each day. +Here's a sample {ls} configuration file that would accomplish this: [source,logstash] ------ @@ -188,15 +194,20 @@ input { output { elasticsearch { hosts => ["http://localhost:9200"] - index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{[processor][event]}" + index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{[processor][event]}" <1> } } ------ +<1> Outputs documents to an index: +`%{[@metadata][beat]}` sets the first part of the index name to the value of the `metadata.beat` field. +`%{[@metadata][version]}` sets the second part to {beatname_uc}'s version. +`%{[processor][event]}` sets the final part based on the APM event type. +For example: +{beat_default_index_prefix}-{version}-sourcemap+. endif::[] ==== Compatibility -This output works with all compatible versions of Logstash. See the +This output works with all compatible versions of {ls}. See the https://www.elastic.co/support/matrix#matrix_compatibility[Elastic Support Matrix]. 
@@ -220,18 +231,18 @@ endif::[] [[hosts]] ===== `hosts` -The list of known Logstash servers to connect to. If load balancing is disabled, but +The list of known {ls} servers to connect to. If load balancing is disabled, but multiple hosts are configured, one host is selected randomly (there is no precedence). If one host becomes unreachable, another one is selected randomly. -All entries in this list can contain a port number. The default port number 5044 will be used, if no number is given. +All entries in this list can contain a port number. The default port number 5044 will be used if no number is given. ===== `compression_level` The gzip compression level. Setting this value to 0 disables compression. The compression level must be in the range of 1 (best speed) to 9 (best compression). -Increasing the compression level will reduce the network usage but will increase the cpu usage. +Increasing the compression level will reduce the network usage but will increase the CPU usage. The default value is 3. @@ -243,15 +254,15 @@ The default value is `false`. ===== `worker` -The number of workers per configured host publishing events to Logstash. This +The number of workers per configured host publishing events to {ls}. This is best used with load balancing mode enabled. Example: If you have 2 hosts and 3 workers, in total 6 workers are started (3 for each host). [[loadbalance]] ===== `loadbalance` -If set to true and multiple Logstash hosts are configured, the output plugin -load balances published events onto all Logstash hosts. If set to false, +If set to true and multiple {ls} hosts are configured, the output plugin +load balances published events onto all {ls} hosts. If set to false, the output plugin sends all events to only one host (determined at random) and will switch to another host if the selected one becomes unresponsive. The default value is false. 
@@ -265,28 +276,28 @@ output.logstash: ===== `ttl` -Time to live for a connection to Logstash after which the connection will be re-established. -Useful when Logstash hosts represent load balancers. Since the connections to Logstash hosts +Time to live for a connection to {ls} after which the connection will be re-established. +Useful when {ls} hosts represent load balancers. Since the connections to {ls} hosts are sticky, operating behind load balancers can lead to uneven load distribution between the instances. Specifying a TTL on the connection allows to achieve equal connection distribution between the instances. Specifying a TTL of 0 will disable this feature. The default value is 0. -NOTE: The "ttl" option is not yet supported on an async Logstash client (one with the "pipelining" option set). +NOTE: The "ttl" option is not yet supported on an async {ls} client (one with the "pipelining" option set). ===== `pipelining` -Configures number of batches to be sent asynchronously to logstash while waiting -for ACK from logstash. Output only becomes blocking once number of `pipelining` +Configures the number of batches to be sent asynchronously to {ls} while waiting +for ACK from {ls}. Output only becomes blocking once number of `pipelining` batches have been written. Pipelining is disabled if a value of 0 is configured. The default value is 2. ===== `proxy_url` -The URL of the SOCKS5 proxy to use when connecting to the Logstash servers. The +The URL of the SOCKS5 proxy to use when connecting to the {ls} servers. The value must be a URL with a scheme of `socks5://`. The protocol used to -communicate to Logstash is not based on HTTP so a web-proxy cannot be used. +communicate to {ls} is not based on HTTP so a web-proxy cannot be used. If the SOCKS5 proxy server requires client authentication, then a username and password can be embedded in the URL as shown in the example. 
@@ -305,8 +316,8 @@ output.logstash: [[logstash-proxy-use-local-resolver]] ===== `proxy_use_local_resolver` -The `proxy_use_local_resolver` option determines if Logstash hostnames are -resolved locally when using a proxy. The default value is false which means +The `proxy_use_local_resolver` option determines if {ls} hostnames are +resolved locally when using a proxy. The default value is false, which means that when a proxy is used the name resolution occurs on the proxy server. [[logstash-index]] @@ -317,17 +328,17 @@ example +"{beat_default_index_prefix}"+ generates +"[{beat_default_index_prefix} indices (for example, +"{beat_default_index_prefix}-{version}-2017.04.26"+). NOTE: This parameter's value will be assigned to the `metadata.beat` field. It -can then be accessed in Logstash's output section as `%{[@metadata][beat]}`. +can then be accessed in {ls}'s output section as `%{[@metadata][beat]}`. ===== `ssl` -Configuration options for SSL parameters like the root CA for Logstash connections. See +Configuration options for SSL parameters like the root CA for {ls} connections. See <> for more information. To use SSL, you must also configure the https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html[Beats input plugin for Logstash] to use SSL/TLS. ===== `timeout` -The number of seconds to wait for responses from the Logstash server before timing out. The default is 30 (seconds). +The number of seconds to wait for responses from the {ls} server before timing out. The default is 30 (seconds). ===== `max_retries` @@ -346,7 +357,7 @@ endif::[] ===== `bulk_max_size` -The maximum number of events to bulk in a single Logstash request. The default is 2048. +The maximum number of events to bulk in a single {ls} request. The default is 2048. If the Beat sends single events, the events are collected into batches. 
If the Beat publishes a large batch of events (larger than the value specified by `bulk_max_size`), the batch is @@ -364,15 +375,15 @@ number of events to be contained in a batch. ===== `slow_start` -If enabled only a subset of events in a batch of events is transferred per transaction. +If enabled, only a subset of events in a batch of events is transferred per transaction. The number of events to be sent increases up to `bulk_max_size` if no error is encountered. -On error the number of events per transaction is reduced again. +On error, the number of events per transaction is reduced again. The default is `false`. ===== `backoff.init` -The number of seconds to wait before trying to reconnect to Logstash after +The number of seconds to wait before trying to reconnect to {ls} after a network error. After waiting `backoff.init` seconds, {beatname_uc} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset. The @@ -381,4 +392,4 @@ default is 1s. ===== `backoff.max` The maximum number of seconds to wait before attempting to connect to -Logstash after a network error. The default is 60s. +{ls} after a network error. The default is 60s.
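The options touched by this patch can be combined in a single `output.logstash` section. The following is a minimal sketch, not a recommended configuration: the host names are hypothetical placeholders, and every other value is either the default stated in the documentation above or the worker example it gives (2 hosts with 3 workers each).

[source,yaml]
------------------------------------------------------------------------------
output.logstash:
  # Hypothetical hosts; 5044 is the documented default port.
  hosts: ["ls-1.example.com:5044", "ls-2.example.com:5044"]
  loadbalance: true       # default is false; true spreads events over all hosts
  worker: 3               # 3 workers per host, so 6 in total for 2 hosts
  compression_level: 3    # default; 0 disables compression, 9 is best compression
  pipelining: 2           # default; 0 disables pipelining
  bulk_max_size: 2048     # default maximum number of events per request
  slow_start: false       # default
  timeout: 30             # default, in seconds
  backoff.init: 1s        # default wait before the first reconnect attempt
  backoff.max: 60s        # default upper bound for the exponential backoff
------------------------------------------------------------------------------

`ttl` is left out of this sketch because, as noted above, it is not yet supported on a client that has `pipelining` set.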