From 8c0857c477a1b8a900254c5d541effb42e30e6ae Mon Sep 17 00:00:00 2001
From: ruflin
Date: Wed, 27 Sep 2017 12:13:19 +0200
Subject: [PATCH] Change field type of http header from nested to object

The field type of http headers was set to nested instead of object. In metricbeat we normally do not use nested fields. Also nested fields are not compatible with the sorting on index time feature coming in 6.0.

The problem with indexing the headers is that it could lead to field explosion if there are many different headers. An alternative would be to not index the headers. For now my recommendation is if someone has too many headers, filters should be used to remove most of the entries before it is sent to Elasticsearch.
---
 CHANGELOG.asciidoc                         | 1 +
 metricbeat/docs/fields.asciidoc            | 4 ++--
 metricbeat/docs/modules/http.asciidoc      | 2 ++
 metricbeat/module/http/_meta/docs.asciidoc | 2 ++
 metricbeat/module/http/_meta/fields.yml    | 4 ++--
 5 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.asciidoc b/CHANGELOG.asciidoc
index 623a20159e58..5db68a37fd11 100644
--- a/CHANGELOG.asciidoc
+++ b/CHANGELOG.asciidoc
@@ -59,6 +59,7 @@ https://github.com/elastic/beats/compare/v6.0.0-beta2...master[Check the HEAD di
 - Fix kubernetes events module to be able to index time fields properly. {issue}5093[5093]
 - The MongoDB module now connects on each fetch, to avoid stopping the whole Metricbeat instance if MongoDB is not up when starting. {pull}5120[5120]
 - Fixed `cmd_set` and `cmd_get` being mixed in the Memcache module. {pull}5189[5189]
+- Change field type of http header from nested to object {pull}5258[5258]
 
 *Packetbeat*
 
diff --git a/metricbeat/docs/fields.asciidoc b/metricbeat/docs/fields.asciidoc
index 8ce16fa9a436..43462f51ae70 100644
--- a/metricbeat/docs/fields.asciidoc
+++ b/metricbeat/docs/fields.asciidoc
@@ -4039,7 +4039,7 @@ HTTP request information
 [float]
 === `http.request.header`
 
-type: nested
+type: object
 
 The HTTP headers sent
 
@@ -4070,7 +4070,7 @@ HTTP response information
 [float]
 === `http.response.header`
 
-type: nested
+type: object
 
 The HTTP headers received
 
diff --git a/metricbeat/docs/modules/http.asciidoc b/metricbeat/docs/modules/http.asciidoc
index 46cd1f66aaa8..523fcdfbdbf5 100644
--- a/metricbeat/docs/modules/http.asciidoc
+++ b/metricbeat/docs/modules/http.asciidoc
@@ -13,6 +13,8 @@ This module is inspired by the Logstash https://www.elastic.co/guide/en/logstash
 
 This is often necessary in security restricted network setups, where Logstash is not able to reach all servers. Instead the server to be monitored itself has Metricbeat installed and can send the data or a collector server has Metricbeat installed which is deployed in the secured network environment and can reach all servers to be monitored.
 
+NOTE: As the HTTP metricsets also fetch headers, this can lead to lots of fields in Elasticsearch in case there are many different headers. If this is the case for you and you don't need the headers, we recommend to use processors to filter out the header field.
+
 [float]
 === Example configuration
 
diff --git a/metricbeat/module/http/_meta/docs.asciidoc b/metricbeat/module/http/_meta/docs.asciidoc
index 3fd08235c2a9..100525148968 100644
--- a/metricbeat/module/http/_meta/docs.asciidoc
+++ b/metricbeat/module/http/_meta/docs.asciidoc
@@ -7,3 +7,5 @@ Multiple endpoints can be configured which are polled in a regular interval and
 This module is inspired by the Logstash https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http_poller.html[http_poller] input filter but doesn't require that the endpoint is reachable by Logstash as the Metricbeat module pushes the data to the configured output channels, e.g. Logstash or Elasticsearch.
 
 This is often necessary in security restricted network setups, where Logstash is not able to reach all servers. Instead the server to be monitored itself has Metricbeat installed and can send the data or a collector server has Metricbeat installed which is deployed in the secured network environment and can reach all servers to be monitored.
+
+NOTE: As the HTTP metricsets also fetch headers, this can lead to lots of fields in Elasticsearch in case there are many different headers. If this is the case for you and you don't need the headers, we recommend to use processors to filter out the header field.
diff --git a/metricbeat/module/http/_meta/fields.yml b/metricbeat/module/http/_meta/fields.yml
index d158a5e62378..5f7601685634 100644
--- a/metricbeat/module/http/_meta/fields.yml
+++ b/metricbeat/module/http/_meta/fields.yml
@@ -13,7 +13,7 @@
             HTTP request information
           fields:
             - name: header
-              type: nested
+              type: object
               description: >
                 The HTTP headers sent
             - name: method
@@ -30,7 +30,7 @@
             HTTP response information
           fields:
             - name: header
-              type: nested
+              type: object
               description: >
                 The HTTP headers received
             - name: status_code
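
The recommendation in the commit message and in the added NOTE (filter out the headers before they reach Elasticsearch) could look roughly like the fragment below. It is a sketch only and not part of this patch: it assumes the headers are emitted under the documented field names `http.request.header` and `http.response.header`, and that the `drop_fields` processor is configured at the top level of `metricbeat.yml`.

```yaml
# Hypothetical metricbeat.yml fragment (not part of this patch).
# Drops both header objects from every event so that many distinct
# header names cannot create a large number of fields in the index.
processors:
  - drop_fields:
      fields: ["http.request.header", "http.response.header"]
```

Dropping the whole object is the bluntest option. If only a few headers are of interest, it may be preferable to drop individual sub-fields instead, at the cost of having to list them explicitly.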