Clean up main Ingest Management doc directory (#68785)
Showing 7 changed files with 211 additions and 210 deletions.
@@ -1,208 +1,14 @@
[chapter]
[role="xpack"]
[[epm]]
== Ingest Manager
[[xpack-ingest-manager]]
= Ingest Manager

These are the docs for the Ingest Manager.
The {ingest-manager} app in Kibana enables you to add and manage integrations for popular services and platforms, as well as manage {elastic-agent} installations in standalone or {fleet} mode.

[role="screenshot"]
image::ingest_manager/images/ingest-manager-start.png[Ingest Manager App in Kibana]

=== Configuration
[float]
=== Get started

By default, the Elastic Package Manager accesses `epr.elastic.co` to retrieve packages. The URL can be configured with:

```
xpack.epm.registryUrl: 'http://localhost:8080'
```

=== API

The Package Manager offers an API. Here is an example of how it can be used.

List installed packages:

```
curl localhost:5601/api/ingest_manager/epm/packages
```

Install a package:

```
curl -X POST localhost:5601/api/ingest_manager/epm/packages/iptables-1.0.4
```

Delete a package:

```
curl -X DELETE localhost:5601/api/ingest_manager/epm/packages/iptables-1.0.4
```

=== Definitions

This section defines terms used across ingest management.

==== Data Source

A data source is a definition of how to collect data from a service, for example `nginx`. A data source contains definitions for one or multiple inputs, and each input can contain one or multiple streams.

In the nginx example, the Data Source contains two inputs: `logs` and `nginx/metrics`. Logs and metrics are collected differently. The `logs` input contains two streams, `access` and `error`; the `nginx/metrics` input contains the `stubstatus` stream.
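The nesting described above (data source → inputs → streams) can be illustrated with a small Python sketch; the dictionary shape is a hypothetical illustration, not the actual Ingest Manager schema:

```python
# Hypothetical sketch of the nginx data source described above; the field
# names ("name", "inputs", "streams") are illustrative, not a real schema.
nginx_data_source = {
    "name": "nginx",
    "inputs": [
        {"type": "logs", "streams": ["access", "error"]},
        {"type": "nginx/metrics", "streams": ["stubstatus"]},
    ],
}

# Each input can contain one or multiple streams; here there are three total.
stream_count = sum(len(i["streams"]) for i in nginx_data_source["inputs"])
print(stream_count)  # 3
```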

==== Data Stream

Data Streams are a [new concept](https://github.com/elastic/elasticsearch/issues/53100) in Elasticsearch that simplifies ingesting data and the setup of Elasticsearch.

==== Elastic Agent

A single, unified agent that users can deploy to hosts or containers. It controls which data is collected from the host or containers and where the data is sent. It will run Beats, Endpoint, or other monitoring programs as needed. It can operate standalone or pull a configuration policy from Fleet.

==== Elastic Package Registry

The Elastic Package Registry (EPR) is a service that runs at https://epr.elastic.co. It serves the packages through its API. More details about the registry can be found [here](https://github.com/elastic/package-registry).

==== Fleet

Fleet is the part of the Ingest Manager UI in Kibana that handles enrolling Elastic Agents, managing them, and sending configurations to them.

==== Indexing Strategy

Ingest Management and the Elastic Agent follow a strict new indexing strategy: `{type}-{dataset}-{namespace}`. An example of this is `logs-nginx.access-default`. More details can be found in the Indexing Strategy section below. All data following the indexing strategy is sent to Data Streams.

==== Input

An input is the configuration unit in an Agent Config that defines the options for collecting data from an endpoint. Examples are a username / password needed to authenticate with a service, or a host URL.

An input is part of a Data Source and contains streams.

==== Integration

An integration is a package with the type integration. An integration package has at least one data source and usually collects data from or about a service.

==== Namespace

A user-specified string that will be used as part of the index name in Elasticsearch. It helps users identify logs coming from a specific environment (like prod or test), an application, or other identifiers.

==== Package

A package contains all the assets for the Elastic Stack. A more detailed definition of a package can be found under https://github.com/elastic/package-registry.

Besides the assets, a package contains the data source definitions with their inputs and streams.

==== Stream

A stream is a configuration unit in the Elastic Agent config. A stream is part of an input and defines how the data fetched by this input should be processed and which Data Stream to send it to.

== Indexing Strategy

Ingest Management enforces an indexing strategy that allows the system to automatically detect indices and run queries against them. In short, the indexing strategy looks as follows:

```
{dataset.type}-{dataset.name}-{dataset.namespace}
```

The `{dataset.type}` can be `logs` or `metrics`. The `{dataset.namespace}` is the free-form part chosen by the user. The only two requirements are that it contains only characters allowed in an Elasticsearch index name and does NOT contain a `-`. The `dataset` is defined by the data that is indexed; the same requirements as for the namespace apply. The fields for type, dataset, and namespace are expected to be part of each event and to be constant keywords. If a dataset or a namespace contains a `-`, it is recommended to replace it with either a `.` or a `_`.

Note: More `{dataset.type}`s might be added in the future, such as `traces`.

This indexing strategy has a few advantages:

* Each index contains only the fields that are relevant for the dataset. This leads to more dense indices and better field completion.
* ILM policies can be applied per namespace per dataset.
* Rollups can be specified per namespace per dataset.
* Making the namespace user-configurable makes setting security permissions possible.
* Having a global metrics and logs template allows creating new indices on demand that still follow the convention. This is common in the case of k8s, for example.
* Constant keywords allow narrowing down the indices that need to be accessed for a query very efficiently. This is especially relevant in environments with a large number of indices or with indices on slower nodes.

Overall, this creates smaller indices, makes querying more efficient, and allows users to define their own naming parts in the namespace while still benefiting from all features that can be built on top of the indexing strategy.
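The naming rules above can be sketched in Python; `build_index_name` is an illustrative helper under the stated convention, not part of any Elastic tooling:

```python
import re

# Namespace and dataset must not contain '-' (it is the field separator).
ALLOWED = re.compile(r"^[^-]+$")

def build_index_name(dtype: str, dataset: str, namespace: str) -> str:
    """Build a {type}-{dataset}-{namespace} name, applying the recommended
    '-' replacement for dataset and namespace."""
    assert dtype in ("logs", "metrics")  # 'traces' may be added later
    dataset = dataset.replace("-", "_")
    namespace = namespace.replace("-", "_")
    assert ALLOWED.match(dataset) and ALLOWED.match(namespace)
    return f"{dtype}-{dataset}-{namespace}"

print(build_index_name("logs", "nginx.access", "default"))  # logs-nginx.access-default
```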

=== Ingest Pipeline

The ingest pipelines for a specific dataset have the following naming scheme:

```
{dataset.type}-{dataset.name}-{package.version}
```

As an example, the ingest pipeline for the Nginx access logs is called `logs-nginx.access-3.4.1`. The same ingest pipeline is used for all namespaces. It is possible that a dataset has multiple ingest pipelines, in which case a suffix is added to the name.

The version is included in each pipeline name to allow upgrades. The pipeline itself is listed in the index template and is automatically applied at ingest time.

=== Templates & ILM Policies

To make the above strategy possible, alias templates are required. For each type there is a basic alias template with a default ILM policy. These default templates apply to all indices that follow the indexing strategy and do not have a more specific dataset alias template.

The `metrics` and `logs` alias templates contain all the basic fields from ECS.

Each type template contains an ILM policy. Modifying this default ILM policy affects all data covered by the default templates.

The templates for a dataset are named as follows:

```
{dataset.type}-{dataset.name}
```

The pattern used inside the index template is `{type}-{dataset}-*` to match all namespaces.

=== Defaults

If the Elastic Agent is used to ingest data and only the type is specified, `default` is used for the namespace and `generic` for the dataset.

=== Data filtering

Filtering for data in queries, for example in visualizations or dashboards, should always be done on the constant keyword fields. Visualizations needing data for the `nginx.access` dataset should query on `type:logs AND dataset:nginx.access`. As these are constant keywords, the prefiltering is very efficient.

=== Security permissions

Security permissions can be set on different levels. To set special permissions for access to the `prod` namespace, use the following index pattern:

```
/(logs|metrics)-[^-]+-prod-$/
```

To set specific permissions on all logs and metrics indices, the following can be used:

```
/^(logs|metrics)-.*/
```

Todo: The above queries need to be tested.
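The Todo above is warranted. As a rough sanity check with Python's `re` (whose syntax differs from the Lucene-style regex Elasticsearch uses in role definitions, so this is only indicative), the first pattern as written only matches names with a trailing `-`:

```python
import re

# Pattern as written above: note the '-' immediately before the anchor.
prod_pattern = re.compile(r"(logs|metrics)-[^-]+-prod-$")
print(bool(prod_pattern.search("logs-nginx.access-prod")))  # False

# Dropping that trailing '-' (a guess at the intent) matches typical names.
fixed = re.compile(r"(logs|metrics)-[^-]+-prod$")
print(bool(fixed.search("logs-nginx.access-prod")))  # True
```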

== Package Manager

=== Package Upgrades

When upgrading a package across a bugfix or a minor version, no breaking changes should happen. Upgrading a package has the following effects:

* Removal of existing dashboards
* Installation of new dashboards
* Writing new ingest pipelines with the version
* Writing new Elasticsearch alias templates
* Triggering a rollover for all affected indices

The new ingest pipeline is expected to still work with data coming from older configurations. In most cases this means some fields can be missing. For this to work, each event must contain the version of the config / package it is coming from, so that such decisions can be made.

In case of a breaking change in the data structure, the new ingest pipeline is also expected to deal with this change. If there are breaking changes that cannot be dealt with in an ingest pipeline, a new package has to be created.

Each package lists its minimal required agent version. If there are agents enrolled with an older version, the user is notified to upgrade these agents, as otherwise the new configs cannot be rolled out.

=== Generated assets

When a package is installed or upgraded, certain Kibana and Elasticsearch assets are generated from the package contents. These follow the naming conventions explained above (see "Indexing Strategy") and contain configuration for the Elastic Stack that makes ingesting and displaying data work with as little user interaction as possible.

* link:index-templates.asciidoc[Elasticsearch Index Templates]
* Kibana Index Patterns

To get started with Ingest Management, refer to the LINK_TO_INGEST_MANAGEMENT_GUIDE[Ingest Management Guide].
@@ -0,0 +1,24 @@
This document is part of the original drafts for ingest management documentation in `docs/ingest_manager` and may be outdated.
Overall documentation of Ingest Management is now maintained in the `elastic/stack-docs` repository.

# Elastic Package Manager API

The Package Manager offers an API. Here is an example of how it can be used.

List installed packages:

```
curl localhost:5601/api/ingest_manager/epm/packages
```

Install a package:

```
curl -X POST localhost:5601/api/ingest_manager/epm/packages/iptables-1.0.4
```

Delete a package:

```
curl -X DELETE localhost:5601/api/ingest_manager/epm/packages/iptables-1.0.4
```
@@ -0,0 +1,71 @@
This document is part of the original drafts for ingest management documentation in `docs/ingest_manager` and may be outdated.
Overall documentation of Ingest Management is now maintained in the `elastic/stack-docs` repository.

# Ingest Management Definitions

This section defines terms used across ingest management.

## Data Source

A data source is a definition of how to collect data from a service, for example `nginx`. A data source contains definitions for one or multiple inputs, and each input can contain one or multiple streams.

In the nginx example, the Data Source contains two inputs: `logs` and `nginx/metrics`. Logs and metrics are collected differently. The `logs` input contains two streams, `access` and `error`; the `nginx/metrics` input contains the `stubstatus` stream.

## Data Stream

Data Streams are a [new concept](https://github.com/elastic/elasticsearch/issues/53100) in Elasticsearch that simplifies ingesting data and the setup of Elasticsearch.

## Elastic Agent

A single, unified agent that users can deploy to hosts or containers. It controls which data is collected from the host or containers and where the data is sent. It will run Beats, Endpoint, or other monitoring programs as needed. It can operate standalone or pull a configuration policy from Fleet.

## Elastic Package Registry

The Elastic Package Registry (EPR) is a service that runs at https://epr.elastic.co. It serves the packages through its API. More details about the registry can be found [here](https://github.com/elastic/package-registry).

## Fleet

Fleet is the part of the Ingest Manager UI in Kibana that handles enrolling Elastic Agents, managing them, and sending configurations to them.

## Indexing Strategy

Ingest Management and the Elastic Agent follow a strict new indexing strategy: `{type}-{dataset}-{namespace}`. An example of this is `logs-nginx.access-default`. More details can be found in the Indexing Strategy documentation. All data following the indexing strategy is sent to Data Streams.

## Input

An input is the configuration unit in an Agent Config that defines the options for collecting data from an endpoint. Examples are a username / password needed to authenticate with a service, or a host URL.

An input is part of a Data Source and contains streams.

## Integration

An integration is a package with the type integration. An integration package has at least one data source and usually collects data from or about a service.

## Namespace

A user-specified string that will be used as part of the index name in Elasticsearch. It helps users identify logs coming from a specific environment (like prod or test), an application, or other identifiers.

## Package

A package contains all the assets for the Elastic Stack. A more detailed definition of a package can be found under https://github.com/elastic/package-registry.

Besides the assets, a package contains the data source definitions with their inputs and streams.

## Stream

A stream is a configuration unit in the Elastic Agent config. A stream is part of an input and defines how the data fetched by this input should be processed and which Data Stream to send it to.

@@ -0,0 +1,30 @@
This document is part of the original drafts for ingest management documentation in `docs/ingest_manager` and may be outdated.
Overall documentation of Ingest Management is now maintained in the `elastic/stack-docs` repository.

# Package Upgrades

When upgrading a package across a bugfix or a minor version, no breaking changes should happen. Upgrading a package has the following effects:

* Removal of existing dashboards
* Installation of new dashboards
* Writing new ingest pipelines with the version
* Writing new Elasticsearch alias templates
* Triggering a rollover for all affected indices

The new ingest pipeline is expected to still work with data coming from older configurations. In most cases this means some fields can be missing. For this to work, each event must contain the version of the config / package it is coming from, so that such decisions can be made.

In case of a breaking change in the data structure, the new ingest pipeline is also expected to deal with this change. If there are breaking changes that cannot be dealt with in an ingest pipeline, a new package has to be created.

Each package lists its minimal required agent version. If there are agents enrolled with an older version, the user is notified to upgrade these agents, as otherwise the new configs cannot be rolled out.

# Generated assets

When a package is installed or upgraded, certain Kibana and Elasticsearch assets are generated from the package contents. These follow the naming conventions explained above (see "Indexing Strategy") and contain configuration for the Elastic Stack that makes ingesting and displaying data work with as little user interaction as possible.

## Elasticsearch Index Templates

### Generation

* Index templates are generated from `YAML` files contained in the package.
* There is one index template per dataset.
* For the generation of an index template, all `yml` files contained in the package subdirectory `dataset/DATASET_NAME/fields/` are used.
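The lookup rule in the bullets above can be sketched as follows; the directory layout matches the convention stated in the last bullet, while the helper name and the throwaway demo package are illustrative assumptions:

```python
from pathlib import Path
import tempfile

def fields_files_for_dataset(package_dir: Path, dataset: str) -> list[Path]:
    """Collect all yml files under dataset/<DATASET_NAME>/fields/, as described above."""
    return sorted((package_dir / "dataset" / dataset / "fields").glob("*.yml"))

# Demo with a throwaway package layout containing one dataset, 'access'.
pkg = Path(tempfile.mkdtemp())
fields_dir = pkg / "dataset" / "access" / "fields"
fields_dir.mkdir(parents=True)
(fields_dir / "fields.yml").write_text(
    "- name: nginx.access.method\n  type: keyword\n"
)

print([p.name for p in fields_files_for_dataset(pkg, "access")])  # ['fields.yml']
```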