Commit

Add deriving metrics from logs use case to Data Prepper (opensearch-project#6248)

* Add use case to Data Prepper

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Add content

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Copy edits

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update metrics-logs.md

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Co-authored-by: David Venable <dlv@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Co-authored-by: David Venable <dlv@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update metrics-logs.md

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update metrics-logs.md

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Update _data-prepper/common-use-cases/metrics-logs.md

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

---------

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: David Venable <dlv@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Sander van de Geijn <sandervandegeijn@icloud.com>
3 people authored and sandervandegeijn committed Jul 30, 2024
1 parent 089f9dd commit d6cb1da
Showing 1 changed file with 70 additions and 0 deletions.
70 changes: 70 additions & 0 deletions _data-prepper/common-use-cases/metrics-logs.md
@@ -0,0 +1,70 @@
---
layout: default
title: Deriving metrics from logs
parent: Common use cases
nav_order: 15
---

# Deriving metrics from logs

You can use Data Prepper to derive metrics from logs.

The following example pipeline receives incoming logs using the [`http` source plugin]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/http-source) and parses them using the [`grok` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/grok/). It then uses the [`aggregate` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/aggregate/) to aggregate the `bytes` values over a 30-second window and derives histograms from the results.

This pipeline writes data to two different OpenSearch indexes:

- `logs`: This index stores the original, unaggregated log events after they have been processed by the `grok` processor.
- `histogram_metrics`: This index stores the derived histogram metrics extracted from the log events using the `aggregate` processor.

The pipeline contains two sub-pipelines:

- `apache-log-pipeline-with-metrics`: Receives logs from an HTTP client such as Fluent Bit and uses `grok` to extract important values from the logs by matching the value in the `log` key against the [Apache Common Log Format](https://httpd.apache.org/docs/2.4/logs.html#accesslog). It then forwards the grokked logs to two destinations:

  - An OpenSearch index named `logs` to store the original log events.
  - The `log-to-metrics-pipeline` for further aggregation and metric derivation.

- `log-to-metrics-pipeline`: Receives the grokked logs from the `apache-log-pipeline-with-metrics` pipeline, aggregates the logs, and derives histogram metrics of bytes based on the values in the `clientip` and `request` keys. Finally, it sends the derived histogram metrics to an OpenSearch index named `histogram_metrics`.
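
To make the aggregation step concrete, the following Python sketch imitates what the second sub-pipeline does conceptually: it groups events by `clientip` and `request` and counts each `bytes` value into a bucket. It is an illustration only, not Data Prepper's implementation. The function name `derive_histograms`, the output field names, and the bucket boundary handling (each bound treated as an inclusive upper limit) are assumptions, and the sketch treats all events passed to it as belonging to a single 30-second window.

```python
from bisect import bisect_left
from collections import defaultdict

# Bucket bounds taken from the example pipeline configuration.
BUCKETS = [0, 25_000_000, 50_000_000, 75_000_000, 100_000_000]


def derive_histograms(events, identification_keys=("clientip", "request"),
                      key="bytes", buckets=BUCKETS):
    """Group events by the identification keys and build one histogram per group.

    Hypothetical helper: assumes every event falls inside one aggregation window.
    """
    def new_histogram():
        # One more count than bounds: values above the last bound get their own bucket.
        return {"count": 0, "sum": 0, "min": None, "max": None,
                "bucket_counts": [0] * (len(buckets) + 1)}

    groups = defaultdict(new_histogram)
    for event in events:
        group_id = tuple(event[k] for k in identification_keys)
        value = event[key]
        hist = groups[group_id]
        hist["count"] += 1
        hist["sum"] += value
        hist["min"] = value if hist["min"] is None else min(hist["min"], value)
        hist["max"] = value if hist["max"] is None else max(hist["max"], value)
        # Assumption: a value equal to a bound is counted in the bucket ending at that bound.
        hist["bucket_counts"][bisect_left(buckets, value)] += 1
    return dict(groups)
```

Each distinct combination of `clientip` and `request` produces its own histogram, which is why the choice of identification keys determines how many metric documents are written per window.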

#### Example pipeline

```yaml
apache-log-pipeline-with-metrics:
  source:
    http:
      # Provide the path for ingestion. ${pipelineName} will be replaced with the name configured for this pipeline.
      # In this case, it would be "/apache-log-pipeline-with-metrics/logs". This will be the Fluent Bit output URI value.
      path: "/${pipelineName}/logs"
  processor:
    - grok:
        match:
          log: [ "%{COMMONAPACHELOG_DATATYPED}" ]
  sink:
    - opensearch:
        ...
        index: "logs"
    - pipeline:
        name: "log-to-metrics-pipeline"

log-to-metrics-pipeline:
  source:
    pipeline:
      name: "apache-log-pipeline-with-metrics"
  processor:
    - aggregate:
        # Specify the required identification keys
        identification_keys: ["clientip", "request"]
        action:
          histogram:
            # Specify the appropriate values for each of the following fields
            key: "bytes"
            record_minmax: true
            units: "bytes"
            buckets: [0, 25000000, 50000000, 75000000, 100000000]
        # Pick the required aggregation period
        group_duration: "30s"
  sink:
    - opensearch:
        ...
        index: "histogram_metrics"
```
{% include copy-curl.html %}
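
To try the pipeline without Fluent Bit, you could post a sample log line to the `http` source yourself. The following Python sketch only builds the request. The helper name `build_log_request`, the base URL `http://localhost:2021`, and the payload shape (a JSON array of objects with a `log` key) are assumptions chosen to match the `path` and `grok` settings in the example, so adjust them to your deployment.

```python
import json
import urllib.request

# Sample line in Apache Common Log Format, as shown in the Apache HTTP Server documentation.
SAMPLE_LOG = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')


def build_log_request(log_lines,
                      base_url="http://localhost:2021",  # assumed host and port
                      pipeline_name="apache-log-pipeline-with-metrics"):
    """Build (but do not send) a POST request for the pipeline's HTTP source.

    Hypothetical helper: each log line is wrapped in an object under the "log" key,
    which is the key the grok processor in the example matches against.
    """
    payload = json.dumps([{"log": line} for line in log_lines]).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{pipeline_name}/logs",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_log_request([SAMPLE_LOG])
```

With a Data Prepper instance running and reachable at the assumed address, passing `request` to `urllib.request.urlopen` would submit the event.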
