
Promtail generates very high cardinality prometheus metrics #5553

Closed
m1keil opened this issue Mar 7, 2022 · 8 comments · Fixed by #5603

@m1keil (Contributor) commented Mar 7, 2022

Describe the bug
We use a consulagent_sd_configs scrape config to automatically detect and tail logs on our instances, which run Nomad clients that schedule all the workloads.
While inspecting the metrics exported by Promtail via /metrics, I can see large sets of data points labelled with the file path being tracked by Promtail:

$ curl --silent 172.30.61.148:26017/metrics | grep "^promtail_" | sort | grep -E 'path|filename' | cut -d{ -f1 | uniq -c
     22 promtail_file_bytes_total
    153 promtail_log_entries_bytes_bucket
     17 promtail_log_entries_bytes_count
     17 promtail_log_entries_bytes_sum
     22 promtail_read_bytes_total
     17 promtail_read_lines_total
      2 promtail_stream_lag_seconds

I have only 5 services running on this node, but this generates up to 153 different label sets (due to the metric being a summary/histogram).
Each service can have:

  • stdout & stderr log files
  • multiple log files (log rotation)
  • a different allocation ID (a unique ID Nomad uses for scheduling and execution)

Here's a verbose example:

promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.10"} 1.0478718e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.11"} 1.0482271e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.12"} 2.391382e+06
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.3"} 1.0482602e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.4"} 1.0476908e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.5"} 1.048399e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.6"} 1.0485677e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.7"} 1.04807e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.8"} 1.0482844e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stderr.9"} 1.0482479e+07
promtail_read_bytes_total{path="/opt/nomad/alloc/3d2e6c91-b769-38be-8910-094fb6510e36/alloc/logs/app.stdout.0"} 0

To make matters worse, the total number of possible label values is multiplied by the N nodes that run Nomad.
This can generate so many label values that it can bring a decent Prometheus cluster to its knees.

As far as I can tell, there is no way to disable this detailed monitoring, and the only option I have at the moment is to drop these metrics at scrape time. It would be useful to have such detailed instrumentation enabled on demand rather than by default (e.g. via a --server.detailed-instrumentation flag).
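
For reference, the scrape-time workaround would look roughly like this on the Prometheus side (the job name is made up, the target is the one from the example above, and the regex would need adjusting to whichever series you want to drop):

scrape_configs:
  - job_name: promtail  # hypothetical job name
    static_configs:
      - targets: ['172.30.61.148:26017']
    metric_relabel_configs:
      # Drop the per-file series after scraping, before ingestion into the TSDB.
      - source_labels: [__name__]
        regex: 'promtail_(file_bytes_total|read_bytes_total|read_lines_total|log_entries_bytes.*|stream_lag_seconds)'
        action: drop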

@dannykopping (Contributor)

Thanks for reporting this @m1keil.

It's a fair point, but I don't think we will make the change you're suggesting and export metrics of different verbosity based on a flag. It adds complexity and raises an unanswerable question: "what is a detailed metric?". We can't answer this for everyone in every scenario, so we'd prefer to export everything and have users filter out what they don't need.

On that note:
I would suggest adding a match stage to drop the metrics for rotated files.
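
Something along these lines, as a rough sketch (it assumes Promtail's default filename label and guesses at a numeric rotation suffix):

pipeline_stages:
  # Drop log entries coming from rotated files (numeric suffix in the filename).
  - match:
      selector: '{filename=~".*\\.[0-9]+"}'
      action: drop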

How does that sound?

@m1keil (Contributor, Author) commented Mar 7, 2022

We use the match stage, but I don't see how it can help here, given that:

  • Log rotation in Nomad works by creating a new file and incrementing the index by 1, i.e. the latest log file is the one with the highest index, not necessarily 0.
  • It doesn't address the high cardinality caused by the allocation ID (the UUID in the path).

If there's something I'm missing here let me know.

We can't answer this for everyone in every scenario, so we'd prefer to export everything and have users filter out what they don't need.

I guess that's fair enough, but I do think Promtail needs to consider sane defaults first. High cardinality Prometheus metrics are a known issue that should be handled with care; some people might enable this without realising it's a problem.
There is no harm in exposing less by default and only exposing everything on explicit demand, but the other way around isn't true.

@dannykopping (Contributor) commented Mar 7, 2022

OK, then I think you need to drop the metrics at scrape time; I can't think of an alternative solution for you.

There is no harm in exposing less by default and only exposing everything on explicit demand, but the other way around isn't true.

The problem is choosing what to cut out. You never know what metric would be critically useful to an operator.

In this case, Promtail could not expose any fewer metrics without understanding the specifics of your system.
I think it is the responsibility of the operator to ensure the health of their own Prometheus instance.

@m1keil (Contributor, Author) commented Mar 7, 2022

The problem is choosing what to cut out.

I think in this case the answer can be simple: detailed = potentially unbounded cardinality.

This is a common idiom in the Prometheus world; for example, cAdvisor and node_exporter don't enable all of their metadata or metrics by default.

I think it is the responsibility of the operator to ensure the health of their own Prometheus instance.

This is true regardless of how Promtail behaves by default, but sane defaults can prevent mistakes that operators (will) make, in the same way we don't ship software to prod with DEBUG logging.

@dannykopping (Contributor)

We're missing each other on a fundamental point:
If Promtail didn't expose metrics with potentially unbounded cardinality, we wouldn't be able to expose any metrics with file paths.

@m1keil (Contributor, Author) commented Mar 7, 2022

Well, yes, but I'm not saying don't allow exposing them at all; all I'm saying is "maybe you shouldn't do that by default".

Looking at the existing per-file metrics:

  • promtail_file_bytes_total: the docs say it's the number of bytes read, but I think it's actually the file size(?)
  • promtail_read_bytes_total: bytes read (per file)
  • promtail_read_lines_total: lines read (per file)
  • promtail_stream_lag_seconds: undocumented
  • promtail_log_entries_bytes: I guess this is a distribution of log line sizes per file(?)

Now, are these metrics helpful? Probably. Are they all important enough to be on by default? Maybe a subset. It would be nice if
promtail_read_bytes_total, promtail_read_lines_total or promtail_stream_lag_seconds had a "total" (non per-file) version, but I do understand this adds complexity to the code.
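
As a stopgap, an aggregated view can be approximated on the Prometheus side with recording rules along these lines (the rule names are just illustrative):

groups:
  - name: promtail_totals
    rules:
      # Aggregate away the per-file labels so dashboards/alerts don't touch the raw series.
      - record: instance:promtail_read_bytes:sum
        expr: sum without (path, filename) (promtail_read_bytes_total)
      - record: instance:promtail_read_lines:sum
        expr: sum without (path, filename) (promtail_read_lines_total)

This only helps on the query side, though; Prometheus still has to ingest and store all the per-path series.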

Maybe as a compromise, we can at least add a note about the potential high cardinality in the docs?

@dannykopping (Contributor)

Seems like a reasonable compromise to me.

Can you submit the PR? I want to give you credit for this suggestion.

@dannykopping (Contributor)

Closing this one, we can reference it in the PR 👍

KMiller-Grafana added a commit that referenced this issue May 20, 2022
* Hint about potential high cardinality metrics

Add a warning about potentially high-cardinality metrics caused by the file path being included in some of the metrics.
Depending on the setup, such a config can result in a very large number of label values, which is not recommended by Prometheus.
This can lead to increased storage, slower queries, extra costs and so on.

Address what has been discussed in #5553.

* Reword observability.md

Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com>
