This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Specify the mapping for Prometheus and Statsd exporters #118

Closed
226 changes: 226 additions & 0 deletions text/metrics/0118-statsd-and-prometheus-exporters.md
@@ -0,0 +1,226 @@
# Specify standard treatment of OpenTelemetry aggregations in Prometheus and Statsd exporters

Specify behavior for Prometheus and Statsd exporters using standard
OpenTelemetry aggregations.

## Motivation

OpenTelemetry has specified a set of builtin Aggregators that can be
configured for use with metric instruments. The specification also
defines the default Aggregator that will be applied when an
Aggregation method is not otherwise configured. For some of the
Aggregators, there are multiple potential translations into existing
OSS systems. For example, a [MinMaxLastSumCount
Aggregator](https://github.com/open-telemetry/oteps/pull/117) can be
exposed in Prometheus as a Summary or as a Gauge, with the Gauge
format preferred.

This proposal specifies how to map OpenTelemetry Aggregators into
these OSS exposition formats, in order to support migration from
Prometheus and Statsd APIs without changing metrics protocols.
Specifying the data type correspondence is a necessary prerequisite
for migration from Prometheus and Statsd instruments to OpenTelemetry
instruments. The instrument migration is included to complete this
proposal.

For Counter and Gauge instruments, this proposal promises that the
default mapping from Prometheus and Statsd instrument to OpenTelemetry
instrument, paired with the default Aggregator and then mapped back
into Prometheus or Statsd, produces the corresponding Counter or Gauge
value type of the original Prometheus or Statsd instrument. The same
is not true for Histogram and Summary instruments: the recommended
OpenTelemetry instruments will export Counter or Gauge values by
default. It is expected that a configurable SDK or the Metric Views
API will be used to reconfigure selected OpenTelemetry instruments to
produce Histogram and Summary values in Prometheus and Statsd.

## Background

An Aggregator is an implementation of some logic to compute an
Aggregation, which is an exact or approximate summarization of a
series of metric events. Exporters translate Aggregation values
into an exposition format, so the choice of Aggregator decides which
exposition formats are possible by the time data reaches an Exporter.

The standard [OpenTelemetry Aggregators (TODO: WIP
document)](https://github.com/open-telemetry/opentelemetry-specification/pull/347),
listed below, each support one or more Aggregations.

| OpenTelemetry Aggregator | Aggregations Supported |
| -- | -- |
| Sum | Sum |
| LastValue | LastValue |
| MinMaxLastSumCount | Sum, Count, Min, Max, LastValue |
| Histogram | Sum, Count, Histogram |
| Sketch | Sum, Count, Quantile |
| Exact | Sum, Count, Quantile, Points |

The standard OpenTelemetry Aggregators are required to be mergeable,
meaning that two or more Aggregators can be combined using a `Merge()`
operation to form a single summarization of the combined data. This
allows the metrics processor to generate either a _Cumulative_ value
(over all intervals) or a _Delta_ value (over one interval) of the
Aggregation on behalf of the Exporter.
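
The mergeability requirement can be illustrated with a minimal sketch. The class and field names below are hypothetical and are not taken from any OpenTelemetry SDK; the sketch only assumes that merging two per-interval (Delta) states in time order yields the state covering both intervals (Cumulative).

```python
from dataclasses import dataclass
import math


@dataclass
class MinMaxLastSumCountState:
    """Illustrative aggregator state; all field names are assumptions."""
    minimum: float = math.inf
    maximum: float = -math.inf
    last: float = 0.0
    total: float = 0.0
    count: int = 0

    def update(self, value: float) -> None:
        """Record one measurement into this interval's state."""
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        self.last = value
        self.total += value
        self.count += 1

    def merge(self, later: "MinMaxLastSumCountState") -> "MinMaxLastSumCountState":
        """Combine this state with the state of a later interval."""
        return MinMaxLastSumCountState(
            minimum=min(self.minimum, later.minimum),
            maximum=max(self.maximum, later.maximum),
            # Keep the earlier value if the later interval saw no data.
            last=later.last if later.count else self.last,
            total=self.total + later.total,
            count=self.count + later.count,
        )


def aggregate(values):
    """Build the state for one interval from its measurements."""
    state = MinMaxLastSumCountState()
    for value in values:
        state.update(value)
    return state


first_delta = aggregate([1.0, 5.0])   # measurements from interval 1
second_delta = aggregate([3.0])       # measurements from interval 2
cumulative = first_delta.merge(second_delta)
```

In this sketch, merging the two Delta states gives the same state as aggregating all three measurements in one pass, which is the property a processor would rely on to produce either export kind.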

The OTLP protocol has been designed as the standard exposition format
for OpenTelemetry libraries to forward data to the OpenTelemetry
collector, which is designed to process and re-export metric data.
When OpenTelemetry metric data is exposed through a Prometheus or
Statsd exporter, it is important that the exporter produces the same result
whether data was exported directly or whether OTLP was used to forward
data to a collector.

OpenTelemetry metric instruments are classified in several ways:

- _Synchronous_: Synchronous instruments are called by the user (many times per interval) and may carry tracing context; asynchronous instruments are used through callbacks (once per interval).
- _Adding_ vs. _Grouping_: An adding instrument captures the sum of a number of measurements, while a grouping instrument captures a number of individual measurements. Grouping instruments are more expensive by nature.
- _Monotonic_: Adding instruments can be monotonic, indicating that the sum they express logically cannot decrease.
- _Precomputed-Sum_: Asynchronous adding instruments observe a sum directly, instead of a series of changes in the sum.

These properties help explain how to map OpenTelemetry
Aggregators into Prometheus and Statsd metric data. These are the
OpenTelemetry instruments:

| Name | Synchronous | Adding or Grouping | Monotonic | Precomputed-Sum | Default Aggregator |
| ---- | ----------- | -------- | --------- | ---- | --- |
| Counter | Yes | Adding | Yes | No | Sum |
| UpDownCounter | Yes | Adding | No | No | Sum |
| ValueRecorder | Yes | Grouping | n/a | n/a | MinMaxLastSumCount |
| SumObserver | No | Adding | Yes | Yes | Sum |
| UpDownSumObserver | No | Adding | No | Yes | Sum |
| ValueObserver | No | Grouping | n/a | n/a | MinMaxLastSumCount |

Note that the Precomputed-Sum property places some constraints on the
combination of Aggregator and Exporter. Computing _Delta_
aggregations of Precomputed-Sum instruments requires an aggregation
that supports subtraction (which MUST include Sum and SHOULD include
Histogram).
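
As a rough illustration of why subtraction is needed, the sketch below converts the sums observed by a Precomputed-Sum instrument (one observation per interval) into per-interval deltas. The function name is hypothetical, and the sketch assumes the sum starts at zero before the first observation.

```python
def deltas_from_precomputed_sums(observations):
    """Convert sums observed once per interval into per-interval deltas.

    Assumption: the sum is zero before the first observation.
    """
    deltas = []
    previous = 0
    for current in observations:
        # Delta = current cumulative sum minus the previous one.
        deltas.append(current - previous)
        previous = current
    return deltas
```

A Min or Max value has no comparable inverse operation, which is one way to read the constraint above: only aggregations with a well-defined subtraction can produce Delta values from precomputed sums.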

### Prometheus

This is based on the Prometheus [Metric
Types](https://prometheus.io/docs/concepts/metric_types).

#### Prometheus instruments

Prometheus Counter instruments are semantically identical to
OpenTelemetry Counter instruments, including the Monotonic property.
They are exposed as a Cumulative Sum aggregation.

Prometheus Gauge instruments have a number of uses, depending on
whether they are used as an adding or as a grouping instrument. They
are exposed as a single data point equal to the last value that
was set. Because Prometheus clients are stateful, Gauges support both
`Set()` and `Add()` methods. Generally, Prometheus Gauges used to
`Add()` map into OpenTelemetry UpDownCounter instruments (maybe
SumObserver, UpDownSumObserver), while Prometheus Gauges used to
`Set()` map into OpenTelemetry ValueRecorder instruments (maybe
ValueObserver).

Prometheus Histogram instruments are exposed as Cumulative
aggregations, defined by a number of bucketed counts, accumulated from
the start of the process. Prometheus Histogram instruments map into
ValueRecorder instruments (maybe ValueObserver). To configure a
Histogram exposition in the Prometheus exporter, configure a Histogram
Aggregator with the desired buckets.
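
The "bucketed counts" of a Prometheus Histogram are cumulative across buckets: each bucket counts every observation less than or equal to its upper bound, and a final `+Inf` bucket counts all observations. A minimal sketch of that layout follows; the function name and return shape are illustrative assumptions, not an exporter API.

```python
import bisect
import math


def cumulative_buckets(values, upper_bounds):
    """Return (upper_bound, cumulative_count) pairs, ending with +Inf."""
    bounds = sorted(upper_bounds)
    per_bucket = [0] * (len(bounds) + 1)  # last slot is the +Inf bucket
    for value in values:
        # First bound that is >= value, i.e. the "less than or equal" bucket.
        per_bucket[bisect.bisect_left(bounds, value)] += 1

    result = []
    running = 0
    for bound, count in zip(bounds + [math.inf], per_bucket):
        running += count
        result.append((bound, running))
    return result
```

For example, with bounds `[0.1, 0.5, 1.0]`, an observation of `3.0` is counted only in the `+Inf` bucket, while an observation of `0.05` is counted in every bucket.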

Prometheus Summary instruments are discouraged because they are not
mergeable, a property required of the standard OpenTelemetry
Aggregators. Uses of Prometheus Summary
instruments map into OpenTelemetry ValueRecorder instruments (maybe
ValueObserver). To configure a Summary exposition in the Prometheus
exporter, configure a Sketch Aggregator with the desired quantiles.
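
The instrument migration described in this section can be summarized as a small lookup. This is only a sketch of the prose above: the `Migration` type, the key names, and the function are hypothetical, and the "alternatives" field records the instruments the text lists as "maybe".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Migration:
    instrument: str                 # suggested OpenTelemetry instrument
    alternatives: Tuple[str, ...]   # instruments listed as "maybe" in the text
    aggregator: Optional[str]       # Aggregator to configure, if not the default


# Keys are (Prometheus instrument, Gauge usage); usage is None for non-Gauges.
MIGRATIONS = {
    ("counter", None): Migration("Counter", (), None),
    ("gauge", "add"): Migration(
        "UpDownCounter", ("SumObserver", "UpDownSumObserver"), None),
    ("gauge", "set"): Migration("ValueRecorder", ("ValueObserver",), None),
    ("histogram", None): Migration(
        "ValueRecorder", ("ValueObserver",), "Histogram"),
    ("summary", None): Migration(
        "ValueRecorder", ("ValueObserver",), "Sketch"),
}


def migrate(prometheus_instrument, gauge_usage=None):
    """Look up the suggested OpenTelemetry instrument for a Prometheus one."""
    key = (prometheus_instrument.lower(), gauge_usage)
    if key not in MIGRATIONS:
        raise ValueError(f"no migration listed for {key!r}")
    return MIGRATIONS[key]
```

A Gauge has to be classified by how it is used (`Add()` or `Set()`) before it can be looked up, which is why the Gauge entries carry a second key.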

#### OpenTelemetry Aggregator to Prometheus exposition

The following table lists the mapping from Aggregator to Prometheus
exposition format. The "Typical Instruments" listed are the
applicable OpenTelemetry instruments, for which the Prometheus mapping
is sensible.

| OpenTelemetry aggregator | Default Prometheus data type mapping | Export kind | Typical instruments | Notes |
| -- | -- | -- | -- | -- |
| Sum (Monotonic) | Counter | Cumulative | Counter(\*), SumObserver(\*) | |
| Sum (Non-Monotonic) | Gauge | Cumulative | UpDownCounter(\*), UpDownSumObserver(\*) | |
| LastValue | Gauge | Cumulative | ValueRecorder, ValueObserver, SumObserver, UpDownSumObserver | |
| MinMaxLastSumCount | Gauge | Delta | ValueRecorder(\*), ValueObserver(\*) | Expose the LastValue field as the Gauge |
| Histogram | Histogram | Cumulative | ValueRecorder, ValueObserver | |
| Sketch | Summary | Delta | ValueRecorder, ValueObserver | |
| Exact | Summary | Delta | ValueRecorder, ValueObserver | |

Above, (\*) denotes the default behavior of an OpenTelemetry
instrument.

Contributor

I see potentially two different people reading this table: an OpenTelemetry developer writing an exporter for Prometheus, and an existing Prometheus instrumentation author looking to migrate to OpenTelemetry instrumentation.

This table seems to serve the former well, but I'm not sure it is clear for the latter as to what they need to do. I'm guessing they would like a more explicit mapping between Prometheus type and OTel Instrument. I'm guessing the one-to-many mapping might be a bit overwhelming.

Is this something we can assume Prometheus instrumentation writers can work out (i.e. become familiar with synchronicity, precomputed sums, and additive/grouping)?

Is this something for a different PR?

Could this be added in a subsequent table?

Member

@james-bebbington, Jul 7, 2020

The notes say that LastValue will be exposed as a Gauge. Is the idea that we will throw away the other non-LastValue (Min/Max/Sum/Count) aggregations by default?

I would have thought a more natural default would be for each of these aggregations to be represented as a separate Gauge, otherwise we are aggregating that data just to throw it away - i.e.

metricname_min
metricname_max
metricname_last
metricname_sum
metricname_count

That would obviously have some implications for the mapping from Prometheus -> OT -> Prometheus that I'm not too sure how to address.

Contributor Author

What you propose is a viable export strategy, I just don't believe it's the one most Prometheus users want or expect. This is what the Prometheus "Summary" metric exports, approximately. If we replaced "metricname_last" with just "metricname" that would create less confusion for Prometheus users, but they might not appreciate the cost of indexing new metric names that they never wanted.

It would be more optimal, but much more complex, to try to match the default aggregation to the configured exporter. If an OTLP exporter is configured, then the question becomes recursive. What aggregation should I use to satisfy a downstream exporter? That would create a headache at startup.

The idea of adding an optimization (#117) to store a single value when there is in fact a single value, was meant to address some of the concern about throwing away information. I believe some platforms want to see min/max/sum/count and would not like to force users of those systems to reconfigure the aggregation.

Member

Makes sense. Thanks for the detailed explanation
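
The Prometheus mapping table in this section could be encoded along the following lines. This is a sketch, not exporter code: the function names are hypothetical, and the rendering helper only assumes the standard Prometheus text exposition layout of a `# TYPE` line followed by a sample line.

```python
# (Prometheus data type, export kind) per Aggregator, following the table.
_PROMETHEUS_MAPPING = {
    "LastValue": ("gauge", "cumulative"),
    "MinMaxLastSumCount": ("gauge", "delta"),
    "Histogram": ("histogram", "cumulative"),
    "Sketch": ("summary", "delta"),
    "Exact": ("summary", "delta"),
}


def prometheus_mapping(aggregator, monotonic=False):
    """Return (data type, export kind) for an OpenTelemetry Aggregator."""
    if aggregator == "Sum":
        # Only a monotonic Sum can be a Prometheus Counter.
        return ("counter" if monotonic else "gauge", "cumulative")
    return _PROMETHEUS_MAPPING[aggregator]


def render_gauge(name, last_value):
    """Render the LastValue field of an aggregation as a Prometheus Gauge."""
    return f"# TYPE {name} gauge\n{name} {last_value}\n"
```

Under the default described above, a ValueRecorder using MinMaxLastSumCount would be rendered with `render_gauge`, exposing only the last value and none of the min, max, sum, or count fields.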

### Statsd

Statsd refers to a wire format for metrics. The mapping specified here
refers to either the original Etsy protocol (a.k.a. plain Statsd) or
the DataDog variation with labels added (a.k.a. DogStatsd).

#### Statsd instruments

Statsd instruments export individual data points using messages
described by a 1- or 2-character string:

- "c" for Counter
- "g" for Gauge
- "h" for Histogram (exposed as a Summary)
- "ms" for Timing (exposed as a Summary)
- "d" for Distribution (exposed as a Sketch)
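
To make the message layout concrete, the sketch below parses a single Statsd line of the form `name:value|type`, with the optional `|@rate` sample rate and the DogStatsd `|#key:value,...` tag section. The function name and the returned dictionary are illustrative assumptions; error handling is omitted.

```python
# Type codes from the list above.
STATSD_TYPES = {
    "c": "Counter",
    "g": "Gauge",
    "h": "Histogram",
    "ms": "Timing",
    "d": "Distribution",
}


def parse_statsd_line(line):
    """Parse one Statsd or DogStatsd message into a dictionary."""
    name, _, rest = line.partition(":")
    fields = rest.split("|")
    parsed = {
        "name": name,
        "value": float(fields[0]),
        "type": STATSD_TYPES[fields[1]],
        "sample_rate": 1.0,
        "tags": {},
    }
    for extra in fields[2:]:
        if extra.startswith("@"):
            parsed["sample_rate"] = float(extra[1:])
        elif extra.startswith("#"):
            # DogStatsd labels: comma-separated key:value pairs.
            for tag in extra[1:].split(","):
                key, _, value = tag.partition(":")
                parsed["tags"][key] = value
    return parsed
```

Plain Statsd messages simply lack the `#` section, so the same parser covers both variants named above.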

#### OpenTelemetry Aggregator to Statsd exposition

The Statsd exposition format always uses Delta aggregation.

All Statsd instruments except the Statsd Counter are Grouping
instruments, and they are exposed as Gauges by default (as opposed to,
e.g., Histograms).

The Histogram, Sketch, and Exact Aggregators, when configured, are
exposed in Summary form, using the instrument name with `_sum`,
`_count`, and various quantile suffixes (e.g., `_p95`). The
MinMaxLastSumCount Aggregator also supports being exposed as a
Summary (e.g., `_min`, `_max` suffixes).

The following table lists the mapping from Aggregator to Statsd
exposition format. The "Typical Instruments" listed are the
applicable OpenTelemetry instruments, for which the Statsd mapping is
sensible.

| OpenTelemetry aggregator | Default Statsd data type mapping | Typical instruments | Notes |
| -- | -- | -- | -- |
| Sum (Monotonic) | Counter | Counter(\*), SumObserver(\*) | |
| Sum (Non-Monotonic) | Gauge | UpDownCounter(\*), UpDownSumObserver(\*) | |
| LastValue | Gauge | ValueRecorder, ValueObserver, SumObserver, UpDownSumObserver | |
| MinMaxLastSumCount | Gauge | ValueRecorder(\*), ValueObserver(\*) | Expose the LastValue field as the Gauge |
| Histogram | Summary | ValueRecorder, ValueObserver | |
| Sketch | Summary | ValueRecorder, ValueObserver | |
| Exact | Summary | ValueRecorder, ValueObserver | |

Above, (\*) denotes the default behavior of an OpenTelemetry
instrument.
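
The Statsd mapping and the Summary naming scheme described in this section can be sketched as follows. The function names are hypothetical, percentiles are passed as whole numbers purely for illustration, and the sketch covers only the metric names, not the wire type used for each suffixed series, which this proposal does not specify.

```python
def statsd_type(aggregator, monotonic=False):
    """Return the default Statsd data type for an OpenTelemetry Aggregator."""
    if aggregator == "Sum":
        return "Counter" if monotonic else "Gauge"
    if aggregator in ("LastValue", "MinMaxLastSumCount"):
        return "Gauge"
    if aggregator in ("Histogram", "Sketch", "Exact"):
        return "Summary"
    raise ValueError(f"unknown aggregator: {aggregator!r}")


def statsd_summary_names(name, percentiles=(), include_min_max=False):
    """List the metric names used when exposing an aggregation as a Summary."""
    names = [f"{name}_sum", f"{name}_count"]
    if include_min_max:
        # Suffixes available when MinMaxLastSumCount is exposed as a Summary.
        names += [f"{name}_min", f"{name}_max"]
    names += [f"{name}_p{p}" for p in percentiles]
    return names
```

For instance, a latency instrument configured with a Sketch Aggregator and the 50th and 95th percentiles would, under this sketch, produce four names rather than the single Gauge name of the default mapping.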

## Trade-offs and mitigations

When an OTLP Exporter is configured in the client, we can expect any
configured Aggregator to produce an Aggregation that maps into the
OTLP protocol, such that an OpenTelemetry collector is able to apply
the same logic as a local exporter would. OpenTelemetry provides a
number of Aggregators to facilitate this kind of configuration choice.

When the default Aggregator is used with any of the OpenTelemetry
instruments (i.e., lacking other configuration), the result in
Prometheus or Statsd will be exposed as a Counter or as a Gauge.
There is no instrument choice with a default mapping to Histogram in
either system. This is specified in order to reduce the default cost
of OpenTelemetry instruments.

Note, in particular, that the default Aggregator for ValueRecorder and
ValueObserver is MinMaxLastSumCount, specified even though the default
exposition format is Gauge for both Prometheus and Statsd systems.
This is done so that metric exporters other than Prometheus and Statsd
are able to summarize the distribution by default (i.e., expose min,
max, sum, and count), after forwarding through OTLP. See the
[MinMaxLastSumCount
OTEP](https://github.com/open-telemetry/oteps/pull/117). A drawback
of this approach is slightly more computation and more data transferred
over OTLP.