Semantic conventions for Summary metrics #2511

jmacd · 2022-04-25T21:15:46Z

Part of #2485

Changes

#2485 deals in specifying conventional names for metrics that have been traditionally pre-calculated about a distribution of measurements. When these measurements originate in Prometheus/OpenMetrics, we have a purpose-defined data point that can be used. When these measurements originate in a system with delta temporality or outside of Prometheus, they may exist in an "expanded form" as individual counter and gauge series.

This PR establishes semantic-conventional defaults for such pre-calculated summary metrics. Using these conventions, the semantic conventions for Kafka measurements can be written as "Histogram" while existing instrumentation may still output pre-calculated metrics in one of several ways. They can be expressed using the OTLP Summary point (if cumulative temporality), the OTLP Histogram data point (if delta temporality and no percentiles), or the expanded form (multiple metrics).

jmacd · 2022-04-26T16:51:21Z

@jack-berg This didn't make it into the meeting today, please take a look.

jack-berg · 2022-05-03T15:19:10Z

specification/metrics/semantic_conventions/README.md

+When reporting Metrics that have been previously calculated, in the
+case of metrics conventionally specified using `Histogram`
+instruments, consumers SHOULD support receiving OpenTelemetry
+`Summary` data points that were written by the external system.


This implies that the the consumer is able to know whether the metric was written by an external system.

An alternative would be to update the paragraph on 179-185 to indicate that consumers of Histogram instruments should accept Summaries, in addition to Explicit-boundary Histogram and Exponential Histogram.

jack-berg · 2022-05-03T15:29:52Z

specification/metrics/semantic_conventions/README.md

+use an existing OpenTelemetry or OpenMetrics data point definition MAY
+use an expanded form containing multiple metrics to convey
+precalculated information about a distribution of measurements named
+`H`.


Do you know of any system that would take advantage of this expanded form?

dashpole

I buy that vendors can provide compatibility between Histograms and Summaries to assist in migrations from legacy instrumentation, but I don't think we need to make statements about histogram/summary compatibility within OTel, as the data types are not currently translatable without data loss. I would prefer:

Expanding the summary type to accommodate additional sources of quantile-based aggregations (e.g. with avg, min, max, etc.)
Defining conventions (or a set of conventions) for converting from summary metrics to other types (a series of gauges and counters) for exporters which don't natively support summaries.
Leave conversion from individual series to summaries up to individual receivers.

dashpole · 2022-05-03T15:20:34Z

specification/metrics/semantic_conventions/README.md

+precalculated information about a distribution of measurements named
+`H`.
+
+- **H.sum** - A Counter (if monotonic) or an UpDownCounter (if non-monotonic) conveying the Sum of measurements in the distribution with the original units


Did you consider using OpenMetrics-style naming conventions instead of these? I had opened open-telemetry/opentelemetry-collector-contrib#7134 a while ago with a similar goal of unifying how summaries are split prior to exporting, but proposed using OpenMetrics-style naming conventions (_sum, _count, + quantile label).

Since most Summary metrics are originally OpenMetrics metrics, appending .sum to those metrics would generally look strange:

my_exciting_and_fun_metric.sum vs my_exciting_and_fun_metric_sum

dashpole · 2022-05-03T15:41:02Z

specification/metrics/semantic_conventions/README.md

+- **H.sum** - A Counter (if monotonic) or an UpDownCounter (if non-monotonic) conveying the Sum of measurements in the distribution with the original units
+- **H.count** - A Counter conveying the Count of measurements in the distribution (units is implied, e.g., "{measurements}")
+- **H.avg** - A Gauge conveying the average-value measurement in the distribution (original units)
+- **H.min** - A Gauge conveying the minimum-value measurement in the distribution (original units)


Summaries don't have a min/max/avg according to the metric datamodel. How would these conventions ever come into play? Do we need to expand the definition of Summary to accomodate these new fields if those are needed by other metric sources that produce quantiles?

dashpole · 2022-05-03T15:41:43Z

specification/metrics/semantic_conventions/README.md

+
+The Summary data point is defined exactly as in the OpenMetrics
+specification.  Semantic-conventional metrics specified for
+`Histogram` instruments MAY be expressed using Summary data points


Should this be SHOULD instead of MAY? A Histogram with just sum/count seems strictly less useful than a Summary, even if they share some properties.

dashpole · 2022-05-03T16:04:28Z

specification/metrics/semantic_conventions/README.md

+instead, as they have equivalent semantic interpretation, when they
+originate from OpenMetrics instrumentation.
+
+Receivers and exporters that handle precalculated metrics that cannot


I think we need to treat receivers and exporters differently here. Receivers (e.g. Prometheus receiver) isn't going to look for metrics with the naming patterns below, since it has an established naming convention for summaries already. I expect other receivers that can receive summary-like data will have different conventions that the receiver will need to adapt to.

For exporters, the idea of converting summaries into individual timeseries in a consistent way makes more sense to me as a thing we could specify.

github-actions · 2022-05-11T03:36:27Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

jmacd · 2022-05-17T21:59:46Z

This needs more consideration.

Joshua MacDonald added 3 commits April 25, 2022 14:07

Draft semantic conventions for Summary metrics

758ace3

move p back

9282bed

typo

ebcced0

jmacd mentioned this pull request Apr 25, 2022

Initial addition of Kafka metrics. #2485

Merged

jmacd marked this pull request as ready for review April 29, 2022 14:28

jmacd requested review from a team April 29, 2022 14:28

github-actions bot assigned yurishkuro Apr 29, 2022

jack-berg reviewed May 3, 2022

View reviewed changes

dashpole reviewed May 3, 2022

View reviewed changes

reyang added area:semantic-conventions Related to semantic conventions spec:metrics Related to the specification/metrics directory labels May 3, 2022

github-actions bot added the Stale label May 11, 2022

jmacd closed this May 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Semantic conventions for Summary metrics #2511

Semantic conventions for Summary metrics #2511

jmacd commented Apr 25, 2022

jmacd commented Apr 26, 2022

jack-berg May 3, 2022

jack-berg May 3, 2022

dashpole left a comment

dashpole May 3, 2022

dashpole May 3, 2022

dashpole May 3, 2022

dashpole May 3, 2022

github-actions bot commented May 11, 2022

jmacd commented May 17, 2022

Semantic conventions for Summary metrics #2511

Semantic conventions for Summary metrics #2511

Conversation

jmacd commented Apr 25, 2022

Changes

jmacd commented Apr 26, 2022

jack-berg May 3, 2022

Choose a reason for hiding this comment

jack-berg May 3, 2022

Choose a reason for hiding this comment

dashpole left a comment

Choose a reason for hiding this comment

dashpole May 3, 2022

Choose a reason for hiding this comment

dashpole May 3, 2022

Choose a reason for hiding this comment

dashpole May 3, 2022

Choose a reason for hiding this comment

dashpole May 3, 2022

Choose a reason for hiding this comment

github-actions bot commented May 11, 2022

jmacd commented May 17, 2022