Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantic conventions for Summary metrics #2511

Closed
wants to merge 3 commits into from

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Apr 25, 2022

Part of #2485

Changes

#2485 deals in specifying conventional names for metrics that have been traditionally pre-calculated about a distribution of measurements. When these measurements originate in Prometheus/OpenMetrics, we have a purpose-defined data point that can be used. When these measurements originate in a system with delta temporality or outside of Prometheus, they may exist in an "expanded form" as individual counter and gauge series.

This PR establishes semantic-conventional defaults for such pre-calculated summary metrics. Using these conventions, the semantic conventions for Kafka measurements can be written as "Histogram" while existing instrumentation may still output pre-calculated metrics in one of several ways. They can be expressed using the OTLP Summary point (if cumulative temporality), the OTLP Histogram data point (if delta temporality and no percentiles), or the expanded form (multiple metrics).

@jmacd
Copy link
Contributor Author

jmacd commented Apr 26, 2022

@jack-berg This didn't make it into the meeting today, please take a look.

@jmacd jmacd marked this pull request as ready for review April 29, 2022 14:28
@jmacd jmacd requested review from a team April 29, 2022 14:28
When reporting Metrics that have been previously calculated, in the
case of metrics conventionally specified using `Histogram`
instruments, consumers SHOULD support receiving OpenTelemetry
`Summary` data points that were written by the external system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that the the consumer is able to know whether the metric was written by an external system.

An alternative would be to update the paragraph on 179-185 to indicate that consumers of Histogram instruments should accept Summaries, in addition to Explicit-boundary Histogram and Exponential Histogram.

use an existing OpenTelemetry or OpenMetrics data point definition MAY
use an expanded form containing multiple metrics to convey
precalculated information about a distribution of measurements named
`H`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know of any system that would take advantage of this expanded form?

Copy link
Contributor

@dashpole dashpole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy that vendors can provide compatibility between Histograms and Summaries to assist in migrations from legacy instrumentation, but I don't think we need to make statements about histogram/summary compatibility within OTel, as the data types are not currently translatable without data loss. I would prefer:

  1. Expanding the summary type to accommodate additional sources of quantile-based aggregations (e.g. with avg, min, max, etc.)
  2. Defining conventions (or a set of conventions) for converting from summary metrics to other types (a series of gauges and counters) for exporters which don't natively support summaries.
  3. Leave conversion from individual series to summaries up to individual receivers.

precalculated information about a distribution of measurements named
`H`.

- **H.sum** - A Counter (if monotonic) or an UpDownCounter (if non-monotonic) conveying the Sum of measurements in the distribution with the original units
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider using OpenMetrics-style naming conventions instead of these? I had opened open-telemetry/opentelemetry-collector-contrib#7134 a while ago with a similar goal of unifying how summaries are split prior to exporting, but proposed using OpenMetrics-style naming conventions (_sum, _count, + quantile label).

Since most Summary metrics are originally OpenMetrics metrics, appending .sum to those metrics would generally look strange:

my_exciting_and_fun_metric.sum vs my_exciting_and_fun_metric_sum

- **H.sum** - A Counter (if monotonic) or an UpDownCounter (if non-monotonic) conveying the Sum of measurements in the distribution with the original units
- **H.count** - A Counter conveying the Count of measurements in the distribution (units is implied, e.g., "{measurements}")
- **H.avg** - A Gauge conveying the average-value measurement in the distribution (original units)
- **H.min** - A Gauge conveying the minimum-value measurement in the distribution (original units)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summaries don't have a min/max/avg according to the metric datamodel. How would these conventions ever come into play? Do we need to expand the definition of Summary to accomodate these new fields if those are needed by other metric sources that produce quantiles?


The Summary data point is defined exactly as in the OpenMetrics
specification. Semantic-conventional metrics specified for
`Histogram` instruments MAY be expressed using Summary data points
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be SHOULD instead of MAY? A Histogram with just sum/count seems strictly less useful than a Summary, even if they share some properties.

instead, as they have equivalent semantic interpretation, when they
originate from OpenMetrics instrumentation.

Receivers and exporters that handle precalculated metrics that cannot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to treat receivers and exporters differently here. Receivers (e.g. Prometheus receiver) isn't going to look for metrics with the naming patterns below, since it has an established naming convention for summaries already. I expect other receivers that can receive summary-like data will have different conventions that the receiver will need to adapt to.

For exporters, the idea of converting summaries into individual timeseries in a consistent way makes more sense to me as a thing we could specify.

@reyang reyang added area:semantic-conventions Related to semantic conventions spec:metrics Related to the specification/metrics directory labels May 3, 2022
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 11, 2022
@jmacd
Copy link
Contributor Author

jmacd commented May 17, 2022

This needs more consideration.

@jmacd jmacd closed this May 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:metrics Related to the specification/metrics directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants