Expand event-model description to more clearly delineate instrument v… #1614

jsuereth · 2021-04-13T14:21:11Z

…s. metric stream and some reasoning behind it.

Fixes #1366

Changes

Improve the documentation around the "Event Model" and the relationship of "Instruments" to "metrics"

punya · 2021-04-13T14:35:10Z

specification/metrics/datamodel.md

+- Adding vs. Grouping aggregation.
+  - Adding instruments express a sum.  All points recorded via this instrument
+    are parts of a whole.
+  - Grouping instruments characterize a group of measurements.  All points


Where does this leave things like quantiles?

Quantile is a form of grouping.

I still believe this is a useful concept, but I am worried we've lost the connection to the API design by now. There's something about how Gauge and Histogram instruments have the same semantics with different default aggregations: both group individual measurements.

The point is that Gauge and Histogram inputs are semantically different than Sum inputs, because of individuality. This is meant to help the user choose between Counter and Histogram, for example. @reyang I mean to follow up on last week's API/SDK SIG meeting, in which we discussed some of this.
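To make the adding vs. grouping distinction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names `adding_aggregate` and `grouping_aggregate` and the sample values are hypothetical and are not part of the OpenTelemetry API or data model.

```python
from statistics import median


def adding_aggregate(points):
    """Adding: every point is a part of a whole, so only the total matters."""
    return sum(points)


def grouping_aggregate(points):
    """Grouping: every point is an individual measurement.

    The group is characterized by summary statistics (a Gauge-like "last"
    value, or distribution statistics such as min, max and median).
    """
    return {
        "count": len(points),
        "min": min(points),
        "max": max(points),
        "last": points[-1],
        "median": median(points),
    }


# Hypothetical measurements recorded through one instrument.
points = [120, 80, 300, 100]

total = adding_aggregate(points)      # 600
summary = grouping_aggregate(points)  # count=4, min=80, max=300, last=100, median=110.0
```

Under this reading, a quantile such as the median is just one more way of characterizing the group, which is consistent with the earlier reply that quantiles are a form of grouping.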

The goal of this is NOT to define what instruments the API uses, but to show the flexibility of the model to adapt to differing instruments.

I'm thinking about doing the following:

Focus on adapting instruments (OTel + non-otel) into the data model

Focus on general concepts that may be used (quantiles, grouping, etc.)

Direct to the API specification for specific instruments / instrument names.

This specification should be about WHAT metrics mean, not what instruments are available in the API.

SergeyKanzhelev · 2021-04-17T00:21:50Z

specification/metrics/datamodel.md

+other system.  OpenTelemetry metrics are designed such that the same instrument
+and events can be used in different ways to generate metric streams.
+
+![Events → Streams](img/model-event-layer.png)


Does a total_latency aggregation ever make sense? It may not be the best idea to include an example like this.

I do want an example that shows how a single instrument could turn into all of Sum, Histogram + Gauge.

While I can synthesize hare-brained scenarios where I think "total latency" might make sense (e.g. ridiculous statistics like "how many days have users waited for our website in aggregate"), you're right that it's not a super useful example. I'm going to take a day to brainstorm and look for other ideas for this example.

If the example was a count (e.g., request_size), then max, sum, and histogram outputs all make sense.

However, one drawback with this example (sum alongside a histogram) is that a histogram data point already contains a sum, so exposing a separate metric with the sum is not very useful. The max function makes a better example: you could export the maximum over [1m], [10m], [1hr], and so on. (Related: open-telemetry/opentelemetry-proto#279)
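As a rough sketch of the point above, the same request_size events can be turned into a sum, a max, and a histogram. The event values, the `route` attribute, the bucket bounds, and the helper names are all assumptions made for illustration; none of this is an OpenTelemetry SDK API.

```python
from bisect import bisect_left

# Hypothetical events from a single request_size instrument:
# (size in bytes, attributes).
EVENTS = [
    (512, {"route": "/a"}),
    (2048, {"route": "/b"}),
    (128, {"route": "/a"}),
    (4096, {"route": "/b"}),
]

# Assumed explicit bucket upper bounds (inclusive), plus an overflow bucket.
BOUNDS = [256, 1024, 4096]


def to_sum(events):
    """Sum stream: total bytes across all events."""
    return sum(value for value, _ in events)


def to_max(events):
    """Gauge-like stream: largest request seen in the interval."""
    return max(value for value, _ in events)


def to_histogram(events, bounds):
    """Histogram stream: bucket counts plus count and sum."""
    counts = [0] * (len(bounds) + 1)
    total = 0
    for value, _ in events:
        counts[bisect_left(bounds, value)] += 1
        total += value
    return {"count": len(events), "sum": total, "bucket_counts": counts}


histogram = to_histogram(EVENTS, BOUNDS)
# The histogram already carries the sum, so a separate Sum stream adds nothing new.
assert histogram["sum"] == to_sum(EVENTS)
```

The final assertion is the drawback in miniature: the separate sum duplicates a field of the histogram point, while `to_max` yields information the histogram buckets alone do not give exactly.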

The example could export one histogram as metric_by_a_and_b{attributeA,attributeB} and another as metric_by_a_and_c{attributeA,attributeC}, maybe? I would emphasize that when doing this kind of output, separate metric names MUST be used to avoid metric data being recombined with itself.
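A small sketch of why separate names matter, using sums rather than histograms to keep it short. The attribute values, the `project` helper, and the event data are hypothetical; only the metric names come from the comment above.

```python
from collections import defaultdict


def project(events, keep_keys):
    """Aggregate event values by a subset of their attributes."""
    streams = defaultdict(int)
    for value, attrs in events:
        key = tuple((k, attrs[k]) for k in keep_keys)
        streams[key] += value
    return dict(streams)


# Hypothetical events: (value, attributes).
EVENTS = [
    (10, {"attributeA": "x", "attributeB": "p", "attributeC": "m"}),
    (20, {"attributeA": "x", "attributeB": "q", "attributeC": "m"}),
    (30, {"attributeA": "y", "attributeB": "p", "attributeC": "n"}),
]

# Each projection is exported under its own metric name.
OUTPUT = {
    "metric_by_a_and_b": project(EVENTS, ("attributeA", "attributeB")),
    "metric_by_a_and_c": project(EVENTS, ("attributeA", "attributeC")),
}

true_total = sum(value for value, _ in EVENTS)
per_metric_totals = {name: sum(s.values()) for name, s in OUTPUT.items()}
# If both projections shared one name, a backend summing across attributes
# would count every event twice.
combined_if_same_name = sum(per_metric_totals.values())
```

Each projection on its own adds up to the true total (60 here), but merging the two under a single name would yield 120, which is the "recombined with itself" problem.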

I updated it to use request size. I agree that the histogram already has a sum. I've also added some caveats to the image to denote that it's meant to show the power, not a practical scenario.

PTAL at the new verbiage. Also, I don't want to throw the baby out with the bathwater here. Look at the whole context of the doc as an introduction to the space of metrics. We need to both:

Give people a mental model of why/how Instruments differ from Metric.

Give people a grounding in what mapping from Event => Metric Stream looks like and the decisions that need to be made.

We do not need to specify the OTel API or its behavior here. We only need to specify what concepts are allowed in OTel metric streams. Lots of these nuances and details belong in the API docs.

I see these docs serving two users primarily:

API authors looking to generate OTLP by mapping their Events/Instruments into streams.

Exporter authors looking to consume OTLP by mapping streams into their backend timeseries.

Instrumentation users who are generating metrics should be using the API specification.

SergeyKanzhelev

LGTM except for the example in the picture. I suggest fixing it before merging.

reyang

LGTM.

SergeyKanzhelev · 2021-04-20T16:45:05Z

@jmacd do you want more changes in PR or merge as-is for now?

open-telemetry#1614)

* Expand event-model description to more clearly delineate instrument vs. metric stream and some reasoning behind it.
* Fix lint.
* Address comments.
* fix spelling error in drawing.
* Crop Image
* Fix label on model layer image.
* Tweak phrases of instruments and point directly at api specification.
* Add a little caveat for word-lawyers
* Updates from review.
* Fix lint.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>

Expand event-model description to more clearly delineate instrument v…

e2b00e5

…s. metric stream and some reasoning behind it.

jsuereth requested review from a team and victlu April 13, 2021 14:21

github-actions bot assigned SergeyKanzhelev Apr 13, 2021

jsuereth requested a review from jmacd April 13, 2021 14:21

Fix lint.

7beeda8

punya reviewed Apr 13, 2021

jsuereth added 2 commits April 13, 2021 10:57

Address comments.

a55a904

fix spelling error in drawing.

b02c148

SergeyKanzhelev reviewed Apr 17, 2021

SergeyKanzhelev reviewed Apr 17, 2021

SergeyKanzhelev approved these changes Apr 17, 2021

jsuereth and others added 3 commits April 17, 2021 11:53

Merge remote-tracking branch 'otel/main' into wip-instruments

396b71c

Crop Image

ff221ab

Fix label on model layer image.

4974fc8

sab-regime approved these changes Apr 19, 2021

jmacd reviewed Apr 20, 2021

jsuereth added 4 commits April 20, 2021 09:49

Tweak phrases of instruments and point directly at api specification.

b162a83

Add a little caveat for word-lawyers

67d12a8

Updates from review.

ab97a94

Fix lint.

7063eef

reyang approved these changes Apr 20, 2021

bogdandrutu approved these changes Apr 20, 2021

Merge branch 'main' into wip-instruments

42772a7

jmacd merged commit d0d2b48 into open-telemetry:main Apr 22, 2021

jsuereth deleted the wip-instruments branch April 22, 2021 15:46
