Allow all metric conventions to be either synchronous or asynchronous. #2458

fstab · 2022-03-30T09:01:25Z

Changes

It should be an implementation detail whether metric instruments are synchronous or asynchronous.

As an example, for many Java runtime metrics there will be two ways to implement them: Using JFR events, or using JMX. JFR-based implementations will likely be synchronous, while JMX-based implementations will likely be asynchronous. The semantic conventions spec should be defined in a way to allow compatible implementations both ways.

This is generally true, not just for Java runtime metrics.

This PR removes "synchronous" and "asynchronous" from the specification of semantic conventions for metrics.

jmacd · 2022-03-30T17:36:12Z

I would suggest that whether metric instruments are synchronous or asynchronous is always implementation detail. You should be able to switch between these instrument forms for performance reasons, and temporality controls are available for exporters to control what the consumer receives.

Moreover, for the OTLP exporter specifically, the supported temporality preferences will give consistent temporality output, even when the underlying implementation changes between synchronous and asynchronous instruments.

trask · 2022-03-30T18:20:38Z

@fstab I sent fstab#1 to your PR to expand this to all metric conventions (and consolidate #2451)

fstab · 2022-03-30T18:46:45Z

Thanks Trask, I merged it, and rebased to the latest origin/main.

trask · 2022-03-30T19:15:10Z

@fstab can you update the PR title and description to reflect the expanded scope?

bogdandrutu

I would like to clarify the "semantic convention" rational to use the "instrument" type instead of the Aggregation type in the table. For OTLP consumers it is irrelevant the instrument type (since someone can use a view to change this), so what is the correct way to define the "semantic convention"? Who is the user/consumer of these? Etc.

fstab · 2022-03-31T14:30:58Z

@trask I updated the title and description, and I rebased on main again. Should I squash the commits or will this be done during merge?

jmacd · 2022-03-31T15:30:08Z

I would like to clarify the "semantic convention" rational to use the "instrument" type instead of the Aggregation type in the table. For OTLP consumers it is irrelevant the instrument type (since someone can use a view to change this), so what is the correct way to define the "semantic convention"? Who is the user/consumer of these? Etc.

Opinions here: The "user" is the one who decides to link in a standards-conventional library of code; the standards conventions tell them what kind of view they should expect to be able to write. The Aggregation is user-controllable from that point onward, with the conventional instrument telling you which Aggregation to expect by default (but not which temporality).

I'm so pleased that this issue has come to the head, because the point of temporality is that it is non-semantic in nature. In other words--its a performance and cost trade. I loved hearing this:

JFR-based implementations will likely be synchronous, while JMX-based implementations will likely be asynchronous

Sweet!!

trask · 2022-03-31T21:29:26Z

@trask I updated the title and description, and I rebased on main again. Should I squash the commits or will this be done during merge?

@fstab thx! it will be squashed during merge

I would like to clarify the "semantic convention" rational to use the "instrument" type instead of the Aggregation type in the table. For OTLP consumers it is irrelevant the instrument type (since someone can use a view to change this), so what is the correct way to define the "semantic convention"? Who is the user/consumer of these? Etc.

My basic view (biased as an instrumenter I'm sure) is that the semantic conventions are a set of requirements for compliant instrumentation. This also then defines guarantees for what OTLP consumers / backends can expect from compliant instrumentation under default views / span processors / log processors.

If we want to make a stronger statement about metric aggregation type, we could list both instrument type and aggregation type in the semantic conventions to make it clear that if a user alters the aggregation type of a "standard" metric, it will no longer be considered a "standard" metric.

fstab · 2022-03-31T21:52:18Z

I would like to clarify the "semantic convention" rational to use the "instrument" type instead of the Aggregation type in the table. For OTLP consumers it is irrelevant the instrument type (since someone can use a view to change this), so what is the correct way to define the "semantic convention"? Who is the user/consumer of these? Etc.

This is a good point.

I'm still struggling with the terminology.

We are currently using Instrument as defined in metrics/api.md. Unfortunately there is no terminology to express "a Counter that can either be synchronous or asynchronous", because according to the spec the term Counter is by definition synchronous. We are trying to use Counter without requiring it to be synchronous, but that's not the terminology defined in metrics/api.md.

What we could do is use Point Kind from metrics/datamodel.md instead of Instrument:

Counter -> Monotonic Sum
UpDownCounter -> Gauge
Histogram -> Histogram

The Point Kind has the advantage that it covers both, the synchronous and the asynchronous Instrument.

What do you think?

reyang · 2022-04-01T02:59:13Z

I'm still struggling with the terminology.

@fstab these are the available Aggregation Types (essentially what the SDK could potentially produce):

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#aggregation

Data Model is a super set of what the SDK could produce because it contains extra stuff (e.g. Summary) for compat/interop with other systems (e.g. Prometheus).

Hope this helps!

fstab · 2022-04-01T08:37:38Z

Thanks a lot @reyang.

My understanding is if we specify the aggregation (left column) in the semantic conventions, then implementations could choose any of the compatible instrument types (right column).

Aggregation	Compatible Instrument Types
Monotonic Sum	Counter
	Asynchronous Counter
Non-Monotonic Sum	UpDownCounter
	Asynchronous UpDownCounter
	Asynchronous Gauge
Explicit Bucket Histogram	Histogram

Sounds like for monitoring backends it should be fine to specify aggregations only, because that defines what monitoring backends can expect to be sent over the wire (except for temporality).

For implementers that should be fine too, because they are free to choose any of the compatible Instrument types, as the instrument type would be an implementation detail.

I'm happy to replace Instrument with Aggregation in the semantic conventions.

What do you think?

jmacd · 2022-04-01T16:58:17Z

Correction: @fstab, Asynchronous Gauge instruments are not compatible with Non-monotonic Sum--that's a semantic change. Otherwise, I support the direction you're heading.

Unfortunately there is no terminology to express "a Counter that can either be synchronous or asynchronous", because according to the spec the term Counter is by definition synchronous.

I agree this needs to be fixed. One solution would be e.g. to list "Counter or Asynchronous Counter" everywhere. Maybe "(Asynchronous) Counter" conveys this without new terminology? We might want way to refer to instruments and semantic-equivalents. Counters and Asynchronous Counters are semantic equivalents, and same for UpDownCounter and Asynchronous UpDownCounter. Otherwise, there are no other semantic equivalencies between instruments.

Instrument	Semantically-equivalent instruments
Counter	Asynchronous Counter
UpDownCounter	Asynchronous UpDownCounter
Histogram
Asynchronous Counter	Counter
Asynchronous UpDownCounter	UpDownCounter
Asynchronous Gauge

Would a general guideline that "semantic conventions are written to use "Counter" and "UpDownCounter" but for each of these the asynchronous equivalent may be used" work?

fstab · 2022-04-05T11:30:03Z

I fixed the Markdownlint error by adding a \n to the end of README.md.

dyladan · 2022-04-05T16:05:41Z

First, to get it out of the way I support this PR as JS maintainer. I think this allows quite a bit more freedom for instrumentation authors and different libraries/runtimes may expose metrics in different ways so allowing this flexibility seems like a good idea for the long haul.

Now I'd like to address the disagreement I see here. It seems like there is a fundamental disagreement here on who the semantic conventions are for. Is it meant to describe what the instrumentation authors should do, or is it meant to describe what the backend processors should expect to see? This question hasn't been that important until now because in tracing there was no mechanism like views which could cause those two to be in disagreement. Now with views, it is possible for the user to configure metrics such that they disagree with the semantic conventions. I think this is a question that we need to resolve and clearly document, because without it we are likely to see these disagreements again in the future, and even worse than that a backend may incorrectly process or display data based on the semantic conventions in particular configurations. I think it can be handled outside this PR, but I thought it was important to address.

jmacd · 2022-04-05T16:18:29Z

semantic conventions are useless for backends

Are you concerned about a potential change of aggregation or change of temporality?

The SDK has temporality controls so I think this is not the concern you're having.

If the concern is about change of aggregation, isn't that the whole point of a Views feature in our SDKs? Viewing a Histogram instrument using the SUM, COUNT, MIN, or MAX aggregation, for example--the user does this on purpose, so I don't see a problem.

jmacd · 2022-04-05T16:44:51Z

meant to describe what the instrumentation authors should do, or is it meant to describe what the backend processors should expect to see?

The way I understand it, the semantic conventions are meant to indicate 1 of 4 semantic categories.

Any aggregation that is semantically compatible with the instrument category is meaningful. I believe our job as the authors of semantic conventions and a data model is to define what is meaningful, not what is useful. A simple backend that faithfully displays each kind of aggregation can make a useful presentation out of any data, which should convey meaning to a user provided that the aggregation was semantically compatible with the instrument dictated by semantic conventions. The user controls the View, and the user receives the intended meaning for whatever use they have. Does this answer your concerns @dyladan and @bogdandrutu?

dyladan · 2022-04-05T16:50:12Z

Does this answer your concerns @dyladan and @bogdandrutu?

I think we are in agreement, but @bogdandrutu's concern seems different than what mine was. It seems to me that @bogdandrutu feels that the semconv should be able to be interpreted exactly by a backend to decide what data to display and how to display it. You seem to feel that the semconv describes what the data means, but not necessarily what its exact representation needs to be and requires the backend to have some flexibility. FWIW I think you and I are interpreting the meaning of semantic conventions the same way. My point is simply that this needs to be agreed on by everyone or we're going to see these arguments again.

reyang · 2022-04-07T19:00:36Z

Does this answer your concerns @dyladan and @bogdandrutu?

I think we are in agreement, but @bogdandrutu's concern seems different than what mine was. It seems to me that @bogdandrutu feels that the semconv should be able to be interpreted exactly by a backend to decide what data to display and how to display it. You seem to feel that the semconv describes what the data means, but not necessarily what its exact representation needs to be and requires the backend to have some flexibility. FWIW I think you and I are interpreting the meaning of semantic conventions the same way. My point is simply that this needs to be agreed on by everyone or we're going to see these arguments again.

+1, I've created #2475 to track this issue and will raise it to the semantic convention SIG. This is a
wider issue which doesn't have to block this PR.

@fstab there is one CI job failing? Would you fix it? Thanks!

Signed-off-by: Fabian Stäber <fabian@fstab.de>

fstab · 2022-04-07T20:23:09Z

Fixed the make table-check error and rebased to main.

jamesmoessis

I support this and agree that sync vs async instruments are an implementation detail and the semantic convention doesn't need to specify it. One example that comes to mind is http.server.active_requests. This was listed as async in the semantic convention, but I can imagine many instrumentations use a synchronous instrument to record it depending on implementation.

reyang · 2022-04-08T17:14:31Z

There is a transient failure which was caused by the web.archive.org maintenance ...

@fstab I think you've done your part, let's wait and I'll trigger the CI again once web.archive.org has recovered.

fstab requested review from a team March 30, 2022 09:01

github-actions bot assigned arminru Mar 30, 2022

fstab force-pushed the jvm-metrics-can-be-synchronous branch from 922a501 to 9be3f8e Compare March 30, 2022 09:03

fstab mentioned this pull request Mar 30, 2022

Change http.server.active_requests from asynchronous to synchronous UpDownCounter #2451

Closed

reyang added spec:metrics Related to the specification/metrics directory area:semantic-conventions Related to semantic conventions labels Mar 30, 2022

jmacd approved these changes Mar 30, 2022

View reviewed changes

fstab force-pushed the jvm-metrics-can-be-synchronous branch 2 times, most recently from 4faebf4 to c28ba84 Compare March 30, 2022 18:57

trask approved these changes Mar 30, 2022

View reviewed changes

jack-berg approved these changes Mar 30, 2022

View reviewed changes

bogdandrutu previously requested changes Mar 30, 2022

View reviewed changes

fstab force-pushed the jvm-metrics-can-be-synchronous branch from 67eae26 to 5727162 Compare March 31, 2022 14:25

fstab changed the title ~~Don't specify whether JVM metrics are synchronous or asynchronous~~ Allow all metric conventions to be either synchronous or asynchronous. Mar 31, 2022

reyang approved these changes Mar 31, 2022

View reviewed changes

fstab force-pushed the jvm-metrics-can-be-synchronous branch 2 times, most recently from b82d948 to 87f4e80 Compare April 1, 2022 21:08

fstab force-pushed the jvm-metrics-can-be-synchronous branch from b59993f to 948906e Compare April 5, 2022 11:28

dyladan approved these changes Apr 5, 2022

View reviewed changes

reyang mentioned this pull request Apr 7, 2022

Decide on how we organize and interpret semconv #2475

Open

fstab and others added 7 commits April 7, 2022 22:13

Allow JVM metrics to be synchronous

9923b9e

Signed-off-by: Fabian Stäber <fabian@fstab.de>

Expand to all metric conventions

2cb0229

Remove asynchronous from system-metrics

69dd080

Signed-off-by: Fabian Stäber <fabian@fstab.de>

Reformat tables

ad73df1

Signed-off-by: Fabian Stäber <fabian@fstab.de>

Explain meaning of "instrument type"

fbafe09

Signed-off-by: Fabian Stäber <fabian@fstab.de>

Fix markdownlint error

df64aaf

Signed-off-by: Fabian Stäber <fabian@fstab.de>

Fix markdownlint error

8b5d6a5

Signed-off-by: Fabian Stäber <fabian@fstab.de>

fstab force-pushed the jvm-metrics-can-be-synchronous branch from af98fae to 8b5d6a5 Compare April 7, 2022 20:22

trask mentioned this pull request Apr 7, 2022

"New contributor"-friendly setting for running workflows open-telemetry/community#1023

Merged

jamesmoessis approved these changes Apr 8, 2022

View reviewed changes

Merge branch 'main' into jvm-metrics-can-be-synchronous

28b1e63

reyang closed this Apr 8, 2022

reyang reopened this Apr 8, 2022

Merge branch 'main' into jvm-metrics-can-be-synchronous

c1f242b

reyang merged commit 7e35bec into open-telemetry:main Apr 8, 2022

jamesmoessis mentioned this pull request Apr 14, 2022

Metrics table generation from yaml open-telemetry/build-tools#79

Merged

trask mentioned this pull request Apr 17, 2022

Database connection pool metrics semantic conventions #2273

Merged

jmacd mentioned this pull request Apr 18, 2022

Initial addition of Kafka metrics. #2485

Merged

reyang mentioned this pull request May 6, 2022

Add redis specification #2525

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow all metric conventions to be either synchronous or asynchronous. #2458

Allow all metric conventions to be either synchronous or asynchronous. #2458

fstab commented Mar 30, 2022 •

edited

Loading

jmacd commented Mar 30, 2022

trask commented Mar 30, 2022

fstab commented Mar 30, 2022 •

edited

Loading

trask commented Mar 30, 2022

bogdandrutu left a comment

fstab commented Mar 31, 2022

jmacd commented Mar 31, 2022

trask commented Mar 31, 2022

fstab commented Mar 31, 2022 •

edited

Loading

reyang commented Apr 1, 2022

fstab commented Apr 1, 2022 •

edited

Loading

jmacd commented Apr 1, 2022

fstab commented Apr 5, 2022

dyladan commented Apr 5, 2022

jmacd commented Apr 5, 2022

jmacd commented Apr 5, 2022

dyladan commented Apr 5, 2022

reyang commented Apr 7, 2022

fstab commented Apr 7, 2022

jamesmoessis left a comment

reyang commented Apr 8, 2022

Allow all metric conventions to be either synchronous or asynchronous. #2458

Allow all metric conventions to be either synchronous or asynchronous. #2458

Conversation

fstab commented Mar 30, 2022 • edited Loading

Changes

jmacd commented Mar 30, 2022

trask commented Mar 30, 2022

fstab commented Mar 30, 2022 • edited Loading

trask commented Mar 30, 2022

bogdandrutu left a comment

Choose a reason for hiding this comment

fstab commented Mar 31, 2022

jmacd commented Mar 31, 2022

trask commented Mar 31, 2022

fstab commented Mar 31, 2022 • edited Loading

reyang commented Apr 1, 2022

fstab commented Apr 1, 2022 • edited Loading

jmacd commented Apr 1, 2022

fstab commented Apr 5, 2022

dyladan commented Apr 5, 2022

jmacd commented Apr 5, 2022

jmacd commented Apr 5, 2022

dyladan commented Apr 5, 2022

reyang commented Apr 7, 2022

fstab commented Apr 7, 2022

jamesmoessis left a comment

Choose a reason for hiding this comment

reyang commented Apr 8, 2022

fstab commented Mar 30, 2022 •

edited

Loading

fstab commented Mar 30, 2022 •

edited

Loading

fstab commented Mar 31, 2022 •

edited

Loading

fstab commented Apr 1, 2022 •

edited

Loading