Define how providers handle identical creation calls #4146

pellared · 2024-07-15T09:39:43Z

Towards open-telemetry/opentelemetry-go#3368

Why why

Most of the SDKs ignore instrumentation scope attributes (C++ and PHP are exceptions). This is a functional blocker for following issues:

[Logs] Reporting logger name, log bridge name and version semantic-conventions#1550
Semantic convention for name and version of the instrumented library semantic-conventions#1229
Define standard, vendor-agnostic, language-agnostic instrumentation identity attributes semantic-conventions#1094

Why

The specification does not explicitly stay how the SDK should handle tracer/meter/logger creation when the user calls it with the same identifying parameters (name, version, scheme_url), but when other parameters are different (currently only attributes).

Because of this the Go SDK ignores instrument scope attributes (see: open-telemetry/opentelemetry-go#3368).
I think the same is true for Java SDK, .NET SDK, Python SDK, JS SDK.

C++ SDK creates only a single tracer for many "identical" creation calls (the instrumentation scope attributes value comes from the first "identical" tracer creation call).

PHP SDK creates always a new instance of a tracer for many "identical" creation calls (with potentially different instrumentation scope attributes).

Right now, the specification does not require the SDK to handle scope attributes. Currently, having them ignored is specification compliant.

What

SDK MUST emit instrumentation scope attributes that were passed during first "identical" instrumentation creation (which is a correct way of using the API)
SDK handles next "identical" instrumentation creations in the way they want (e.g. return always a new instrument instance like in PHP, JS or return the previously created instrument like in C++, Go). However, it MUST be documented that this behavior may change in future releases (which is currently seen as incorrect usage of the API - still, in future we may see it as a valid usage of the API).

I think that we do NOT need to emit warnings when user creates instruments with the same identifying parameters (name, version, scheme_url), but with different attributes until we feedback that such issue occurs frequently enough.

The most important functional addition to SDK spec is

The first identical INSTRUMENT MUST be associated with all (both identifying and non-identifying) passed parameters.

…dled

reyang · 2024-07-15T19:59:21Z

@pellared could you update the compliance matrix? https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md

pellared · 2024-07-15T20:10:53Z

@reyang, @jmacd, can you double-check the changelog changes?

could you update the compliance matrix?

I see that there are entries for it.

Get a Tracer with scope attributes
get_meter accepts attributes.
LoggerProvider.Get Logger accepts attributes

I see that according to the compliance matrix PHP and C++ SDKs are supposed to support this scenario.

@open-telemetry/cpp-maintainers, @open-telemetry/php-maintainers, maybe you can take a look if my proposal is complaint with your SDKs existing behavior?

EDIT:
I think that C++ SDK would be compliant with my proposal and PHP SDK would NOT.

yurishkuro · 2024-07-15T20:59:21Z

when the user calls it with the same identifying parameters (name, version, scheme_url), but when other parameters are different (currently only attributes).

Sorry, I am not following the premise of the problem. The API for getting the instrument does not accept attributes, only the scope & schema. What other attributes are you referring to? And why is it necessary to require that the returned object is "the same" - it sounds like a non-functional, performance-related requirement but it's theoretically possible to provide identical performance even by returning different objects (e.g. unique object per thread).

pellared · 2024-07-16T05:11:45Z

The API for getting the instrument does not accept attributes, only the scope & schema

They do accept instrument scope attributes.

attributes (optional): Specifies the instrumentation scope attributes to associate with emitted telemetry. This API MUST be structured to accept a variable number of attributes, including none.

why is it necessary to require that the returned object is "the same"

Consistent behavior for handling parameters which are described as "identifying" the instruments.

The term identical applied to Tracer/Meter/Loggers describes instances where all
identifying fields are equal. The term distinct applied to Tracer/Meter/Loggers describes
instances where at least one identifying field has a different value.

pellared · 2024-07-16T16:20:11Z

SIG meeting notes:

There is a confusion why instrumentation scope attributes cannot be part of "identifying parameters ".
It should be a SHOULD (not MUST). It think SDKs should document that this is a not supported use case and behavior may change. Also it once mentioned that some SDKs may want to partition the identity to e.g. threads or do other things to reduce the synchronization.
(separate PR) The SDK SHOULD warn when the user calls with the same identifying parameters (name, version, scheme_url), but when other parameters are different. The JS would not like it, because they do not want to "track" the created instruments. They always create a new one.

@tigrannajaryan, do you recall some previous conversations and decisions why instrumentation scope attributes cannot be part of "identifying parameters "?

pellared · 2024-07-16T16:29:28Z

After the SIG meeting, I think it would be better to make smaller steps to make some progress. My proposal is that:

SDK MUST emit instrumentation scope attributes that were passed during first "identical" instrumentation creation (which is a correct way of using the API)
SDK handles next "identical" instrumentation creations in the way they want (return new unique instance, return the previously created instrument). However, it MUST be documentee that this behavior may change in future releases (which is currently seen as incorrect usage of the API - still, in future we may see it as a valid usage of the API).

I think that we do NOT need to emit warnings when user creates instruments with the same identifying parameters (name, version, scheme_url), but with different attributes until we feedback that such issue occurs frequently enough.

Does it look reasonable?

CC @jack-berg, @jmacd, @reyang, @jsuereth

pellared · 2024-07-17T10:16:45Z

I changed the PR (together with its description) based on feedback from the SIG meeting.

specification/logs/sdk.md

specification/metrics/sdk.md

specification/trace/sdk.md

jmacd · 2024-07-23T15:44:27Z

Picking up a fragment of an argument that was presented around this topic in last week's OTel Spec SIG:

I had stated a desire to see Scope attributes used, in particular I expect Scope attributes to reduce the cost of repetitive attributes. Where we expect to see the same attribute on a span/metric/log over and over again, we can optimize the cost of telemetry by placing attributes into the scope.

The counter-argument was that it would cause trouble for Prometheus? What should Prometheus do with scope attributes?, goes the question. My opinion is that scope attributes should be elevated to metric attributes, since their purpose is to lower the cost of telemetry. A Prometheus user with an instrumentation library having several instruments may observe themselves repeatedly applying the same attributes to every instrument. In this situation, the Prometheus user can lower the cost of recording metrics and the cost of OTLP transport by adopting a scope attribute instead of repeated metric attributes. If you have an attribute like A=B applied to every metric instrument in an instrumentation library, it would be semantically no change to replace the repeated attribute with the same attribute on the scope.

Working backwards from this position, it seems that SDKs simply need to record scope attributes--whether or not they recognize "identical" calls or not.

tsloughter · 2024-07-24T09:25:47Z

Bumping what @pellared asked in an earlier comment, does anyone know why attributes weren't included as identifying if its declared that:

It is a user error to create Tracers with different attributes but the same identity.

That is pretty contradictory, "they are non-identifying, but if you use them as non-identifying it is your fault for doing it wrong and we won't tell you what will happen" :)

tsloughter · 2024-07-24T09:32:15Z

Also wanted to give a bigger 👍 than just on @jmacd's comment to scope attributes. I get asked about that a good bit for Erlang/Elixir.

Just realized we meant different scopes, though I get asked about both. "Scope" meaning both Instrumentation Scope attributes and the current context having attributes that get attached -- the latter meaning attributes would go on signal data generated by any Instrumentation Scope that is created using the same context.

pellared · 2024-07-24T10:12:16Z

why attributes weren't included as identifying

AFAIK because scope attributes were introduced after instrumentation scope name, version, and scheme URL were specified as identifying. As far as I understood from the SIG meeting, adding a new parameter to the identifying parameters list was considered as a breaking change.

Personally, I do not find it as a breaking change. At least from my database experience adding a column to a primary key it is an acceptable change/migration.

However, in order to include instrumentation scope attributes to the identifying parameters list we would need to define the equality of attribute collections (I do not think it is specified). In my opinion, the order or attributes should not affect the equality of attribute ([k1: v1, k2: v2] should be equal to [k2: v2, k1: v1]) (especially that some languages model the attributes as map/hashset).

Still, I was thinking of proposing a "baby-step" first that will not be against the proposal of adding instrumentation scope attributes as identifying later.

CC @tigrannajaryan

yurishkuro · 2024-07-24T17:31:15Z

I would prefer to split the discussion away from the concept of "identifying". If a user calls provider.Get("foo", "1.0", {"bar": "baz"}) they expect the attributes to be captured in the telemetry. It has nothing to do with "identifying". If SDK has some magical way of returning the same handle and yet still capturing user data (scope attributes), they can roll with it. They can also return different handles for identical inputs if they have reasons to do so (perhaps the buffering associated with the original handle has expired and was deallocated). I don't see why the spec needs to dictate the behavior in terms of "identity" of the handles, it should only describe the resulting behavior: the user-provided data must be preserved.

Now, there is a special case here for metrics as some of these implementation choices might affect Prometheus aggregations, or the provider could be configured to not include scope attributes in the metrics (technically this is also possible with other signals). But again, it's up to the SDK to decide which handle to return. If it already knows from its internal configuration that scope attributes will not make it to the exported metrics, it can return the same handle and ignore the scope attributes.

pellared · 2024-07-24T21:43:17Z

@yurishkuro, anything needed to be changed in the PR? I put the "identity" wording as it already exists in the API specs and I tried my best to capture what you described. I am open to suggestions. So far I see no actionable feedback based on comments. I think that some of the feedback is not directly related to proposed changes but the way how "get instrument APIs" are specified.

yurishkuro · 2024-07-24T22:49:53Z

@pellared specifically, I would remove, instead of add more, references to "identifying". I think this concept is irrelevant and only creates more confusion, trying to dictate implementation instead of specifying behavior relevant to the user.

pellared · 2024-07-25T07:58:15Z

I would remove, instead of add more, references to "identifying". I think this concept is irrelevant and only creates more confusion, trying to dictate implementation instead of specifying behavior relevant to the user.

I want to know that the "identifying fields" term is (currently) relevant in duplicate instrument registration section.

In order, to remove "identifying fields", we would need to define equality for Attributes Collection first (so that duplicate instrument registration happens when all passed input is equal).

For now, I am closing this PR as I think the issue needs more discussions. I also think that resolving the issue may involve having more then one PR. I have created #4160 to have discussions in an issue rather than a PR.

pellared added 2 commits July 15, 2024 11:13

Clarify in API spec that instrumentation scope attributes MUST be han…

a76d952

…dled

Define how SDKs should handle identifacal creation calls

7086186

pellared changed the title ~~Scope attrs~~ Define how providers handle identical creation calls Jul 15, 2024

pellared added 3 commits July 15, 2024 11:42

Update changelog

289eea1

Remove leftover

df9b94f

Refine description

f349620

jmacd approved these changes Jul 15, 2024

View reviewed changes

pellared added 2 commits July 15, 2024 21:34

Use underscore for emphasis

6425b5e

Fix MD049

8f55301

pellared marked this pull request as ready for review July 15, 2024 19:51

pellared requested review from a team July 15, 2024 19:51

github-actions bot assigned arminru Jul 15, 2024

reyang approved these changes Jul 15, 2024

View reviewed changes

Update CHANGELOG.md

867ee02

Merge branch 'main' into scope-attrs

3a66a11

pellared mentioned this pull request Jul 15, 2024

Add support for instrumentation scope attributes during meter/tracer/logger creation open-telemetry/opentelemetry-go#3368

Closed

Merge branch 'main' into scope-attrs

2f0f41d

pellared marked this pull request as draft July 16, 2024 16:30

pellared added 2 commits July 17, 2024 11:40

Merge branch 'main' into scope-attrs

df252e0

Loosen the requirements

275fcac

pellared added 2 commits July 17, 2024 12:09

Remove EOL

7943abc

Remove not needed spaces

8a12ad9

Fix MD049/emphasis-style Emphasis style should be consistent

81ac51c

pellared marked this pull request as ready for review July 17, 2024 10:21

github-actions bot assigned yurishkuro Jul 17, 2024

pellared requested review from jmacd and reyang July 17, 2024 10:29

This was referenced Nov 6, 2024

[Logs] Reporting logger name, log bridge name and version open-telemetry/semantic-conventions#1550

Open

Define standard, vendor-agnostic, language-agnostic instrumentation identity attributes open-telemetry/semantic-conventions#1094

Closed

Merge branch 'main' into scope-attrs

9300142

pellared commented Jul 23, 2024

View reviewed changes

specification/logs/sdk.md Outdated Show resolved Hide resolved

specification/metrics/sdk.md Outdated Show resolved Hide resolved

specification/trace/sdk.md Outdated Show resolved Hide resolved

pellared added 3 commits July 23, 2024 10:02

Apply suggestions from code review

a27c611

Update sdk.md

64d9ee3

format

921abb7

Merge branch 'main' into scope-attrs

1d3e5c1

pellared marked this pull request as draft July 25, 2024 08:00

pellared mentioned this pull request Jul 25, 2024

Define how providers handle instrumentation scope attributes and identical creation calls #4160

Closed

pellared closed this Jul 25, 2024

pellared mentioned this pull request Jul 25, 2024

Define attributes equality and make all fields as identifying for Tracer, Meter, Logger, EventLogger #4161

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Define how providers handle identical creation calls #4146

Define how providers handle identical creation calls #4146

pellared commented Jul 15, 2024 •

edited

Loading

reyang commented Jul 15, 2024

pellared commented Jul 15, 2024 •

edited

Loading

yurishkuro commented Jul 15, 2024

pellared commented Jul 16, 2024 •

edited

Loading

pellared commented Jul 16, 2024

pellared commented Jul 16, 2024 •

edited

Loading

pellared commented Jul 17, 2024

jmacd commented Jul 23, 2024

tsloughter commented Jul 24, 2024

tsloughter commented Jul 24, 2024

pellared commented Jul 24, 2024 •

edited

Loading

yurishkuro commented Jul 24, 2024 •

edited

Loading

pellared commented Jul 24, 2024

yurishkuro commented Jul 24, 2024 •

edited

Loading

pellared commented Jul 25, 2024 •

edited

Loading

Define how providers handle identical creation calls #4146

Define how providers handle identical creation calls #4146

Conversation

pellared commented Jul 15, 2024 • edited Loading

Why why

Why

What

reyang commented Jul 15, 2024

pellared commented Jul 15, 2024 • edited Loading

yurishkuro commented Jul 15, 2024

pellared commented Jul 16, 2024 • edited Loading

pellared commented Jul 16, 2024

pellared commented Jul 16, 2024 • edited Loading

pellared commented Jul 17, 2024

jmacd commented Jul 23, 2024

tsloughter commented Jul 24, 2024

tsloughter commented Jul 24, 2024

pellared commented Jul 24, 2024 • edited Loading

yurishkuro commented Jul 24, 2024 • edited Loading

pellared commented Jul 24, 2024

yurishkuro commented Jul 24, 2024 • edited Loading

pellared commented Jul 25, 2024 • edited Loading

pellared commented Jul 15, 2024 •

edited

Loading

pellared commented Jul 15, 2024 •

edited

Loading

pellared commented Jul 16, 2024 •

edited

Loading

pellared commented Jul 16, 2024 •

edited

Loading

pellared commented Jul 24, 2024 •

edited

Loading

yurishkuro commented Jul 24, 2024 •

edited

Loading

yurishkuro commented Jul 24, 2024 •

edited

Loading

pellared commented Jul 25, 2024 •

edited

Loading