
Clarify that attribute keys are unique in collections #2248

Merged
merged 2 commits into open-telemetry:main from unique-attr-keys
Jan 26, 2022

Conversation

tigrannajaryan
Member

Attribute keys must be unique. The key/value pair collections in the specification were always intended to model a map.

There was recent confusion about this. This change clarifies the spec. It is an editorial change and does not change any semantics.

Resolves #2245

@tigrannajaryan tigrannajaryan requested review from a team January 6, 2022 16:31
@yurishkuro
Member

Who is supposed to be responsible for the uniqueness? We cannot impose requirements on the users of the API who can call setAttribute twice with the same key. And streaming implementations of the API cannot enforce key uniqueness (e.g. via last-wins) because they do not hold state.

@tigrannajaryan
Member Author

Who is supposed to be responsible for the uniqueness? We cannot impose requirements on the users of the API who can call setAttribute twice with the same key. And streaming implementations of the API cannot enforce key uniqueness (e.g. via last-wins) because they do not hold state.

I agree, users shouldn't be required to do that. The Trace API already enforces the behavior:

Setting an attribute with the same key as an existing attribute SHOULD overwrite the existing attribute's value.

My understanding of this clause is that uniqueness is enforced (perhaps eventually, but nevertheless). Streaming implementations don't necessarily need to hold state on the sender side to enforce this. They can define the receiving side's semantics in a manner that results in uniqueness of attributes from the receiver's perspective (the "overwrite if exists" rule can be applied at the receiver end).
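To make this concrete, here is a minimal sketch (not real OTel SDK code; the function name is illustrative) of how a receiver could apply the "overwrite if exists" rule to a stream of set-attribute events, so that attributes end up unique even though the stateless sender emitted duplicates:

```python
def fold_attribute_events(events):
    """Fold a stream of (key, value) set-attribute events into a map.

    Later events overwrite earlier ones (last-wins), so the result has
    unique keys even though the sender kept no state.
    """
    attributes = {}
    for key, value in events:
        attributes[key] = value  # overwrite if exists
    return attributes

# The sender may emit the same key twice; the receiver still sees a map.
events = [("http.method", "GET"), ("retry", 1), ("retry", 2)]
assert fold_attribute_events(events) == {"http.method": "GET", "retry": 2}
```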

I was not able to find an equivalent clause in the Metrics API. It should probably be added.

@carlosalberto
Contributor

@yurishkuro Please take a look again.

@yurishkuro
Member

As I already commented, I do not understand what this description refers to. Something needs to enforce this uniqueness. It can be users of the API, the implementation of the API (i.e. SDK), or the backend. OTel cannot impose requirements on the users or the backends, and SDKs may not be able to enforce uniqueness (e.g. streaming API). So what is this clause referring to?

@tigrannajaryan
Member Author

As I already commented, I do not understand what this description refers to. Something needs to enforce this uniqueness. It can be users of the API, the implementation of the API (i.e. SDK), or the backend.

@yurishkuro My reading of the current requirement is that it is the responsibility of the SDK or of the exporter to enforce uniqueness.

If it is not clear we should clarify it, but I think it can be a separate PR. This PR does not contradict that, this PR merely sets the broader expectation from the semantics perspective.

@yurishkuro
Member

The current req says "SHOULD overwrite", so uniqueness is not guaranteed. My concern with the wording in this PR is that it gives the misleading impression that uniqueness is guaranteed / enforced.

@tigrannajaryan
Member Author

The current req says "SHOULD overwrite", so uniqueness is not guaranteed. My concern with the wording in this PR is that it gives the misleading impression that uniqueness is guaranteed / enforced.

That's a good point. I think the only reason it is a SHOULD and not a MUST is that an alternate strategy is possible (drop duplicates), not that duplicates are considered acceptable. I am fine with changing it to MUST, or calling out the uniqueness requirement more clearly (or just linking to the wording introduced by this PR).

Again, I think this PR is more fundamental and it reflects our intent. If the existing spec wording contradicts this PR, then that wording needs to change, not our guarantee of uniqueness, which this PR merely states explicitly.

@yurishkuro
Member

yurishkuro commented Jan 12, 2022

Again, I think this PR is more fundamental and it reflects our intent.

I guess this is where we disagree. First, there is no fundamental problem for backends (at least tracing backends) in supporting multiple attributes with the same key (Jaeger has always allowed that), and I see it as a function of you-get-back-what-you-put-in, i.e. the users are in control of uniqueness. Second, I think there's definitely room for strict maps, but only in situations with atomic writes to the API. E.g. adding an event to a span is atomic, you can't amend the event afterwards, and that gives a clear choke point to enforce uniqueness. A Logging API would be similar (if we had one), and the metrics API probably is as well. However, the Span API allows multiple writes, so as long as the user can call span.setAttribute(k, v) multiple times, they should not expect some magic deduping by default (although it can be opt-in at some point in the pipeline).

Concrete example: Canopy at Meta supports block.annotate(props) API (block == span). Props is a map so uniqueness is enforced in this atomic write, but annotate can be called multiple times, possibly with conflicting keys. The SDK implementation is streaming, i.e. these writes are flushed independently, no state is stored in the block/span. The storage accepts all writes as is, without fiddling with the data. So the only point to enforce uniqueness (if desired) is in the trace reader when higher-level data model is returned inflated from the raw events.
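A rough model of the pipeline described above (the names `annotate` and `read_block` are illustrative stand-ins, not the real Meta API): each `annotate(props)` call is an atomic write with unique keys, storage accepts the raw writes as-is, and uniqueness across writes is only enforced by the reader when it inflates the higher-level model:

```python
raw_writes = []  # stand-in for storage that accepts all writes as-is

def annotate(props):
    """Atomic write: props is itself a map, so its keys are unique."""
    raw_writes.append(dict(props))  # flushed independently, no span state

def read_block():
    """Trace reader: merge the raw writes; for conflicting keys across
    writes, the later write wins."""
    merged = {}
    for write in raw_writes:
        merged.update(write)
    return merged

annotate({"user": "alice", "region": "us"})
annotate({"region": "eu"})  # conflicting key across two atomic writes
assert read_block() == {"user": "alice", "region": "eu"}
```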

jmacd
jmacd previously requested changes Jan 13, 2022
Review comment on specification/common/common.md (outdated)
@jmacd
Contributor

jmacd commented Jan 13, 2022

@yurishkuro I think you can have your interpretation inside a streaming SDK without changing the spec from Tigran's position.

OTel added an array-of-strings attribute so that you could have a multi-valued attribute. If the application intends to report a multi-valued attribute, it should build the values in memory and set them as one attribute. If you want to update the attribute to a new set of values through the span's lifetime, that also works, but the set of attributes remains a map and your attribute has multiple string values. The application is required to keep its working set in memory if it wants to change the list of values for an attribute, since it has to update them all in a single call.
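The array-of-strings pattern can be sketched like this (a toy `Span` class, not the real API): the multi-valued attribute is a single key whose value is a list, updated atomically in one set call, so keys stay unique:

```python
class Span:
    """Toy span: attributes are a plain map with last-wins semantics."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        # Overwrite if exists: the map keeps exactly one entry per key.
        self.attributes[key] = value

span = Span()
span.set_attribute("peer.addresses", ["10.0.0.1"])
# To add a value, the application rebuilds the full list and sets it again:
span.set_attribute("peer.addresses", ["10.0.0.1", "10.0.0.2"])
assert span.attributes["peer.addresses"] == ["10.0.0.1", "10.0.0.2"]
assert len(span.attributes) == 1  # still a single key
```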

The Metrics data model defines more than one way to model data, and there is an analogy to draw with tracing. Metrics defines an event model, wherein each call to a synchronous instrument is an independently observable event. A streaming metric SDK could report each event as an individual record (hey, let's use column compression for this!) and if you're so inclined you could see every change to a counter this way. The data model also defines a "stream" model, which eliminates the individual events and aggregates them together. You can no longer see the individual updates.

For a streaming trace SDK, I would recommend you consider each SetAttribute() call as an immutable event. Consider a string-valued attribute being set multiple times: you will see one event for every modification to the attribute. When all the events are collected, you will see an attribute that was modified several times. But at each point in time, semantically, the attribute has a single value. The consumer of your streaming SDK may see a timeline of updates to the attribute, and you can show all the values that were set, but I would say it's off-spec to consider the set of values that were set to be a multi-valued map.
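An illustrative sketch of that view (assumed event shapes, not a real streaming SDK): each SetAttribute call is an immutable, timestamped event, and although the full timeline of modifications stays visible, resolving the attribute at any instant yields exactly one value:

```python
def value_at(events, key, at_time):
    """Return the single value `key` had at `at_time`, or None if unset.

    `events` is a time-ordered list of (timestamp, key, value) tuples,
    one per SetAttribute call.
    """
    result = None
    for ts, k, v in events:
        if ts > at_time:
            break
        if k == key:
            result = v  # the later event overwrites the earlier one
    return result

# The attribute was modified three times, but at t=6 it has ONE value:
timeline = [(1, "state", "init"), (5, "state", "running"), (9, "state", "done")]
assert value_at(timeline, "state", 6) == "running"
assert value_at(timeline, "state", 0) is None
```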

@tigrannajaryan
Member Author

But at each point in time, semantically, the attribute has a single value.

I think this is the key observation. I intentionally avoid specifying who enforces these semantics and how, because there are many different places and ways to achieve it. As long as we explicitly state that it is a requirement, how exactly it is achieved can and should be left undefined, to give implementations enough leeway to do the best they can.

I will follow up with a PR for OTLP in the proto repo, specifically requiring that senders must not have duplicates in the attributes list.

Other implementations using other protocols (e.g. a streaming SDK with an exporter that uses an event-based protocol) can move the uniqueness enforcement requirement to the receiver/backend.

@yurishkuro
Member

I mostly agree with what @jmacd is saying. What I see as problematic is that this is clearly a very nuanced topic, as evidenced by this discussion, but the spec change makes a blunt statement that pretends these nuances do not exist. If I were reading this spec for the first time, I would still have the exact same questions I raised here, because the wording does not answer any of them.

@tigrannajaryan
Member Author

I mostly agree with what @jmacd is saying. What I see as problematic is that this is clearly a very nuanced topic, as evidenced by this discussion, but the spec change makes a blunt statement that pretends these nuances do not exist. If I were reading this spec for the first time, I would still have the exact same questions I raised here, because the wording does not answer any of them.

This is a great feedback. Let me do another round and see if I can add more clarity and capture these nuances.

@tigrannajaryan
Member Author

@yurishkuro @jmacd I added some wording to address your comments. Please have another look and see if it helps.

@tigrannajaryan tigrannajaryan requested a review from jmacd January 20, 2022 01:26
@tigrannajaryan
Member Author

CI failure doesn't seem to be related to my change in any way.

@tigrannajaryan tigrannajaryan dismissed jmacd’s stale review January 24, 2022 18:06

Reworded, should be OK now. Please review again.

@tigrannajaryan
Member Author

@jmacd PTAL.

@carlosalberto
Contributor

Tried to merge but it won't let me, as there's one outdated unresolved conversation, which cannot be found now and hence cannot be resolved.

Wondering if anybody has run into this before?

Attribute keys must be unique. The key/value pair collections in the specification
were always intended to model a map.

There was recent confusion about this. This change clarifies the spec.

Resolves open-telemetry#2245
@tigrannajaryan tigrannajaryan merged commit abc1f97 into open-telemetry:main Jan 26, 2022
@tigrannajaryan tigrannajaryan deleted the unique-attr-keys branch January 26, 2022 21:47
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
Successfully merging this pull request may close these issues.

Specify whether keys in attribute lists are expected to be unique
9 participants