Clarify that attribute keys are unique in collections #2248
Who is supposed to be responsible for the uniqueness? We cannot impose requirements on the users of the API who can call setAttribute twice with the same key. And streaming implementations of the API cannot enforce key uniqueness (e.g. via last-wins) because they do not hold state.
I agree; users shouldn't be required to do that. The Trace API already enforces the behavior:
My understanding of this clause is that the uniqueness is enforced (perhaps eventually, but nevertheless). Streaming implementations don't necessarily need to hold state on the sender side to enforce this. They can define the semantics of the receiving-side rules in a manner that results in uniqueness of attributes from the receiver's perspective (the "overwrite if exists" rule can be applied at the receiver end). I was not able to find an equivalent clause in the Metrics API. It should probably be added.
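A minimal sketch of how "overwrite if exists" could be applied at the receiver end, assuming a stateless sender that emits one event per attribute write. The `(span_id, key, value)` event shape and the name `fold_events` are hypothetical and are not part of any OpenTelemetry API or protocol.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Tuple

# Hypothetical wire shape for one attribute write: (span_id, key, value).
AttributeEvent = Tuple[str, str, Any]


def fold_events(events: Iterable[AttributeEvent]) -> Dict[str, Dict[str, Any]]:
    """Build per-span attribute maps on the receiving side.

    The sender keeps no state; the receiver applies "overwrite if exists",
    so each span ends up with at most one value per key.
    """
    spans: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for span_id, key, value in events:
        spans[span_id][key] = value  # a later write replaces an earlier one
    return dict(spans)


stream = [
    ("span-1", "http.method", "GET"),
    ("span-1", "retry.count", 1),
    ("span-1", "retry.count", 2),  # same key written twice by the user
]
print(fold_events(stream))
```

The sender here never inspects earlier writes, yet the receiver still sees unique keys per span.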
@yurishkuro Please take a look again.
As I already commented, I do not understand what this description refers to. Something needs to enforce this uniqueness. It can be users of the API, the implementation of the API (i.e. SDK), or the backend. OTel cannot impose requirements on the users or the backends, and SDKs may not be able to enforce uniqueness (e.g. streaming API). So what is this clause referring to?
@yurishkuro My reading of the current requirement is that it is the responsibility of the SDK or of the exporter to enforce uniqueness. If it is not clear we should clarify it, but I think it can be a separate PR. This PR does not contradict that, this PR merely sets the broader expectation from the semantics perspective.
The current req says "SHOULD overwrite", so uniqueness is not guaranteed. My concern with the wording in this PR is that it gives a misleading impression that it is guaranteed / enforced.
That's a good point. I think the only reason it is a SHOULD and not a MUST is because an alternate strategy is possible (drop duplicates), not because duplicates are considered acceptable. I am fine with changing it to MUST or calling out the uniqueness requirement more clearly (or just linking to the wording introduced by this PR). Again, I think this PR is more fundamental and it reflects our intent. If the existing spec wording contradicts this PR, then that wording needs to change, not our guarantee of uniqueness, which this PR merely states explicitly.
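To make the two strategies concrete, here is a small sketch in plain Python. It assumes that "drop duplicates" means ignoring a later write to a key that is already set; that reading, and both function names, are illustrative assumptions rather than spec wording.

```python
from typing import Any, Dict, Iterable, Tuple

Pair = Tuple[str, Any]


def overwrite_existing(pairs: Iterable[Pair]) -> Dict[str, Any]:
    """Last write wins: a repeated key replaces the stored value."""
    result: Dict[str, Any] = {}
    for key, value in pairs:
        result[key] = value
    return result


def drop_later_duplicates(pairs: Iterable[Pair]) -> Dict[str, Any]:
    """First write wins: a repeated key is ignored (assumed meaning of 'drop')."""
    result: Dict[str, Any] = {}
    for key, value in pairs:
        result.setdefault(key, value)
    return result


writes = [("user.id", "a"), ("user.id", "b")]
print(overwrite_existing(writes))     # {'user.id': 'b'}
print(drop_later_duplicates(writes))  # {'user.id': 'a'}
```

Either way the result has one value per key; the strategies differ only in which value survives.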
I guess this is where we disagree. First, there are no fundamental problems for the backends (at least tracing backends) in supporting multiple attributes with the same key (Jaeger has always allowed that), and I see it as a matter of you-get-back-what-you-put-in, i.e. the users are in control of uniqueness. Second, I think there's definitely room for strict maps, but only in situations with atomic writes to the API. E.g. adding an event to a span is atomic: you can't amend the event afterwards, and that gives a clear choke point to enforce uniqueness. The Logging API is similar (if we had one), and the metrics API is probably the same. However, the Span API allows multiple writes, so as long as the user can call Concrete example: Canopy at Meta supports
@yurishkuro I think you can have your interpretation inside a streaming SDK without changing the spec from Tigran's position. OTel added an array-of-strings attribute so that you could have a multi-valued attribute. If the application intends to report a multi-valued attribute, it should build the values in memory and set them as an attribute. If you want to update the attribute to a new set of values through the span's lifetime, that also works, but the set of attributes remains a map and your attribute has multiple string values. The application is required to keep its working set in memory if it wants to change the list of values for an attribute, since it has to update them all in a single call. The Metrics data model defines more than one way to model data, and there is an analogy to draw with tracing. Metrics defines an event model, wherein each call to a synchronous instrument is an independently observable event. A streaming metric SDK could report each event as an individual (hey let's use column compression for this!) and if you're so inclined you could see every change to a counter this way. The data model defines a "stream" model which eliminates the individual events and aggregates them together. You can no longer see the individual updates. For a streaming trace SDK, I would recommend you consider the
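As a rough illustration of the multi-valued approach described in this comment: the application keeps the working list in memory and replaces the whole attribute in one call. `SketchSpan` is a hypothetical stand-in written for this example, not the OpenTelemetry Span API, and the key `peer.hosts` is made up.

```python
from typing import Dict, List, Union

AttributeValue = Union[str, bool, int, float, List[str]]


class SketchSpan:
    """Hypothetical span that stores attributes as a map with unique keys."""

    def __init__(self) -> None:
        self.attributes: Dict[str, AttributeValue] = {}

    def set_attribute(self, key: str, value: AttributeValue) -> None:
        # Overwrite if the key already exists, so keys stay unique.
        self.attributes[key] = value


span = SketchSpan()
hosts: List[str] = []  # working set kept by the application, not by the span

for host in ("db-1", "db-2"):
    hosts.append(host)
    # Replace the whole list each time; pass a copy so later appends
    # do not silently mutate the value already stored on the span.
    span.set_attribute("peer.hosts", list(hosts))

print(span.attributes)
```

The attribute collection stays a map with a single `peer.hosts` key whose value is the full list.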
I think this is a key observation. I intentionally avoid specifying who enforces these semantics and how, because there are many different places and ways to achieve it. As long as we explicitly state that it is a requirement, exactly how it is achieved can and should be left undefined, to give implementations enough leeway to do the best they can. I will follow up with a PR for OTLP in the proto repo, specifically requiring that senders must not have duplicates in the attributes list. Other implementations using other protocols (e.g. a streaming SDK with an exporter that uses an event-based protocol) can move the uniqueness enforcement requirement to the receiver/backend.
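A sender-side check along these lines could look like the sketch below. It models an attributes list as plain `(key, value)` tuples; the helper name `duplicate_keys` is hypothetical, and nothing here uses the real OTLP protobuf types.

```python
from collections import Counter
from typing import Any, List, Tuple


def duplicate_keys(attributes: List[Tuple[str, Any]]) -> List[str]:
    """Return the keys that occur more than once in an attributes list."""
    counts = Counter(key for key, _ in attributes)
    return sorted(key for key, n in counts.items() if n > 1)


outgoing = [("service.name", "checkout"), ("env", "prod"), ("env", "staging")]
dupes = duplicate_keys(outgoing)
if dupes:
    # A sender would resolve these (e.g. overwrite or drop) before exporting.
    print("duplicate attribute keys:", dupes)
```

How a sender resolves the reported duplicates is left open here, in line with the comment above.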
I mostly agree with what @jmacd is saying. What I see as problematic is that this is clearly a very nuanced topic, as evidenced by this discussion, but the spec change makes a blunt statement that pretends that these nuances do not exist. If I were reading this spec for the first time, I would still have the exact same questions I raised here, because the wording does not answer any of them.
This is great feedback. Let me do another round and see if I can add more clarity and capture these nuances.
@yurishkuro @jmacd I added some wording to address your comments. Please have another look and see if it helps.
CI failure doesn't seem to be related to my change in any way.
Reworded, should be OK now. Please review again.
@jmacd PTAL.
Tried to merge but it won't allow me, as there's one outdated unresolved conversation - which cannot be found now, hence cannot be resolved. Wondering if anybody has experienced this feature before?
Attribute keys must be unique. The key/value pair collections in the specification were always intended to model a map. There was some recent confusion about this. This change clarifies the spec. Resolves open-telemetry#2245
Attribute keys must be unique. The key/value pair collections in the specification were always intended to model a map.
There was some recent confusion about this. This change clarifies the spec. This is an editorial change. It does not change any semantics.
Resolves #2245