Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge SpanContext into Span #1033

Closed
wants to merge 13 commits into from

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Sep 29, 2020

Fixes #999

Changes

This removes SpanContext from the API. Reasoning can be found in #999 and #1022. This PR is a more strict version of #1022 by removing SpanContext completely instead of making it optional - this was recommended to me as a better alternative at the maintainers meeting (which I heard about via @trask, thanks!).

Prototype PR in Java open-telemetry/opentelemetry-java#1712

@anuraaga anuraaga requested review from a team September 29, 2020 07:02
@anuraaga
Copy link
Contributor Author

A few points came up at the maintainers meeting which I'll comment on here proactively

  • Logs injection may benefit from being able to inject an object (SpanContext)

At least in Java, the only logs injection techniques we know of, logging context in log4j and MDC in logback / slf4j, only accept Strings. While other techniques like custom message rendering, or other libraries, may support objects, we expect these to continue to be String as the lowest common denominator.

  • We may want to group into a SpanContext for future use cases, perhaps exemplars

We don't currently have a concrete idea here - removing SpanContext we find simplifies the API significantly and should be given a good thinking-over. If a future use case for bundling identifiers into one comes up, we can implement it later, generally efficiently by e.g., exposing an interface with the identifiers that is still implemented by the same Span object in Java. Something that comes to mind is our "cross-cutting concerns" are defined on the Context - Context only accepts Span so we may be able to expect Span will be a good interface in general for this sort of integration too.

@anuraaga anuraaga force-pushed the remove-spancontext branch 2 times, most recently from 1ad915a to 8315375 Compare September 29, 2020 07:12
Copy link
Contributor

@iNikem iNikem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I still support the idea of demoting/removing SpanContext as a separate entity/object, reading this PR made me wonder... Maybe we still need SpanContext as an aggregate of span identification. Just to be a convenient handle

specification/overview.md Outdated Show resolved Hide resolved
@@ -371,8 +367,8 @@ When a new `Span` is created from a `Context`, the `Context` may contain a `Span
representing the currently active instance, and will be used as parent.
If there is no `Span` in the `Context`, the newly created `Span` will be a root span.

A `SpanContext` cannot be set as active in a `Context` directly, but through the use
of a [Propagated Span](#propagated-span-creation) wrapping it.
A remote span's [span identifiers](#span-identifiers) can be set as active in a `Context`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels especially clumsy. How can span identifiers be set anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is some clumsiness inherited from the previous wording too, one confusion I had is that since we make spans active, it's a bit mysterious to also have spancontexts that are active.

What do you think of just "A remote span can be set as active in a Context"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think we can remove this sentence now without replacement.

Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com>
Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually like that proposal, I am not sure how practical it is though yet. And a class like SpanContext may be useful at least in the implementation.

non-zero TraceID and a non-zero SpanID, MUST be provided.

### IsRemote

An API called `IsRemote`, that returns a boolean value, which is `true` if the SpanContext was
An API called `IsRemote`, that returns a boolean value, which is `true` if the Span was
propagated from a remote parent, MUST be provided.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A span cannot be propagated from a remote parent unless it's a "Propagated span". We should clarify this. E.g. "A remote span is never recording", otherwise we re-open the discussion at #359.

@@ -385,7 +381,7 @@ description](../overview.md#links-between-spans).

A `Link` is defined by the following properties:

- (Required) `SpanContext` of the `Span` to link to.
- (Required) the `Span` to link to or its [span identifiers](#span-identifiers).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the meaning. SpanContext also contains TraceState (which might be considered part of the identifier for proprietary systems) and TraceFlags (which are definitely not part of the identifier).

@@ -166,14 +165,12 @@ The `Tracer` MAY provide functions to:

These functions MUST delegate to the `Tracing Context Utilities`.

## SpanContext
## Span Identifiers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't call the TraceId part of the identity.

These parts of the span serve to identify it, both in-process and remotely. They are immutable
pieces of the `Span`, as opposed to the data collecting [span operations](#span-operations).
These identifiers conform to the [W3C TraceContext specification](https://www.w3.org/TR/trace-context/).
It contains two identifiers - a `TraceId` and a `SpanId` - along with a set of common
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing -- it says that identifiers contains identifiers (and more stuff). So now we have two different definitions for "identifiers".

The OpenTelemetry `SpanContext` representation conforms to the [W3C TraceContext
specification](https://www.w3.org/TR/trace-context/). It contains two
identifiers - a `TraceId` and a `SpanId` - along with a set of common
These parts of the span serve to identify it, both in-process and remotely. They are immutable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They do more than just identify it. This information is what is required to connect a parent span to a child span. For just identifying trace+span ID are usually enough (though some vendors might require a part of the tracestate).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took advantage of this line in the previous spec referring to SpanContext as an identifier

"An immutable SpanContext that uniquely identifies the Span"

But I agree the name isn't so great, just couldn't come up with a better one. It may be a reason to keep SpanContext as a concept indeed though ideally not implementation requirement as in #999.

I'll hold off on significant edits to this one until getting some more ideas on which to go with.

specification/trace/api.md Show resolved Hide resolved
specification/trace/sdk.md Show resolved Hide resolved
Anuraag Agrawal added 2 commits September 29, 2020 18:18
@bogdandrutu
Copy link
Member

@anuraaga

Logs injection may benefit from being able to inject an object (SpanContext)
At least in Java, the only logs injection techniques we know of, logging context in log4j and MDC in logback / slf4j, only accept Strings. While other techniques like custom message rendering, or other libraries, may support objects, we expect these to continue to be String as the lowest common denominator.

Log4j 2 has this capability already https://logging.apache.org/log4j/2.x/log4j-api/apidocs/org/apache/logging/log4j/util/StringMap.html

We may want to group into a SpanContext for future use cases, perhaps exemplars
We don't currently have a concrete idea here - removing SpanContext we find simplifies the API significantly and should be given a good thinking-over. If a future use case for bundling identifiers into one comes up, we can implement it later, generally efficiently by e.g., exposing an interface with the identifiers that is still implemented by the same Span object in Java. Something that comes to mind is our "cross-cutting concerns" are defined on the Context - Context only accepts Span so we may be able to expect Span will be a good interface in general for this sort of integration too.

You can check open-census views + exemplar for that

@anuraaga
Copy link
Contributor Author

@bogdandrutu

Log4j 2 has this capability already https://logging.apache.org/log4j/2.x/log4j-api/apidocs/org/apache/logging/log4j/util/StringMap.html

Thanks, we only use a version of the contextdataprovider that returns a Map<String, String> but didn't notice StringMap was more generalized. Our current thinking though is that injecting strings is already efficient when using strings as for the IDs anyways, and for the way we inject right now using the same context IDs for all the frameworks, we benefit from injecting split rather than one object.

As a general point, the key one is in #1033 (comment) where we may find a need to bring SpanContext back later, but that should be possible in a compatible way that still allows the general UX of not having to worry about it when interacting with a Span to hold. But since we don't really use SpanContext right now, as we now use Context almost everywhere it used to be, we can expect users to not need it either. I'd like to propose we remove it for now, and only bring it back later if we find a nice use for it - presumably that time we wouldn't overload the term Context.

There are definitely naming issues, like whether "Span Identifiers" makes sense for this that I need help with 😅 If there's general interest in removing the extra type, hope to tweak the naming / explanations.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +1 on the overall idea. I think we can remove this concept for the moment.

@anuraaga
Copy link
Contributor Author

anuraaga commented Oct 2, 2020

I have removed the concept of "span identifiers" completely, it seems ok to reorder and inline definition of the traceid, tracestate, etc. @Oberon00 What do you think?

specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Show resolved Hide resolved
context for the newly created `Span`. This means that a `SpanContext` that has been provided by a configured `Propagator`
will be propagated through to any child span, but that no new `SpanContext`s will be created.
* No valid `SpanContext` is specified as the parent of the new `Span`: The API MUST create an non-valid
* A valid `Span` is specified as the parent of the new `Span`: The API MUST treat this parent context as the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related #1023

specification/trace/api.md Show resolved Hide resolved
Anuraag Agrawal and others added 3 commits October 2, 2020 18:55
Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
@Oberon00 Oberon00 changed the title Remove SpanContext in favor of having identifiers directly on the Spa… Merge SpanContext into Span Oct 2, 2020
@carlosalberto
Copy link
Contributor

Prototype PR in Python: open-telemetry/opentelemetry-python#1191

Bringing (some of) the comments I left there:

  • This prototype required changes that have been only very recently merged to the Specification (such as Define Propagation-only Span to simplify active Span logic in Context. #994). Although we have prototypes for the aforementioned changes, we actually won't know how good it all these recent changes will play along with this specific change, and there's the general possibility that this may become a can of warms. Note: Java already had this implemented, which simplified things there.
  • The change itself is neither a big issue nor a big improvement for Python overall. It won't break much existing user's code, but it doesn't bring anything useful to the table IMHO. Note: I got some informal feedback from the Python maintainers, and they have similar feelings about this.

In general, because of the fact that this doesn't really improve the Python API, and because of feedback I heard regarding the fact we have been rushed things lately, my opinion is a NO regarding going forward with this (and thus keep the original decision of having it as "After-GA"), given a) the apparent lack of interest/need of other people picking up these in other languages and b) the fact we considered this in the past and decided not to go with it at the time.

@anuraaga
Copy link
Contributor Author

anuraaga commented Oct 3, 2020

In general, because of the fact that this doesn't really improve the Python API

I think the simplification coming from the reduction of concepts that aren't really providing functionality is being understated. It seems like for a user, having less concepts to understand allows the API to be a more user-friendly one.

The feedback I've gotten on the Java side has been quite positive. If a spec change can cause different reactions per language, maybe it's related to the idioms of the languages. In that case, it seems overspec'd and #1022 could be a middle-ground to allow languages to decide the API they want to present.

the fact we considered this in the past and decided not to go with it at the time.

Things have changed hugely - SpanContext is almost unused because we use Context for all parenting. This has left us with SpanContext in Span in Context - the terminology here is not intuitive, and results in code along the lines of ContextUtils.getSpan(Context.current()).getContext() - context is being overloaded in a non-productive way, isn't it? Given these large changes, I think it's worth a fresh look.

and thus keep the original decision of having it as "After-GA")

We can add APIs after GA but can't remove them, so there isn't any "After-GA" decision possible here. One of the main reasons for discussing this now is because we can add SpanContext, or something similar, back later if we find it pulls a lot of weight but can't remove it later.

But I do understand the concern about it being too close to release - so I thought #1022 with optionally merging the APIS could be more practical, but there seemed to be more interest in this version that isn't optional.

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think removing this unnecessary concept is a really nice win for new users, simplifying their mental model, and eliminating confusion between the similarly named "SpanContext" and "Context".

As @anuraaga mentions, we can't remove things after GA, so we either do it now or never.

@carlosalberto
Copy link
Contributor

I think the simplification coming from the reduction of concepts that aren't really providing functionality is being understated. It seems like for a user, having less concepts to understand allows the API to be a more user-friendly one.

However, SpanContext is core part of not only of the OpenTelemetry, OpenCensus and OpenTracing projects, but also in Brave/Zipkin (under the TraceContext name). So this is not a new class/concept.

This has left us with SpanContext in Span in Context - the terminology here is not intuitive

This point has been applied to many items in the past. Rushing a new change when we are about to go GA is not something I'd like to have. As I said, there's no interest, and it's only helpful, so far, in the Java world. This is a mistake we commit a few times back in Open Tracing (rushing features because it helped a specific case/language), and I really don't think this is worth the risk.

We can add APIs after GA but can't remove them, so there isn't any "After-GA" decision possible here.

As I said, I think the notion of SpanContext is a well understood one.

@anuraaga
Copy link
Contributor Author

anuraaga commented Oct 4, 2020

So this is not a new class/concept.

Only OpenCensus had both Context and SpanContext. And only OTel has gone as far as it has to remove usage of SpanContext. I would say that here SpanContext has been superceded by Context.

As I said, there's no interest, and it's only helpful

The similar PR optionally removing it has an approval from a TC member and there have been lots of folks suggesting this seems like a good idea - that doesn't seem like at least some interest.

This is a mistake we commit a few times back in Open Tracing (rushing features because it helped a specific case/language),

I understand the worry about the timing, hence the simpler #1022 option. But if there is a language that is being unable to present a simplified API to users because of the spec, then it seems like a case of over spec. We're not adding a feature here, we're reducing the scope of the spec, something that can only happen now.

@jmacd Do you have any suggestion on how to proceed?

@anuraaga
Copy link
Contributor Author

anuraaga commented Oct 5, 2020

@yurishkuro Let me go ahead and copy in your comment here where the brunt of the discussion is happening.

I would disagree with this, mainly because of the Extract() function: if Span is an active, writeable object, then Extract() fundamentally cannot return it, since it belongs to the caller process. The recent addition of PropagatedOnlySpan is quite inelegant, because it relies on runtime behavior of the span to ignore write methods, rather than on preventing them at the API level. Pretending that a Span is what's being propagated over the wire might confuse users even more (I actually met some users at conferences and chat rooms thinking that distributed tracing works by passing all the captured data with the requests).

If I had to choose between keeping only one of Span and SpanContext, I would remove the Span.

I feel as if we've already jumped the fence with regards to "Pretending that a Span is what's being propagated over the wire" - all of our propagators are provided a span inside a context. The span's only readable field is the SpanContext, but I guess logically there's no real difference between this and the span's only readable fields being its trace ID, etc.

I definitely agree with your choice. @jmacd has also talked about this. But if even removing SpanContext is causing a challenge right now, then removing Span is definitely too huge a step. So Span stays until a potential future major version upgrade in the future - in the meantime, I do feel like the simplification of removing SpanContext to reflect where we've come (removed SpanContext usage for parenting) is worth it.

@jmacd
Copy link
Contributor

jmacd commented Oct 5, 2020

This was discussed at the TC meeting today.

The consensus was that #969 has opened up the specification enough to address the technical problems faced by Java auto-instrumentation. The feeling is that #1022 and #1033 are both minor rearrangements that offer conceptual improvements at the cost of taking away a very familiar concept. We observed that it could be just the name "SpanContext" that now leads us to confusion, and that perhaps if we could simply rename "SpanContext" to "SpanIdentifiers" that this problem would be solved.

I would disagree with this, mainly because of the Extract() function: if Span is an active, writeable object, then Extract() fundamentally cannot return it, since it belongs to the caller process. The recent addition of PropagatedOnlySpan is quite inelegant, because it relies on runtime behavior of the span to ignore write methods, rather than on preventing them at the API level. Pretending that a Span is what's being propagated over the wire might confuse users even more (I actually met some users at conferences and chat rooms thinking that distributed tracing works by passing all the captured data with the requests).

I think I take issue with "active, writable object". That is not how I see a Span. I see Span as an "interface object" with methods that convey semantic information, they do not write to an object. I would not "rely on runtime behavior of the span to ignore write methods", because methods on a propagated span convey semantic information. I believe we are trying to say that the default SDK will treat methods on propagated spans as no-ops, but an alternative SDK with a different data model could decide to do something with these events.

@Oberon00
Copy link
Member

Oberon00 commented Oct 5, 2020

the default SDK will treat methods on propagated spans as no-ops, but an alternative SDK with a different data model could decide to do something with these events.

The propagated span is not intended as customization point for SDKs. It is meant to be fully implemented in the API only:

This functionality MUST be fully implemented in the API, and SHOULD NOT be overridable.

-- https://github.com/open-telemetry/opentelemetry-specification/blob/f081ad568379c4c56f9cacfff70d88dd9b23a725/specification/trace/api.md#propagated-span-creation

@jmacd
Copy link
Contributor

jmacd commented Oct 5, 2020

@Oberon00 Thank you for correcting me.

Well, these propagation spans, if they could be given an implementation other than no-op, would address the request in #1007. To actually support writing such spans would require something new in the data model, but as an API-level question I'd be in favor of defining these propagation spans as meaningful continuations of their parent span where the default implementation is a no-op (because the standard data model does not have this support).

@anuraaga
Copy link
Contributor Author

anuraaga commented Oct 6, 2020

@jmacd Thanks for the update - is the recommendation to consider just renaming SpanContext or drop the PR altogether?

@Oberon00
Copy link
Member

Oberon00 commented Oct 6, 2020

I'd be in favor of defining these propagation spans as meaningful continuations of their parent span

I don't think this is a good idea. That would add a dependency of instrumentation code on implementation-defined behavior of the API implementation. E.g. I have a "normal" web server instrumentation now, but if I want a B3 webserver instrumentation I have to change the instrumentation code to operate on the extracted span instead of creating a child span? That seems like suboptimal API design. Also, what would happen if there is no parent to be extracted, so you would get an empty propagated span? I would much prefer to have this handled transparently, e.g. something in StartSpan (the sampler, the ID generator) should decide to not generate a new span ID for the child span. See #1007 (comment)

@Oberon00
Copy link
Member

Oberon00 commented Oct 6, 2020

perhaps if we could simply rename "SpanContext" to "SpanIdentifiers"

I think the name SpanReference may be better. As there will be more information contained than required for identifying the span (traceflags and potentially parts of tracestate), but it is the minimum required information to reference the span as parent, or in a link.

@jmacd
Copy link
Contributor

jmacd commented Oct 6, 2020

I believe @bogdandrutu volunteered to write this rename proposal in more detail. I support SpanReference as a name.

@anuraaga
Copy link
Contributor Author

Closing this since no interest. Thanks for the reviews!

@anuraaga anuraaga closed this Oct 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make SpanContext optional
7 participants