Messaging: per-message tracing when sending batches #1187

Closed
lmolkova opened this issue Aug 4, 2022 · 30 comments

Comments

@lmolkova
Contributor

lmolkova commented Aug 4, 2022

In Messaging Instrumentation WG, we're looking for the proper way to trace multiple messages sent within a single batch.
We do not have a concept for this in tracing spec and want to hear opinions on the options we came up with.

E.g. a user sends a batch of messages like producer.send([msg1, msg2]), then this batch is reshuffled on the broker and then each message is sent to consumer(s) as a part of another batch.

In this case, users should still be able to trace individual messages through the system. To achieve it, we need a unique context per message that's propagated from producer to consumer.

Options:

  1. Span per message: send span has links to each message span.
    • Pros: fits into a current mental model
    • Cons:
      • span duration is artificial: it's either 0 (message creation) or tied to the send span duration. It can potentially measure when each message is sent (but many systems get ack on batch, not per message)
      • extra span collection with corresponding perf hit and storage costs increase
      • [EDIT]: Can't change this choice later in a non-breaking manner

image

  2. Spanless context. Only create a new span context per message. send span has links to each message span context.
    • Pros:
      • no extra costs for span creation. all necessary information will go through links
      • [EDIT]: can report a span (or an event with context semantics) in future versions of the spec if spanless context proves difficult to use. This won't be breaking.
    • Cons:
      • new mental model. There is a context with no span ever created
      • maybe more complicated indexing for links on the backends. I.e. there will be a link from the producer and a link from the consumer to the spanless context

image

More context: https://docs.google.com/document/d/1OrHsepd6GjzXKll1ggZyx1jBQd0d_t8NZXT1ZOem7D0/edit#heading=h.hfmrnf56kiuf
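
To make the two options concrete, here is a minimal Python sketch that uses plain dataclasses rather than any real tracing API. `SpanContext`, `Link`, `Span`, `new_context` and `send_batch` are hypothetical names introduced for illustration, the `message.id` link attribute is only an example, and placing the message contexts in the same trace as the send span is an assumption.

```python
import secrets
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str


@dataclass
class Link:
    context: SpanContext
    attributes: dict = field(default_factory=dict)


@dataclass
class Span:
    name: str
    context: SpanContext
    links: list = field(default_factory=list)


def new_context(trace_id=None) -> SpanContext:
    # Random IDs with W3C trace-context sizes (16-byte trace ID, 8-byte span ID).
    return SpanContext(trace_id or secrets.token_hex(16), secrets.token_hex(8))


def send_batch(messages, span_per_message: bool):
    """Return (exported_spans, carriers) for one batch send.

    Each carrier is a message plus the context that travels with it.
    """
    send = Span("send", new_context())
    exported = [send]
    carriers = []
    for msg in messages:
        # Both options give every message its own unique context.
        ctx = new_context(send.context.trace_id)
        if span_per_message:
            # Option 1: the context is backed by an exported span.
            exported.append(Span(f"create {msg}", ctx))
        # Option 2 exports nothing here; the context exists only on the
        # message and as a link target.
        send.links.append(Link(ctx, {"message.id": msg}))
        carriers.append({"body": msg, "context": ctx})
    return exported, carriers
```

With a two-message batch, option 1 exports three spans and option 2 exports one, while the send span carries two links in both cases.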

@lmolkova
Contributor Author

lmolkova commented Aug 4, 2022

@Oberon00
Member

Oberon00 commented Aug 4, 2022

Have you thought about how this would be represented in the protocol? Would you then send a list of links along with the list of spans? The links could also be interpreted as "data-less spans" that have no timestamps (maybe a creation time?), and no attributes beside trace ID, span ID, parent span ID.

CC @discostu105: I think at Dynatrace we have been using a concept very similar to these spanless contexts ("links") but IIUC we are in the process of getting rid of them.

@lmolkova
Contributor Author

lmolkova commented Aug 4, 2022

Have you thought about how this would be represented in the protocol?

The same way as today (in both cases) - using span.links that already have all the properties we need (linked context and attributes). Or did I misunderstand your question?

@Oberon00
Member

Oberon00 commented Aug 8, 2022

So you need attributes? Then it's not a pure "context".

@joaopgrassi
Member

@lmolkova please correct me if I got it wrong, but I think the idea is that we have a "context" (no attributes/no real span) that's inside the message. Then on Send/Publish, we add a link to each message, where the link points to the "context" inside the message. The link then would hold the message-specific attributes, such as message.id or message.destination since we can't add those to the Send/Publish span.

@MSNev
Contributor

MSNev commented Aug 9, 2022

FYI from spec meeting this morning re an Events API support on Logs open-telemetry/opentelemetry-specification#2676

@Oberon00
Member

Oberon00 commented Aug 9, 2022

One idea that I wanted to bring up is to use zero-duration-spans after all and use a special span kind like "DOWNLINK". That way you could hide them / collapse them into the parent in an UI. Of course, this would need special support by the backend, but so would anything else we discuss here.

@lmolkova
Contributor Author

lmolkova commented Aug 9, 2022

@Oberon00 agree that if we follow the zero-duration span path, backends need some heuristic to tell that a span was created for a message. I believe we can do it with the PRODUCER kind (and the publish span, in this case, is just CLIENT).

Still, such a span should have 0-duration, no status, no attributes, no links or events, and it raises the question of whether it should be a span or something else.

With a pure context, we keep the door open to adjust to real-world feedback. We can add an event or a span later in messaging v1.X in a non-breaking manner.

Picking event with context semantics or span will be a final decision.

@Oberon00
Member

Oberon00 commented Aug 9, 2022

I thought attributes were actually needed on these?

@lmolkova
Contributor Author

lmolkova commented Aug 9, 2022

Attributes should be on links to message context, not on spans.

Reasons:

  • messages can be forwarded between brokers preserving the context (i.e. span is created on the first service, but not on the next hop)
  • users can put context on messages and we don't want to require users to set attributes and follow messaging semantics, we want to offload it to auto-instrumentations as much as we can.

@Oberon00
Member

Oberon00 commented Aug 9, 2022

Still, such a span should have 0-duration, no status, no attributes, no links or events

Attributes should be on links to message context, not on spans.

I don't understand. If we used zero-duration spans, of course it would make sense to put the attributes on the spans, and have the publish span as parent, and not using zero-duration spans and links at the sender side at the same time.

Of course the message that is sent to the broker would only contain a pure context, completely independent of how this is implemented in OTel.

@lmolkova
Contributor Author

lmolkova commented Aug 9, 2022

I don't understand. If we used zero-duration spans, of course it would make sense to put the attributes on the spans, and have the publish span as parent, and not using zero-duration spans and links at the sender side at the same time.

Imagine I create a message on service A and publish it to Kafka topic. Service B receives it and forwards it to service C via another Kafka cluster/topic. It's quite a common scenario and there are many tools that do it.
You can only create message span on the service A, but where would you put message-specific attributes on service B? They have changed - it's a new topic and cluster

Or imagine I'm a user and keep source context in which blob/DB record was created in record metadata. I want to use this context as my message context and stamp it on the message manually. Auto-instrumentation that publishes this message cannot create a span and override message context. Where would you put attributes if there is no span? Asking user to create this span is not a great experience.

The answer to both cases - put them on links.
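
A rough sketch of the forwarding case, with plain dataclasses standing in for a tracing API: both hops link to the same untouched message context, and each hop records its own destination on the link. The `publish` helper, the service and topic names, and the `message.destination` attribute key are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str


@dataclass
class Link:
    context: SpanContext
    attributes: dict


@dataclass
class Span:
    name: str
    service: str
    links: list = field(default_factory=list)


def publish(service: str, topic: str, message: dict) -> Span:
    """Create the publish span for one hop; the message context is only read."""
    span = Span(f"publish {topic}", service)
    # Per-hop details live on the link, so every hop can describe the same
    # message differently without replacing the context it carries.
    span.links.append(Link(message["context"], {"message.destination": topic}))
    return span


msg = {"body": "hello", "context": SpanContext("t1", "s1")}
hop_a = publish("service-a", "orders", msg)
hop_b = publish("service-b", "orders-replica", msg)  # forwarded as-is
```

Here `hop_a` and `hop_b` link to the same context but carry different link attributes, which is the property the two scenarios above rely on.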

@Oberon00
Member

Oberon00 commented Aug 10, 2022

You can only create message span on the service A, but where would you put message-specific attributes on service B? They have changed - it's a new topic and cluster

I might have completely misunderstood the design proposed here. I have a hunch what you might mean now, but I'm still not sure. So let me ask this: Why can't service B create a messaging span? Since service B is not an intermediary, but a service, it ought to create a span with the incoming message as parent (or with a link to it) and modify the span context on the message with the span context of the create/publish span of the new message publication. Am I wrong here? If service B cannot modify the context on the message, it will be impossible to tell from the trace structure if anything you link to the context on the message happened in service A or service B, and in which causal/happens-before relationship.

@lmolkova
Contributor Author

lmolkova commented Aug 10, 2022

Why can't service B create a messaging span?

Service B can create a span, but then it has to modify message context as well. Now let's assume ServiceB is a broker or, in a more popular case, an extra app layer that does geo-replication. While it could create a processing span, then create a new span for the message and modify the context, it'll be inefficient and verbose for the case of simple forwarding/routing/sharding.

Moreover, assuming ServiceB is a broker, its telemetry could belong to the cloud provider it's managed by. Creating such spans would break causality.

So the rule of thumb we came up with: if messaging library/system got a message with context (forwarded from somewhere else or set by user) - it must not create a span for message (or a new context). This context should be de-facto immutable.

Now causality without message span is achieved through links.
If message context is created:

  • by publish call instrumentation: 1) we have a link to it 2) we know it's actually a sibling of a publish span
  • by the user: it's entirely in the user's hands whether to create a message span or get the context from somewhere else, and to take care of causality

In either case, we still have publish span on every hop that is linked to this context. You can follow along and see message received on ServiceB and republished there via links.

@Oberon00
Member

Moreover, assuming ServiceB is a broker, its telemetry could belong to the cloud provider it's managed by. Creating such spans would break causality.

This multi-tenant/multi-vendor problem can & should be solved with per-tenant/per-vendor tracestate entries. I think we should keep that discussion separate. open-telemetry/opentelemetry-specification#366 (comment)

So the rule of thumb we came up with: if messaging library/system got a message with context (forwarded from somewhere else or set by user) - it must not create a span for message (or a new context). This context should be de-facto immutable.

To clarify: Of course the library/system would create a (publish) span, but it should not inject that span's context into the message. Is that what you mean?

I think this is a general new propagation design that you propose here, and I don't see how this is specific to messages. You could apply the same strategy to HTTP requests, which may also pass multiple hops (e.g. consider AWS Lambda which you usually invoke via a service called API gateway proxy, or Google Cloud Functions, which are behind a load balancer that actually participates in the W3C trace, trashing your span IDs, see open-telemetry/opentelemetry-specification#1852 (comment))

@lmolkova
Contributor Author

lmolkova commented Aug 10, 2022

This multi-tenant/multi-vendor problem can & should be solved with per-tenant/per-vendor tracestate entries.

Sure, but let's make sure we keep the routing/replication/forwarding/sharding discussion going. Service meshes would be the first to hit the problem here.

Of course the library/system would create a (publish) span, but it should not inject that span's context into the message. Is that what you mean?

Correct, in batch send, publish span context cannot be put on messages - if it does, they would not be individually traceable.
I mean that message creation belongs to the application and the application can inject the context that it wants.

Auto-instrumentation should allow applications to associate a custom context with the message.
If we allow this, the next immediate conclusion would be that auto-instrumentations MUST NOT override this context, therefore MUST NOT create a message span when the context is present on the message already.
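
A minimal sketch of that rule, assuming messages are plain dicts and the context lives under a `context` key; `inject_message_context` is a hypothetical helper, not an API of any instrumentation library.

```python
import secrets


def inject_message_context(message: dict, trace_id: str) -> bool:
    """Attach a fresh context unless the message already carries one.

    Returns True when a new context was created, and False when an existing
    one (set by the application or an upstream hop) was left untouched.
    """
    if message.get("context") is not None:
        return False  # an existing context is treated as immutable
    message["context"] = {"trace_id": trace_id, "span_id": secrets.token_hex(8)}
    return True


fresh = {"body": "order created"}
stamped = {"body": "blob updated",
           "context": {"trace_id": "t-user", "span_id": "s-user"}}

created = inject_message_context(fresh, trace_id="t-publish")
skipped = inject_message_context(stamped, trace_id="t-publish")
```

In this sketch `fresh` gains a new context while `stamped` keeps the one the application put on it; in both cases the publish span would still link to whatever context the message ends up carrying.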

I don't see how this is specific to messages.

It's specific to messages since:

  • you can have batch send/receive
  • messages spend potentially significant time in queues
  • message processing represents application logic, while sending these messages through multiple hops is mostly irrelevant

The key difference here is that for HTTP, the request content is tightly coupled to the transport call and a new call requires a new message; for messaging, that's not the case.

Assuming everything would have a span, forwarding A->B->C scenario would look like this:

  • A: message span s1
  • A: message span s2
  • A: send batch (with links to s1, s2)
  • B: receive batch (with links to s1, s2)
  • B: message span child-of-s1
  • B: message span child-of-s2
  • B: send batch (with links to child-of-s1, child-of-s2)
  • C: receive batch (with links to child-of-s1, child-of-s2)
  • ...

(would you like that for every service mesh instrumentation?)

Without context modification:

  • A: message context s1
  • A: message context s2
  • A: send batch (with links to s1, s2)
  • B: receive batch (with links to s1, s2)
  • B: send batch (with links to s1, s2)
  • C: receive batch (with links to s1, s2)
  • ...

Both of these options carry the same information, but the first one is much more verbose. So what's the benefit?
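
The two listings above can be reproduced with a small sketch that only counts telemetry items; it does not model real spans. `forward` and its parameters are hypothetical names.

```python
def forward(hops, contexts, respan_each_hop):
    """List the telemetry items emitted while a batch moves through `hops`.

    With respan_each_hop=True every forwarding hop creates a child message
    span and swaps the context on the message; with False the original
    contexts travel unchanged.
    """
    items = []
    contexts = list(contexts)
    kind = "span" if respan_each_hop else "context"
    first = hops[0]
    items += [f"{first}: message {kind} {c}" for c in contexts]
    items.append(f"{first}: send batch (with links to {', '.join(contexts)})")
    for i, hop in enumerate(hops[1:], start=1):
        items.append(f"{hop}: receive batch (with links to {', '.join(contexts)})")
        if i == len(hops) - 1:
            break  # the last hop only receives
        if respan_each_hop:
            contexts = [f"child-of-{c}" for c in contexts]
            items += [f"{hop}: message span {c}" for c in contexts]
        items.append(f"{hop}: send batch (with links to {', '.join(contexts)})")
    return items


verbose = forward(["A", "B", "C"], ["s1", "s2"], respan_each_hop=True)
compact = forward(["A", "B", "C"], ["s1", "s2"], respan_each_hop=False)
```

For three hops and two messages, `verbose` has 8 items and `compact` has 6; every additional forwarding hop adds one more item per message in the first variant.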

@Oberon00
Member

Oberon00 commented Aug 12, 2022

message processing represents application logic, while sending these messages through multiple hops is mostly irrelevant

The same could be said about HTTP: The ultimate handler of the HTTP request contains application logic while any (reverse) proxies in-between are less relevant.

Both of these options carry the same information, but the first one is much more verbose. So what's the benefit?

They do not. In the first scenario, you have the relationship A -> B -> C, and in the second one you only have A -> B and A -> C, i.e. a direct connection from A to both B and C, and only an indirect and undirected connection between B and C over the common parent A. You have no idea whether B forwarded to C, C forwarded to B, or A sent to B and C simultaneously (though the latter would be the most direct interpretation of the trace structure). That's what I meant by the loss of causal/happens-before relationships.
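
One way to picture the difference, as a sketch under simplified assumptions: treat the parent chain of the context each hop links to as the only ordering hint that survives without timestamps. `depth` and `order_hints` are hypothetical helpers, and contexts are reduced to string labels.

```python
def depth(context, parents):
    """Number of recorded parent hops above `context`."""
    d = 0
    while context in parents:
        context = parents[context]
        d += 1
    return d


def order_hints(send_links, parents):
    """Depth of the context each hop's send span links to.

    Equal depths mean the structure alone gives no ordering between hops.
    """
    return {hop: depth(ctx, parents) for hop, ctx in send_links.items()}


# Scenario 1: B re-spans the message, so its context is a recorded child of s1.
with_respan = order_hints({"A": "s1", "B": "child-of-s1"}, {"child-of-s1": "s1"})
# Scenario 2: the context is immutable, so both hops link to the same s1.
without_respan = order_hints({"A": "s1", "B": "s1"}, {})
```

In the first scenario the depths differ (A at 0, B at 1), so the hop order can be read from the structure; in the second both are 0 and only timestamps could separate them.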

I want to bring up that we seem to discuss two mostly orthogonal topics in this issue:

  1. span-less contexts / spans with multiple contexts / downlinks
  2. A context propagation style where subsequent participants of a request handling chain become siblings of each other (and direct children of the initial client) instead of each participant becoming a child of the previous.

@lmolkova
Contributor Author

lmolkova commented Aug 12, 2022

The same could be said about HTTP: The ultimate handler of the HTTP request contains application logic while any (reverse) proxies in-between are less relevant.

Perfect observation. So brokers and forwarders are like HTTP proxies and load balancers. They probably don't emit any traces, and when they do, they probably should not change HTTP headers, otherwise traces become too verbose.

You have no idea whether B forwarded to C, C forwarded to B, or A sent to B and C simultaneously (though the latter would be the most direct interpretation of the trace structure)

It's a fair point. At the same time, the moment you introduce batching, you lose causality because links don't provide it.

  • A: message s1
  • A: message s2
  • A: message sN, ...
  • A: publish links to s1, s2
  • A: publish links to sN, ...
  • B: receive (new trace) links to s1, sN

Looking at this, the only way to tell that A called B is by timestamps.
The only way to achieve causality is to force users to create child spans (per message) on consumers. But auto-instrumentations can't guarantee it. And some scenarios (e.g. I aggregate data from batch) don't separate messages at all.

I want to bring up that we seem to discuss two mostly orthogonal topics in this issue:

  1. span-less contexts / spans with multiple contexts / downlink
  2. A context propagation style where subsequent participants of a request handling chain become siblings of each other (and direct children of the initial client) instead of each participant becoming a child of the previous.

Agreed, but they are related to some extent.

To your second point - there are no siblings - they are all independent traces related via links.
And please, don't discard the hard requirement: if a user provided a context in the message, auto-instrumentation or broker cannot override it. Please notice that it means infra pieces that carry this instance of message over to the consumer cannot create message spans.

@Oberon00
Member

OK, so you are saying, in your first scenario, B not only does not modify the context, it also does not emit any telemetry items at all? If that's the case, I misunderstood that.

To your second point - there are no siblings - they are all independent traces related via links.

But if there is a trace, there has to be a span. So now I'm a bit confused what is actually meant here.

@lmolkova
Contributor Author

lmolkova commented Aug 16, 2022

But if there is a trace, there has to be a span.

I don't think this is a precise statement.

There is a span, but it's a transport span that sends (a batch) to broker or receives a batch from broker. When we receive a batch, we can't always create span per message in auto-inst, it's app responsibility to create it if they want. We can only guarantee a receive span that links to each context in a batch.
Assuming you carry immutable messages over through multiple hops, there is no point in creating spans for each message, you just create links to them.

I.e. messages belong to application, application properties on the message are immutable for brokers and infra, they must not be modified. I.e. message trace context cannot be modified and no span must be created to re-trace this instance of message.
New spans are created to trace the transport of this message and they have links to the context on the messages.

@spanglerco

To share another use case related to this discussion, we have a service that produces Kafka messages in a transaction as a large batch to a single topic. But that batch could be thousands of messages, so adding a link per message to a single span is not feasible, as links are normally limited to 128 if I understand correctly. Similarly, the array tag approach would result in a very large tag value. I wonder if the conventions could also provide semantics for a single span representing a batch of produced messages at a cost of losing granularity in the trace. Or maybe that's already addressed somewhere and I missed it.
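
As a rough illustration of the scale problem, assuming a per-span link limit of 128 (the default link count limit in OpenTelemetry SDKs) and that links beyond the limit are dropped; `cap_links` is a hypothetical helper, not SDK code.

```python
def cap_links(message_contexts, limit=128):
    """Return (kept_links, dropped_count) for a single publish span."""
    kept = list(message_contexts[:limit])
    return kept, len(message_contexts) - len(kept)


batch = [f"ctx-{i}" for i in range(5000)]
kept, dropped = cap_links(batch)
```

With a 5000-message batch, only the first 128 contexts stay linked and the remaining 4872 messages lose their connection to the publish span.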

@pyohannes
Contributor

@lmolkova Do you see this resolved with the integration of #284?

We introduced a requirement for attributes and links and we went with option 1 from your initial proposal (one span per message, where possible).

@lmolkova
Contributor Author

@lmolkova Do you see this resolved with the integration of open-telemetry/semantic-conventions#284?

We introduced a requirement for attributes and links and we went with option 1 from your initial proposal (one span per message, where possible).

yes, closing this one.

PS: I still like spanless contexts more 🙃

@lmolkova lmolkova transferred this issue from open-telemetry/opentelemetry-specification Jun 27, 2024
@lmolkova
Contributor Author

Reopening based on the feedback from @tedsuo to discuss zero-duration spans. Will bring it up on messaging SIG 6/27

@lmolkova lmolkova reopened this Jun 27, 2024
@lmolkova
Contributor Author

lmolkova commented Jun 27, 2024

capturing some feedback points:

  • zero-duration spans smell - they probably should be events
  • creation of a message (or injection of a unique context) is not work to be reported as a span
  • link to context is valid (even if there is no span)
  • links that point to nothing should have a timestamp and a name

@lmolkova
Contributor Author

lmolkova commented Jun 27, 2024

Discussed at messaging SIG:

  • spanless context is still controversial: there is a worry that it'd break backends
  • we're losing more than timestamp and name - we're losing causality - the link does not record the parent of the context. e.g.
    • Incoming HTTP request had traceId1 spanId1
    • Message created in scope of it had traceId1, spanId2
      • if we recorded a span, we'd record that parent of message is spanId1
      • without the span we have no means to record that spanId2 is a child of spanId1
  • are there other similar cases where we need the new context but not a new span?
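
The causality point can be shown with a small data-model sketch: a span record has a parent field, while a link only names its target. `SpanRecord`, `LinkRecord` and `find_parent` are hypothetical, simplified shapes and not the OTLP definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanRecord:
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]


@dataclass(frozen=True)
class LinkRecord:
    trace_id: str
    span_id: str  # target only; a link has no parent field


def find_parent(span_id: str, exported_spans) -> Optional[str]:
    """Look up the parent of `span_id` among the spans that were exported."""
    for span in exported_spans:
        if span.span_id == span_id:
            return span.parent_span_id
    return None


http_span = SpanRecord("traceId1", "spanId1", None)
message_span = SpanRecord("traceId1", "spanId2", "spanId1")
link = LinkRecord("traceId1", "spanId2")

with_span = find_parent(link.span_id, [http_span, message_span])
spanless = find_parent(link.span_id, [http_span])
```

When the message span is exported, the link target resolves to parent `spanId1`; in the spanless case nothing was exported for `spanId2`, so the lookup returns `None`.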

We should look for more options:

  1. Inject parent context into messages.
    • Cons: can't distinguish messages that were created in the same context.
  2. Legalize (explain) zero-duration spans. Span is more than duration/status. It's new context, causality, name, timestamp, semantics.
  3. Evolve event/link combination.
    • Pros: event (with parent context) that has a secondary context (new message) describes a single point in time when the message was created.
    • Cons: this is effectively a new signal (links expressed as events)

Will bring it up on spec meeting.

@trask
Member

trask commented Jun 28, 2024

Evolve event/link combination.
Pros: event (with parent context) that has a secondary context (new message) describes a single point in time when the message was created.

I'm probably missing something here, how do you find the parent of the "new message" context?

@lmolkova
Contributor Author

lmolkova commented Jun 28, 2024

I'm probably missing something here, how do you find the parent of the "new message" context?

image

I think event is an unfortunate term - we don't want to build it on top of span events, but it's not a log (no payload).

This thing is a link detached from the span. It has

  • parent context
  • its own unique span id
  • name
  • timestamp
  • attributes

I.e. from the data structure it's a lightweight span without status, duration, links, or events.
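
A sketch of that shape as a dataclass, to make the field list concrete. `DetachedLink` is a made-up name for "this thing", and deriving the new context from the parent's trace ID is an assumption.

```python
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str


@dataclass
class DetachedLink:
    parent: SpanContext   # context in which the message was created
    span_id: str          # its own unique span id
    name: str
    timestamp_ns: int     # a single point in time, no duration
    attributes: dict = field(default_factory=dict)
    # Deliberately absent: status, end time, links, events.

    @property
    def context(self) -> SpanContext:
        # Assumption: the new context stays in the parent's trace.
        return SpanContext(self.parent.trace_id, self.span_id)


created = DetachedLink(
    parent=SpanContext("traceId1", "spanId1"),
    span_id="spanId2",
    name="create message",
    timestamp_ns=time.time_ns(),
)
```

`created.context` is what would be injected into the message, and `created.parent` preserves the causality that a bare link cannot record.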

@joaopgrassi
Member

@lmolkova given we have the conventions now mentioning the create context and IIRC "zero duration" spans are not a big deal, do we have anything left to do in this issue? It seems to me all is "resolved" now? Or am I missing something?

@lmolkova
Contributor Author

lmolkova commented Oct 1, 2024

yeah, I think we can close it - we have #1273 to track remaining work (making per-message tracing disableable). Thanks!

@lmolkova lmolkova closed this as completed Oct 1, 2024
Projects
Status: V1 - Stable Semantics