Update messaging "Receive", "Deliver", and "Create" operations according to OTEP 220 #284

Merged: 23 commits, Oct 31, 2023
225 changes: 134 additions & 91 deletions docs/messaging/messaging-spans.md
@@ -18,8 +18,11 @@
- [Conventions](#conventions)
* [Context propagation](#context-propagation)
* [Span name](#span-name)
* [Span kind](#span-kind)
* [Operation names](#operation-names)
* [Span kind](#span-kind)
* [Trace structure](#trace-structure)
+ [Producer spans](#producer-spans)
+ [Consumer spans](#consumer-spans)
- [Messaging attributes](#messaging-attributes)
* [Attribute namespaces](#attribute-namespaces)
* [Consumer attributes](#consumer-attributes)
@@ -28,7 +31,6 @@
- [Examples](#examples)
* [Topic with multiple consumers](#topic-with-multiple-consumers)
* [Batch receiving](#batch-receiving)
* [Batch processing](#batch-processing)
- [Semantic Conventions for specific messaging technologies](#semantic-conventions-for-specific-messaging-technologies)

<!-- tocstop -->
@@ -196,21 +198,78 @@ Examples:
* `AuthenticationRequest-Conversations process`
* `(anonymous) publish` (`(anonymous)` being a stable identifier for an unnamed destination)
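
The examples above suggest a span name built from a destination name followed by an operation name. A minimal sketch of that pattern, assuming a hypothetical helper `span_name` and assuming `(anonymous)` is the placeholder chosen for unnamed destinations (as in the last example):

```python
from typing import Optional

# Placeholder for unnamed destinations, taken from the example above.
ANONYMOUS = "(anonymous)"


def span_name(destination: Optional[str], operation: str) -> str:
    """Build a span name of the form '<destination> <operation>' (hypothetical helper)."""
    # Fall back to a stable placeholder when the destination has no name.
    return f"{destination or ANONYMOUS} {operation}"
```

For instance, `span_name(None, "publish")` yields `(anonymous) publish`.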

### Span kind

A producer of a message should set the span kind to `PRODUCER` unless it synchronously waits for a response: then it should use `CLIENT`.
The processor of the message should set the kind to `CONSUMER`, unless it always sends back a reply that is directed to the producer of the message
(as opposed to e.g., a queue on which the producer happens to listen): then it should use `SERVER`.

### Operation names

The following operations related to messages are defined for these semantic conventions:

| Operation name | Description |
| -------------- | ----------- |
| `publish` | A message is sent to a destination by a message producer/client. |
| `receive` | A message is received from a destination by a message consumer/server. |
| `process` | A message that was previously received from a destination is processed by a message consumer/server. |
| `publish` | One or more messages are provided for publishing to an intermediary. If a single message is published, the context of the "Publish" span can be used as the creation context and no "Create" span needs to be created. |
| `create` | A message is created. "Create" spans always refer to a single message and are used to provide a unique creation context for messages in batch publishing scenarios. |
| `receive` | One or more messages are requested by a consumer. This operation refers to pull-based scenarios, where consumers explicitly call methods of messaging SDKs to receive messages. |
| `deliver` | One or more messages are passed to a consumer. This operation refers to push-based scenarios, where consumers register callbacks which get called by messaging SDKs. |

### Span kind

[Span kinds](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#spankind)
SHOULD be set according to the following table, based on the operation a span describes.

| Operation name | Span kind|
|----------------|-------------|
| `publish` | `PRODUCER` if the context of the "Publish" span is used as creation context. |
| `create` | `PRODUCER` |
| `receive` | `CONSUMER` |
| `deliver` | `CONSUMER` |

For cases not covered by the table above, the span kind should be set according
to the [generic specification about span kinds](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#spankind),
e.g., it should be set to `CLIENT` for the "Publish" span if its context is not
used as creation context and if the "Publish" span models a synchronous call to
the intermediary.

Setting span kinds according to this table ensures that span links between
consumers and producers always exist between a PRODUCER span on the producer
side and a CONSUMER span on the consumer side. This allows analysis tools to
interpret linked traces without the need for additional semantic hints.
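
The span kind rules above can be sketched as a small lookup. This is only an illustration: the function `span_kind_for` and its parameters are hypothetical names, span kinds are represented as plain strings rather than any SDK type, and `None` stands for "not covered here, consult the generic span kind specification".

```python
from typing import Optional

# Operations whose span kind is fixed by the table above.
_FIXED_KINDS = {
    "create": "PRODUCER",
    "receive": "CONSUMER",
    "deliver": "CONSUMER",
}


def span_kind_for(
    operation: str,
    used_as_creation_context: bool = False,
    synchronous_call: bool = False,
) -> Optional[str]:
    """Return the span kind for a messaging operation (hypothetical helper)."""
    if operation in _FIXED_KINDS:
        return _FIXED_KINDS[operation]
    if operation == "publish":
        # "Publish" is PRODUCER only when its context is the creation context.
        if used_as_creation_context:
            return "PRODUCER"
        # Otherwise a synchronous call to the intermediary is modelled as CLIENT.
        if synchronous_call:
            return "CLIENT"
    # Not covered by the table: defer to the generic span kind specification.
    return None
```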

### Trace structure

#### Producer spans

"Publish" spans SHOULD be created for operations of providing messages for
sending or publishing to an intermediary. A single "Publish" span can account
for a single message, or for multiple messages (in the case of providing
messages in batches). "Create" spans MAY be created. A single "Create" span
SHOULD account only for a single message. "Create" spans SHOULD either be
children or links of the related "Publish" span.

If a user provides a custom creation context in a message, this context SHOULD
NOT be modified, a "Create" span SHOULD NOT be created, and the "Publish" span
SHOULD link to the custom creation context. Otherwise, if a "Create" span
exists for a message, its context SHOULD be injected into the message. If no
"Create" span exists and no custom creation context is injected into the
message, the context of the related "Publish" span SHOULD be injected into the
message.
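
The choice of which context ends up in a message can be sketched as follows. This is a simplified model under stated assumptions: contexts are plain strings, `Message` and `inject_creation_context` are hypothetical names, and a real instrumentation would use a propagator rather than a field on the message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    body: str
    # Stand-in for the propagated creation context (assumption: a plain string).
    creation_context: Optional[str] = None


def inject_creation_context(
    message: Message,
    publish_context: str,
    create_context: Optional[str] = None,
) -> Optional[str]:
    """Apply the injection rules; return the context the "Publish" span should link to, if any."""
    if message.creation_context is not None:
        # Custom creation context supplied by the user: leave it untouched
        # and let the "Publish" span link to it.
        return message.creation_context
    if create_context is not None:
        # A "Create" span exists for this message: inject its context.
        message.creation_context = create_context
    else:
        # No "Create" span and no custom context: inject the "Publish" context.
        message.creation_context = publish_context
    return None
```

In this sketch the caller is expected not to start a "Create" span for a message that already carries a custom context; the function simply ignores `create_context` in that case.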

#### Consumer spans

"Deliver" spans SHOULD be created for operations of passing messages to the
application when those operations are not initiated by the application code
(push-based scenarios). A "Deliver" span covers the duration of such an
operation, which is usually a callback or handler.

"Receive" spans SHOULD be created for operations of passing messages to the
application when those operations are initiated by the application code
(pull-based scenarios).

"Deliver" or "Receive" spans MUST NOT be created for messages that are
pre-fetched or cached by messaging libraries or SDKs until they are forwarded
to the caller.

A single "Deliver" or "Receive" span can account for a single message, for a
batch of messages, or for no message at all (if it is signalled that no
messages were received). For each message it accounts for, the "Deliver" or
"Receive" span SHOULD link to the message's creation context.

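As an illustration of the consumer rules above, the following sketch picks the operation name from who initiated the call and adds one link per message. The types `ReceivedMessage` and `ConsumerSpan` and the function `consumer_span` are hypothetical, and creation contexts are modelled as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReceivedMessage:
    message_id: str
    # Creation context extracted from the message, if one was propagated.
    creation_context: Optional[str] = None


@dataclass
class ConsumerSpan:
    operation: str
    links: List[str] = field(default_factory=list)


def consumer_span(
    messages: List[ReceivedMessage], initiated_by_application: bool
) -> ConsumerSpan:
    """Model a "Receive" (pull) or "Deliver" (push) span for zero or more messages."""
    operation = "receive" if initiated_by_application else "deliver"
    # One link per message that carries a creation context.
    links = [m.creation_context for m in messages if m.creation_context is not None]
    return ConsumerSpan(operation=operation, links=links)
```

An empty `messages` list models the case where it is signalled that no messages were received: the span still exists but has no links.
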
## Messaging attributes

@@ -335,19 +394,12 @@ under the namespace `messaging.destination_publish.*`
the broker doesn't have such notion, the original destination name SHOULD uniquely identify the broker.
<!-- endsemconv -->

The *receive* span is used to track the time used for receiving the message(s), whereas the *process* span(s) track the time for processing the message(s).
Note that one or multiple Spans with `messaging.operation` = `process` may often be the children of a Span with `messaging.operation` = `receive`.
The distinction between receiving and processing of messages is not always of particular interest or sometimes hidden away in a framework (see the [Message consumption](#message-consumption) section above) and therefore the attribute can be left out.
For batch receiving and processing (see the [Batch receiving](#batch-receiving) and [Batch processing](#batch-processing) examples below) in particular, the attribute SHOULD be set.
Even though in that case one might think that the processing span's kind should be `INTERNAL`, that kind MUST NOT be used.
Instead span kind should be set to either `CONSUMER` or `SERVER` according to the rules defined above.

### Per-message attributes

All messaging operations (`publish`, `receive`, `process`, or others not covered by this specification) can describe either a single message or a batch of messages.
Attributes in the `messaging.message` or `messaging.{system}.message` namespace describe individual messages. For single-message operations they SHOULD be set on the corresponding span.

For batch operations, per-message attributes are usually different and cannot be set on the corresponding span. In such cases the attributes MAY be set on links. See [Batch Receiving](#batch-receiving) and [Batch Processing](#batch-processing) for more information on correlation using links.
For batch operations, per-message attributes are usually different and cannot be set on the corresponding span. In such cases the attributes SHOULD be set on links. See [Batch receiving](#batch-receiving) for more information on correlation using links.

Some messaging systems (e.g., Kafka, Azure EventGrid) allow publishing a single batch of messages to different topics. In such cases, the attributes in `messaging.destination` MAY be
set on links. Instrumentations MAY set destination attributes on the span if all messages in the batch share the same destination.
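
A sketch of how attributes might be split between a batch span and its links, under the rules above. The function name `batch_attributes` and the input shape (dictionaries with `id` and `destination` keys) are assumptions made for illustration; the attribute keys are the ones used in this document.

```python
from typing import Any, Dict, List, Tuple


def batch_attributes(
    messages: List[Dict[str, str]],
) -> Tuple[Dict[str, Any], List[Dict[str, str]]]:
    """Return (span attributes, per-link attributes) for a batch operation."""
    span_attrs: Dict[str, Any] = {"messaging.batch.message_count": len(messages)}
    destinations = {m["destination"] for m in messages}
    shared = len(destinations) == 1
    if shared:
        # All messages share one destination, so it may be set on the span.
        span_attrs["messaging.destination.name"] = next(iter(destinations))
    link_attrs: List[Dict[str, str]] = []
    for m in messages:
        # Per-message attributes differ within a batch, so they go on the links.
        attrs = {"messaging.message.id": m["id"]}
        if not shared:
            # Mixed destinations: record each destination on its own link.
            attrs["messaging.destination.name"] = m["destination"]
        link_attrs.append(attrs)
    return span_attrs, link_attrs
```

For a single-message operation the split is unnecessary, since per-message attributes are set directly on the span.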
@@ -365,92 +417,83 @@ All attributes that are specific for a messaging system SHOULD be populated in `

### Topic with multiple consumers

Given is a process P, that publishes a message to a topic T on messaging system MS, and two processes CA and CB, which both receive the message and process it.

```
Process P: | Span Prod1 |
--
Process CA: | Span CA1 |
--
Process CB: | Span CB1 |
Given is a publisher that publishes a message to a topic exchange "T" on RabbitMQ, and two consumers which both get the message delivered.

```mermaid
flowchart LR;
subgraph PRODUCER
direction TB
P[Span Publish A]
end
subgraph CONSUMER1
direction TB
R1[Span Deliver A 1]
end
subgraph CONSUMER2
direction TB
R2[Span Deliver A 2]
end
P-. link .-R1;
P-. link .-R2;

classDef normal fill:green
class P,R1,R2 normal
linkStyle 0,1 color:green,stroke:green
```

| Field or Attribute | Span Prod1 | Span CA1 | Span CB1 |
| Field or Attribute | Span Publish A | Span Deliver A 1| Span Deliver A 2 |
|-|-|-|-|
| Span name | `"T publish"` | `"T process"` | `"T process"` |
| Parent | | Span Prod1 | Span Prod1 |
| Links | | | |
| Span name | `T publish` | `T deliver` | `T deliver` |
| Parent | | | |
| Links | | `T publish` | `T publish` |
| SpanKind | `PRODUCER` | `CONSUMER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` |
| `server.address` | `"ms"` | `"ms"` | `"ms"` |
| `server.port` | `1234` | `1234` | `1234` |
| `messaging.system` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` |
| `messaging.destination.name` | `"T"` | `"T"` | `"T"` |
| `messaging.operation` | | `"process"` | `"process"` |
| `messaging.message.id` | `"a1"` | `"a1"`| `"a1"` |
| `messaging.operation` | `"publish"` | `"deliver"` | `"deliver"` |
| `messaging.message.id` | `"a"` | `"a"`| `"a"` |

### Batch receiving

Given is a process P, that publishes two messages to a queue Q on messaging system MS, and a process C, which receives both of them in one batch (Span Recv1) and processes each message separately (Spans Proc1 and Proc2).

Since a span can only have one parent and the propagated trace and span IDs are not known when the receiving span is started, the receiving span will have no parent and the processing spans are correlated with the producing spans using links.

Given is a publisher that publishes two messages to a topic "Q" on Kafka, and a consumer which receives both messages in one batch.

```mermaid
flowchart LR;
subgraph PRODUCER
direction TB
PA[Span Publish A]
PB[Span Publish B]
end
subgraph CONSUMER1
direction TB
D1[Span Receive A B]
end
PA-. link .-D1;
PB-. link .-D1;

classDef normal fill:green
class PA,PB,D1 normal
linkStyle 0,1 color:green,stroke:green
```
Process P: | Span Prod1 | Span Prod2 |
--
Process C: | Span Recv1 |
| Span Proc1 |
| Span Proc2 |
```

| Field or Attribute | Span Prod1 | Span Prod2 | Span Recv1 | Span Proc1 | Span Proc2 |
|-|-|-|-|-|-|
| Span name | `"Q publish"` | `"Q publish"` | `"Q receive"` | `"Q process"` | `"Q process"` |
| Parent | | | | Span Recv1 | Span Recv1 |
| Links | | | | Span Prod1 | Span Prod2 |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
| `server.address` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
| `server.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
| `messaging.system` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` |
| `messaging.destination.name` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| `messaging.operation` | | | `"receive"` | `"process"` | `"process"` |
| `messaging.message.id` | `"a1"` | `"a2"` | | `"a1"` | `"a2"` |
| `messaging.batch.message_count` | | | 2 | | |

### Batch processing

Given is a process P, that publishes two messages to a queue Q on messaging system MS, and a process C, which receives them separately in two different operations (Span Recv1 and Recv2) and processes both messages in one batch (Span Proc1).

Since each span can only have one parent, Proc1 should not choose a random parent out of Recv1 and Recv2, but rather rely on the implicitly selected parent as defined by the [tracing API spec](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.26.0/specification/trace/api.md).
Depending on the implementation, the producing spans might still be available in the metadata of the messages and should be added to Proc1 as links.
The client library or application could also add the receiver span's SpanContext to the data structure it returns for each message. In this case, Proc1 could also add links to the receiver spans Recv1 and Recv2.

The status of the batch processing span is selected by the application, depending on the semantics of the operation. A span status `Ok` could, for example, be set only if all messages were properly processed, or if at least one was.

```
Process P: | Span Prod1 | Span Prod2 |
--
Process C: | Span Recv1 | Span Recv2 |
| Span Proc1 |
```

| Field or Attribute | Span Prod1 | Span Prod2 | Span Recv1 | Span Recv2 | Span Proc1 |
|-|-|-|-|-|-|
| Span name | `"Q publish"` | `"Q publish"` | `"Q receive"` | `"Q receive"` | `"Q process"` |
| Parent | | | Span Prod1 | Span Prod2 | |
| Links | | | | | [Span Prod1, Span Prod2 ] |
| Link attributes | | | | | Span Prod1: `messaging.message.id`: `"a1"` |
| | | | | | Span Prod2: `messaging.message.id`: `"a2"` |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
| `server.address` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
| `server.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
| `messaging.system` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` |
| `messaging.destination.name` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| `messaging.operation` | | | `"receive"` | `"receive"` | `"process"` |
| `messaging.message.id` | `"a1"` | `"a2"` | `"a1"` | `"a2"` | |
| `messaging.batch.message_count` | | | 1 | 1 | 2 |
| Field or Attribute | Span Publish A | Span Publish B | Span Receive A B |
|-|-|-|-|
| Span name | `Q publish` | `Q publish` | `Q receive` |
| Parent | | | |
| Links | | | Span Publish A, Span Publish B |
| Link attributes | | | Span Publish A: `messaging.message.id`: `"a1"` |
| | | | Span Publish B: `messaging.message.id`: `"a2"` |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` |
| `server.address` | `"ms"` | `"ms"` | `"ms"` |
| `server.port` | `1234` | `1234` | `1234` |
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` |
| `messaging.destination.name` | `"Q"` | `"Q"` | `"Q"` |
| `messaging.operation` | `"publish"` | `"publish"` | `"receive"` |
| `messaging.message.id` | `"a1"` | `"a2"` | |
| `messaging.batch.message_count` | | | 2 |

## Semantic Conventions for specific messaging technologies
