Messaging consumer/client (group) ID could be made more generic #2015

Closed
Oberon00 opened this issue Oct 13, 2021 · 7 comments · Fixed by #3336
Labels: area:semantic-conventions, semconv:messaging, spec:trace

Comments

@Oberon00
Member

Oberon00 commented Oct 13, 2021

Currently we have these semantic span attributes to identify message senders/consumers:

It seems like these could all be merged into a generic messaging.client_id and messaging.client_group.

See also #1904 (comment) and #1810 (comment).

Oberon00 added the area:semantic-conventions, spec:trace, and semconv:messaging labels Oct 13, 2021
@kenfinnigan
Member

This would be a good candidate to include in the messaging WG discussions. @pyohannes?

@pyohannes
Contributor

For easier reference, here's an updated list of attributes pertaining to this discussion:

  • messaging.consumer.id
  • messaging.kafka.consumer.group
  • messaging.kafka.client_id
  • messaging.rocketmq.client_id
  • messaging.rocketmq.client_group

A consumer group usually defines a logical "view" of a topic or similar kind, whereas a client id uniquely identifies a running instance. I think it makes sense to use separate terms (client/consumer) here.

As far as I understand, a client_group for RocketMQ is similar to a consumer group and should probably be renamed to be consistent. I'm not a RocketMQ expert though. That said, attributes in the global messaging namespace are supposed to be applicable to all or most messaging systems, and the concept of a consumer (or client) group is not, as it only applies to checkpoint-based messaging systems (or, in other words, to "topics" but not to "queues"). Therefore, I think those attributes should be kept in system-specific namespaces.

Client id, on the other hand, could be applicable to any messaging system, so I think it would make sense to move it to the generic messaging namespace.

The consumer.id mixes both concepts of consumer groups and client ids and can have different semantics depending on the messaging system (for Kafka it could just be a consumer group id, for RabbitMQ it can be a client id). I wonder if we shouldn't remove the attribute in order to achieve a clear separation and clean semantics. No information would be lost in any case, as the client id and the consumer group that make up the consumer.id attribute are present in other separate attributes.

To summarize, I'd suggest replacing the above-mentioned attributes with the following:

  • messaging.client_id
  • messaging.kafka.consumer.group
  • messaging.rocketmq.consumer.group
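To make the proposed replacement concrete, here is a minimal sketch of how existing attribute sets could be rewritten under it. The helper `migrate_attributes` and the two lookup tables are hypothetical names introduced only for illustration; the mapping itself follows the lists in this comment, not necessarily what was eventually merged.

```python
# Hypothetical migration helper; the mapping mirrors the proposal in this comment.
RENAMED = {
    "messaging.kafka.client_id": "messaging.client_id",
    "messaging.rocketmq.client_id": "messaging.client_id",
    "messaging.rocketmq.client_group": "messaging.rocketmq.consumer.group",
}

# Dropped entirely: its parts are already present in other attributes.
REMOVED = {"messaging.consumer.id"}


def migrate_attributes(attrs: dict) -> dict:
    """Return a copy of attrs using the proposed attribute names."""
    migrated = {}
    for key, value in attrs.items():
        if key in REMOVED:
            continue
        # Keys without a rename entry are copied through unchanged.
        migrated[RENAMED.get(key, key)] = value
    return migrated


old = {
    "messaging.consumer.id": "orders - worker-1",
    "messaging.kafka.client_id": "worker-1",
    "messaging.kafka.consumer.group": "orders",
}
new = migrate_attributes(old)
```

In this sketch `messaging.kafka.consumer.group` passes through untouched, which matches the suggestion to keep consumer groups in system-specific namespaces.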

@SergeyKanzhelev SergeyKanzhelev removed their assignment Feb 18, 2023
@pyohannes pyohannes self-assigned this Mar 16, 2023
@pyohannes
Contributor

@kenfinnigan As you introduced the messaging.consumer.id attribute, could you have a look at the proposal in the previous comment and let us know if that would make sense for you?

@kenfinnigan
Member

The one concern I have is identifying Kafka messages generically by messaging.client_id when there is no client id for the Kafka consumer, as the field would be empty. This was one of the reasons why for Kafka the messaging.consumer.id is either a combination of consumer group and client id, or only consumer group.

@pyohannes
Contributor

> The one concern I have is identifying Kafka messages generically by messaging.client_id when there is no client id for the Kafka consumer, as the field would be empty.

Do you think that in this instance, it would be feasible to look at a tuple of attributes: messaging.client_id, and messaging.kafka.consumer.group?

The problem with the current definition of consumer.id is that it neither uniquely identifies a client nor has consistent semantics across messaging systems.
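As a rough illustration of the tuple idea, a backend could key Kafka consumers on both attributes and leave the client id empty when it is missing. The function name `kafka_consumer_key` is an assumption made for this sketch and is not part of any convention.

```python
from typing import Mapping, Optional, Tuple


def kafka_consumer_key(attrs: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Build a (client id, consumer group) key from span attributes.

    Either element is None when the corresponding attribute is absent.
    """
    return (
        attrs.get("messaging.client_id"),
        attrs.get("messaging.kafka.consumer.group"),
    )


with_client_id = kafka_consumer_key(
    {"messaging.client_id": "worker-1", "messaging.kafka.consumer.group": "orders"}
)
without_client_id = kafka_consumer_key({"messaging.kafka.consumer.group": "orders"})
```

Keeping the two values separate means a missing client id stays visible as a missing value instead of being silently replaced by the group name.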

@kenfinnigan
Member

We can consider a tuple of attributes for Kafka, but we then have the same problem in that messaging.client_id on its own is not sufficient to uniquely identify a client, as there are caveats for some systems, though granted these may only apply to Kafka.

@pyohannes
Contributor

> [...] but we then have the same problem in that messaging.client_id on its own is not sufficient to uniquely identify a client, as there are caveats for some systems, granted possibly only applying to Kafka

True, it doesn't uniquely identify a client, because it might be missing for some cases. However, the semantics are clearly defined across systems: if it is given, then it uniquely identifies a client.

As far as I understand, the current consumer.id can be misleading in cases where it falls back to the consumer group, as different clients using the same consumer group will have the same consumer.id.
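The fallback problem can be sketched in a few lines. The function `legacy_kafka_consumer_id` is a hypothetical stand-in for how an instrumentation might have derived consumer.id for Kafka, and the `" - "` separator is an assumption; only the "group plus client id, or group alone" rule comes from this thread.

```python
from typing import Optional


def legacy_kafka_consumer_id(
    consumer_group: str,
    client_id: Optional[str] = None,
    separator: str = " - ",  # assumed separator, for illustration only
) -> str:
    """Combine group and client id, or fall back to the group alone."""
    if client_id:
        return f"{consumer_group}{separator}{client_id}"
    return consumer_group


# Two distinct clients in the same group, neither with a client id configured.
first_client = legacy_kafka_consumer_id("orders")
second_client = legacy_kafka_consumer_id("orders")
collides = first_client == second_client

# With client ids present, the values are distinct again.
distinct = legacy_kafka_consumer_id("orders", "a") != legacy_kafka_consumer_id("orders", "b")
```

Here `collides` is true: both clients report the same consumer.id even though they are different running instances, which is the misleading case described above.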

If there are no strong blocking reasons from your side, I will submit a PR with the proposed changes above.

carlosalberto pushed a commit that referenced this issue Apr 24, 2023
…generic (#3336)

Fixes #2015

## Changes

Based on discussions in the messaging workgroup and in issue #2015, this
PR proposes to remove `messaging.consumer.id`, and to replace both
`messaging.kafka.client_id` and `messaging.rocketmq.client_id` with a
generic `messaging.client_id`.

`messaging.consumer.id` is defined to always be set to the `client_id`
of the used messaging system, except for Kafka, where it was defined to
be a combination of `messaging.kafka.client_id` and
`messaging.kafka.consumer.group`, or just the latter if
`messaging.kafka.client_id` is not available. With this definition, the
semantics of `consumer.id` are different between messaging systems, and
even different for different Kafka scenarios.

The proposed `messaging.client_id` has consistent semantics ("a unique
client id, when it is available"), and can be used instead of
`messaging.consumer.id` in almost all cases.

In addition to having consistent semantics, this also simplifies the
semantic conventions, as instead of

- `messaging.consumer.id`
- `messaging.kafka.client_id`
- `messaging.rocketmq.client_id`

there is now just:

`messaging.client_id`.
Projects
Status: V1 - Stable Semantics