Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Messaging.operation.type attribute should not be required #947

Closed
lmolkova opened this issue Apr 22, 2024 · 10 comments · Fixed by #1006
Closed

Messaging.operation.type attribute should not be required #947

lmolkova opened this issue Apr 22, 2024 · 10 comments · Fixed by #1006

Comments

@lmolkova
Copy link
Contributor

lmolkova commented Apr 22, 2024

Operation type identifies common messaging operations (which may be called differently in specific systems): e.g. publish can be called send, schedule, or enqueue.

In such cases we use messaging.operation.name to represent system-friendly name.

However there are other types of operations that don't match any common type. E.g, cancel scheduled message, delete, create topic, etc...

While those are not targeted for stability, we should make sure messaging.operation.type requirement level is adequate and future-proof.

Originally posted by @lmolkova in #942 (comment)

@lmolkova lmolkova self-assigned this Apr 22, 2024
@lmolkova
Copy link
Contributor Author

Options:

  1. Don't populate messaging.operation.type at all. We have span kind and operation name to distinguish between different operations. We only need it to distinguish operations within the same kind (both create and publish are producers and both receive and process are consumers), but is it really necessary in presence of the operation name?

  2. Make messaging.operation.type conditionally_required: when applicable

  3. Keep messaging.operation.type required, but then tell that if none of the existing values apply, it should match messaging.operation.name

@lmolkova
Copy link
Contributor Author

lmolkova commented Apr 22, 2024

My preference is on the Option 1 side - I want to reevaluate why we need operation type, especially if we're considering having
messaging.operation.duration metric (and messaging.processing.duration), see #937

Also, with messaging.operation.name used in metrics, adding a new attribute to classify common operations would not be breaking (it won't increase number of time series), so even if discover that operation type is necessary in some cases, we should still be able to add it post-stability.

@pyohannes
Copy link
Contributor

I would favor a solution where messaging.operation.type remains a required attribute.

Currently it's the only required attribute besides messaging.system, and thus important for making sense of messaging telemetry in a system-independent way. Otherwise, we'll end up with a situation where sometimes there's only operation.type, sometimes there's only operation.name, and sometimes there are both (depending on the system). This makes it harder to define workflows independent of the messaging system used.

I suggest:
4. Keep messaging.operation.type required, define a new value "undefined" to be used if none of the other values apply.
5. Keep messaging.operation.type required, define additional values in system specific conventions if none of the other values apply.

@lmolkova
Copy link
Contributor Author

Currently it's the only required attribute besides messaging.system, and thus important for making sense of messaging telemetry in a system-independent way

Can we revisit this and clarify why is it important? How exactly is it going to be used?

@pyohannes
Copy link
Contributor

Can we revisit this and clarify why is it important? How exactly is it going to be used?

It gives you a well-defined list of values which can be interpreted independently of the messaging system used. Regarding use-cases, for example it lets you ask questions like "give me spans for all settlement operations that caused an error" in a non-system specific way.

messaging.operation.type currently has five different values, it might end up having six if we add something like undefined, or seven if we add session and transaction as additional operation types. That's something one can grasp without too much cognitive overhead. Furthermore, it makes it clear what metric measures a given operation.

For messaging.operation.name there currently would be 22 different values, just for Azure and GCP Pub/Sub alone:

send, schedule, cancel_scheduled, create_transaction, commit_transaction, rollback_transaction, receive, peek, receive_deferred, renew_message_lock, abandon, complete, defer, dead_letter, accept_session, get_session_state, set_session_state, renew_session_lock, ack, nack, subscribe, modack

I'd be curious how others think about this, but for me that's a bit overwhelming without a categorization mechanism like the operation type provides it.

If folks think messaging.operation.type shouldn't be required (or be conditionally required only if messaging.operation.name isn't given), then I think we should remove it altogether and stick with one required attribute (messaging.operation.name), and systems can populate it how they want. It's in my opinion a better solution than having two attributes where each is conditionally required on the absence of the other.

@lmolkova
Copy link
Contributor Author

lmolkova commented Apr 24, 2024

Regarding use-cases, for example it lets you ask questions like "give me spans for all settlement operations that caused an error" in a non-system specific way.

Thanks, it makes total sense. So it's less of a span visualization problem and more a metrics (or custom queries) concern.

The concerns I still have:

  • do I need to get all settlement operations in a non-system specific way? E.g. the default dashboards would show throughput/P95 latency/error rate for operation broken down by operation name. If they break things down by operation type, then all of the non-classified operations would be collapsed
  • system-specific dashboards/queries should probably be fine without operation type

What I'm saying that the value operation.type provides does not seem to be essential, but most importantly, it can be added later at any point without breaking anything.

And I'm proposing to remove it altogether and make operation name required.

@pyohannes
Copy link
Contributor

And I'm proposing to remove it altogether and make operation name required.

That's fair, let's discuss it in the next workgroup meeting.

@joaopgrassi
Copy link
Member

I read everything and to me also makes sense that operation name is required. I like the idea of a high-level classification but I'm not sure if it really makes sense which probably is a good indication to leave it out for now.

@pyohannes
Copy link
Contributor

The PR for gen_ai.operation.name has another interesting approach that uses namespaced values for the operation name, thus combining a type with a name.

For messaging, this could result in values like e. g. settle.ack, settle.nack, or publish.submit.

@lmolkova
Copy link
Contributor Author

lmolkova commented May 2, 2024

Based on the SIG discussion, we're tentatively aligned on removing messaging.operation.type for the time being and can re-introduce it later.
But it's blocked on #937 since operation type is a part of metrics name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: V1 - Stable Semantics
Development

Successfully merging a pull request may close this issue.

4 participants