Resolve semantic inconsistencies for non traditional messaging #1027
Can you please clarify how this differs from the `peer.service` general span attribute?
My impression of `peer.service` is a service running on a specific peer, or host, of a network.
In an environment like Kubernetes a single "peer", or node, can have many different services exposed on it. Meaning that sending a message to a Kafka service named "my-kafka" means more than a `peer.service` of "mycluster.kube", or some permutation thereof.
It's possible I'm missing something, but that's my impression, which is why I felt a separate `message.service` was needed. Granted there might be situations where they are the same, but in Kubernetes that would be rare.
I would interpret `peer.service` to be the particular service this span is directed at, which also works if a given peer or host exposes multiple different services.
@anuraaga do you think we could clarify this further in the spec?
@arminru am I correct in saying your interpretation of `peer.service` is actually the name of the service?
From my perspective, all the other `peer.*` or `net.peer.*` information relates to specific host or network node information, which is why I see `peer.service` as referring to a service on a specific host, as opposed to a virtual service name that could be clustered across many hosts/nodes.
Yes. `net.*` is about lower-level networking information where `net.peer.*` is about the host on the other end. `peer.service` on the other hand is about the higher-level service. I don't think there's any other `peer.*` attribute.
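To make that layering concrete, here is a rough sketch of the attributes a client span might carry under this reading. The host name, IP address, port, and service name are made-up values for illustration, and `split_by_layer` is a hypothetical helper; only the attribute keys come from the general span conventions being discussed.

```python
# Hypothetical attributes on a client span that sends to a Kafka broker.
# All values below are invented for illustration.
span_attributes = {
    "net.peer.name": "broker-0.mycluster.kube",  # host on the other end
    "net.peer.ip": "10.0.0.12",
    "net.peer.port": 9092,
    "peer.service": "my-kafka",  # higher-level logical service
}


def split_by_layer(attributes):
    """Group attributes into host-level (net.peer.*) and service-level (peer.service)."""
    host = {k: v for k, v in attributes.items() if k.startswith("net.peer.")}
    service = {k: v for k, v in attributes.items() if k == "peer.service"}
    return host, service


host_level, service_level = split_by_layer(span_attributes)
```

Under this interpretation the `net.peer.*` values can change from broker to broker while `peer.service` stays the same for the whole cluster.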
I'm not sure about that. The peer.service attribute is something that is expected to be manually configured by the user, not automatically determined by the instrumentation.
See https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/span-general.md#general-remote-service-attributes
CC @anuraaga who introduced peer.service in #652
Still it sounds like this concept is not specific to messaging, so maybe we should have peer.detected_service if peer.service is otherwise fitting.
@iNikem did you mean `messaging.service` duplicates `peer.service` or Resource's `service.name`? Personally, I think it may overlap with the latter and not the former.
I don't see `peer.service` being a great name for what it is, but if that's the preferred approach I will remove `messaging.service`.
They actually belong together.
Given, for example, a client A sending a request to a server process B exposing a service called `X`:
- A creates a span for the request and sets `peer.service="X"`
- B, on the other end, sets `service.name="X"` on its resource, which will be added to a span it creates for tracking the processing of A's request
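The A/B example can be sketched roughly as follows. This uses plain Python stand-ins rather than any OpenTelemetry SDK; the `Resource` and `Span` classes, the span names, and A's own `service.name` value are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Stand-in for a process-wide resource (not the real SDK type)."""
    attributes: dict


@dataclass
class Span:
    """Stand-in for a span (not the real SDK type)."""
    name: str
    kind: str
    resource: Resource
    attributes: dict = field(default_factory=dict)


# Process B exposes service "X" and declares that once, on its resource.
resource_b = Resource({"service.name": "X"})
# Process A is a different service; its own name here is hypothetical.
resource_a = Resource({"service.name": "A"})

# A's outgoing request span names the remote service it is calling.
client_span = Span("request to X", "CLIENT", resource_a, {"peer.service": "X"})
# B's span for processing the request picks up service.name from its resource.
server_span = Span("handle request", "SERVER", resource_b)


def peer_matches(client, server):
    """True when the client's peer.service equals the server's service.name."""
    return client.attributes.get("peer.service") == server.resource.attributes.get("service.name")
```

In this sketch `peer_matches(client_span, server_span)` holds, which is the sense in which the two attributes belong together: one is the caller's view of the service, the other is the service's own declaration.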
Thanks @arminru, it hasn't been clear where Resource fits, that helps.
I will drop `message.service` and use `peer.service`.
Thanks! Please add a note in the Kafka section saying it is recommended to set this and how to determine the appropriate value to make sure instrumentation writers are aware of it.
@anuraaga please verify if you think it's fine that instrumentations set this automatically (while still allowing users to override it, although I'm not really sure how this will work in practice).
We have a tombstone filled in the instrumentation currently, is it useful to have here?
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/f23ad29187ecea4e742e1dd6a5baeb84020298d4/instrumentation/kafka-clients-0.11/src/main/java/io/opentelemetry/javaagent/instrumentation/kafkaclients/KafkaProducerTracer.java#L52
If not, that's OK and we can remove it from the instrumentation. But this seems like the right time to get our instrumentation synced up with the spec.
I might also consider separating out a subfolder for messaging implementations as per #968 - there are so many of them that I worry a single doc just gets unwieldy.
No problems with the delay, I appreciate everyone is super busy right now and context switching IS difficult.
I can see it being beneficial to identify if a particular trace is related to a tombstone record. I can add it to the Kafka section.
Can the splitting be done in a subsequent PR, or would you prefer to include it here as well? I can certainly follow up with such a change.
Splitting in a separate PR sounds great, thanks
Proc1's parent would be Rcv1 in this case as it's Proc1's direct predecessor and Prod1 would be added as a link.
Same for Prod2 being a child of Proc1.
I can understand the reasoning for Proc1's parent being Rcv1, but Prod2 is not a child of Proc1.
The reasoning is that in Kafka the producing of a message is disconnected in time from when, or how often, that message is processed.
In this situation, Prod2 may be processed quickly, or it could be days/weeks/months before it's processed. From a span perspective, I don't think it should have a parent of Proc1 due to that timing disconnect.
From your timing diagram and description above it looks like as part of processing of the first message, i.e., in Proc1, the second message would be produced, i.e., Prod2. Is this not the case? For me it looks like Prod2 would be a direct consequence of Proc1. Please note that parents don't have to necessarily enclose their children from a timing perspective but also allow for async relationships.
Yes it is performed as part of processing the first message. However, in reactive messaging and event-driven architectures there isn't always a "parent-child" connection, it's more like a correlation or link relationship. They may be "parent-child", but not always.
If it is added as a parent, and it hasn't had any further span, doesn't that leave the parent span/trace unclosed? Does that make sense for reporting?
Related #958 / CC @anuraaga wanted kinda the opposite there.
Thanks @Oberon00, I knew there was a better term but couldn't think of it. Reading the issue made me realize it was "follows-from".
So for reactive messaging and event-driven architectures, producing a new message while processing an existing message is more of a "follows-from" relationship than a parent-child one, which is why I had used Links.
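For comparison, the two modelings debated in this thread can be sketched side by side. This is a minimal illustration with a stand-in `Span` class, not the OpenTelemetry API; the trace IDs and the `child_of` / `follows_from` helpers are hypothetical, and treating Rcv1 as a root span is an assumption, since the timing diagram is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Stand-in for a span (not the real SDK type)."""
    name: str
    trace_id: str
    parent: Optional["Span"] = None
    links: list = field(default_factory=list)


def child_of(name, parent, links=()):
    """Create a span in the parent's trace, optionally linked to other spans."""
    return Span(name, parent.trace_id, parent, list(links))


def follows_from(name, trace_id, predecessor):
    """Create a root span in its own trace that only links to its predecessor."""
    return Span(name, trace_id, None, [predecessor])


prod1 = Span("Prod1", "trace-1")  # producing the first message
rcv1 = Span("Rcv1", "trace-2")    # receiving it (assumed to be a root span here)

# Reviewer's suggestion: Proc1 is a child of Rcv1, with Prod1 added as a link,
# and Prod2 is a child of Proc1.
proc1 = child_of("Proc1", rcv1, links=[prod1])
prod2_as_child = child_of("Prod2", proc1)

# Author's position: Prod2 is not a child of Proc1 but "follows from" it,
# so it starts its own trace and only links back.
prod2_as_link = follows_from("Prod2", "trace-3", proc1)
```

In the first modeling Prod2 shares a trace with Proc1; in the second it is correlated with Proc1 only through the link, which reflects the timing disconnect between producing and processing a message.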