Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate OTEP-0220 Span structure for messaging scenarios into the messaging conventions #198

Closed
joaopgrassi opened this issue Jul 19, 2023 · 5 comments · Fixed by #284
Closed
Assignees

Comments

@joaopgrassi
Copy link
Member

The OTEP 220 which the messaging working group was focused on for the past months/year has been merged. We need to now integrate the text from there into the messaging specification.

Question still to answer: Do we want to integrate it with a "big bang" (bring all text in a single PR) or split it in portions so we send multiple, smaller PRs?

cc @lmolkova @pyohannes

@lmolkova
Copy link
Contributor

Discussed at Messaging SIG on 7/20 - the suggestion is to break it down into multiple PRs.

Here's a proposal on how to split:

  1. Update receive operation:
    • update the definition
    • clarify links usage (SHOULD instead of MAY), describe temporary hack to postpone span creation
    • introduce link attributes
    • incorporate existing examples in the spec with ones in the OTEP, preserving link attributes
  2. Introduce deliver operation:
    • define it
    • change process to deliver when appropriate
    • callout difference between processing and delivery and clarify that processing is an application concern
    • remove formal process operation
    • incorporate existing examples in the spec with ones in the OTEP, preserving link attributes
  3. Introduce create operation
    • define it
    • describe batch-publishing
  4. Introduce settle operation - Messaging: how to trace settle operation #1162
  5. End-to-end examples and cleanup:
    • bring other examples from the OTEP
    • check that specific system, FaaS, and AWS Lambda examples are up to date

Each PR should add corresponding examples from the OTEP when possible.

@pyohannes
Copy link
Contributor

Thanks @lmolkova, that looks good to me.

Some questions we can maybe discuss in today's workgroup call:

  • clarify links usage (SHOULD instead of MAY), describe temporary hack to postpone span creation

I would hold off with describing the hack, although that would be user-friendly. Firstly, it doesn't look that good and might not put a good light on the reputation of the project. Secondly, there's a danger that people take it as an argument against adding support for adding links after span creation time (as there would be a documented alternative solution).

Each PR should add corresponding examples from the OTEP when possible.

I'd also like to discuss how we proceed with examples. I'd rather minimize the use of examples, maybe we can defer to the OTEP? Initially they were very important, as they the conventions itself didn't contain much information about span structure. However, with the changes we're about to make here, the span structure should be apparent from the conventions itself, and examples should be less important.

@joaopgrassi
Copy link
Member Author

I agree about the hack part but have some opposite view about the examples. I think they are helpful and complimentary with the text and I'd love to keep them also in the spec.

@pyohannes
Copy link
Contributor

I'd also like to discuss how we proceed with examples.

From the WG discussion:

  • Put examples in a separate file, apart from messaging tracing conventions.
  • Replace ASCII art with mermaid diagrams.
  • Add additional examples to cover more scenarios.

@andrejpk
Copy link

Looks great! I'm going to chew on this a bit more but a few quick thoughts:

  1. I like and understand the deliver vs create operation concept but on my first scan of the diagrams, deliver was confusing because it was in context of the receiving system. receive may be more obvious in context of the consumer (but that's probably an overloaded term so another choice might be better)
  2. When tuning traces on a recent project, it was extremely helpful to have parent map with causality and links for added context. There's a light suggestion to do so here but the examples should default to that.
  3. In diagram "A producer creates and publishes a single message, it is delivered as part of a batch of messages to a consumer", it looks like the traces all fan back through that deliver operation. I guess that's the best we can do if that whole batch is sent w one context but for an ops person trying to measure performance or track a problem, this fan-in is a major stumbling block. At the app level, it may be necessary to embed per-message tracing data to restore that 1:1 link over the batching operation. But if an instrumented library is handling that batching, then it would be a nice recommendation to parent Process m1 to Publish m1` so that doesn't fall on the app developer.
  4. I agree with @joaopgrassi -- the examples are very helpful. I suggest they are numbered if kept in the docs for easy linking and reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: V1 - Stable Semantics
5 participants