This repository has been archived by the owner on May 23, 2023. It is now read-only.

How to record causal relationships / sequencing between sibling spans #142

Open
yurishkuro opened this issue Dec 2, 2015 · 15 comments

Comments

@yurishkuro
Member

EDIT: decided below to re-focus this issue on the sequencing of sibling spans. Original title was "Provide example of reporting "middleware" hops"

Suppose we have an RPC call from Service A to Service B. In classic Zipkin, service A starts a "client" span, and service B joins that span as "server". This results in a single span in the storage demarcated by cs->sr->ss->cr annotations. The new opentracing API advocates using different spans for client and server, but that's beside the point.

The question is what happens if there is some middleware between A and B that can also enrich the trace (for example, haproxy, or Hyperbahn). There may also be more than one hop through the middleware until the request reaches service B. There are two ways to represent this in the span-based tracing model:

Nested spans

 client span (Service A)
+----------------------------------------------------------------------------+

    +---------------------------------------------------------------------+
     hop 1      
               +----------------------------------------------------------+
                hop 2
                          +-------------------+
                           server  (Service B)

Issue 1: when building a service dependency graph, this trace will produce a dependency A->MW->MW->B. If there are many dependencies like this, the diagram will look like everything depends on MW, and MW talks to everything, but the A->B dependency is lost.

Possible solution: mark the "hop" spans with a special attribute indicating middleware, and handle them specially when building dependency diagram.
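
A minimal sketch of that possible solution, assuming each span carries a boolean `middleware` marker. The `Span` class, the `middleware` field, and `dependency_edges` are hypothetical names for illustration only, not part of any existing API. Middleware hops are treated as pass-through, so the nested trace above yields the A->B edge:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    middleware: bool = False  # hypothetical marker attribute


def dependency_edges(spans):
    """Return (caller, callee) service pairs, skipping over middleware spans."""
    by_id = {s.span_id: s for s in spans}

    def nearest_real_ancestor(span):
        # Walk up the parent chain until a non-middleware span is found.
        parent = by_id.get(span.parent_id)
        while parent is not None and parent.middleware:
            parent = by_id.get(parent.parent_id)
        return parent

    edges = set()
    for span in spans:
        if span.middleware:
            continue
        caller = nearest_real_ancestor(span)
        if caller is not None and caller.service != span.service:
            edges.add((caller.service, span.service))
    return edges


# The nested trace from the diagram: A -> hop 1 -> hop 2 -> B
trace = [
    Span("1", None, "A"),
    Span("2", "1", "MW", middleware=True),
    Span("3", "2", "MW", middleware=True),
    Span("4", "3", "B"),
]
print(dependency_edges(trace))
```

Without the marker, the same walk would report A->MW, MW->MW and MW->B, which is the problem described in Issue 1.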

Issue 2: if the middleware is implemented as a proxy, it makes sense that a "hop" span does not complete until the server span is complete. However, if the middleware is implemented as a messaging system, the above trace does not make sense; it should look like the one below.

Stacked sibling spans

 client span (Service A)
+----------------------------------------------------------------------------+
    +------+   +------+   +-------------------+ +----------+ +------------+
     hop 1      hop 2      server                back-hop-2   back-hop-1

Issue: in order to display the trace as shown above, especially in light of clock skews, the UI needs to know that there is a strong happened-before relationship between spans. The current DCP API does not capture that relationship, and it's not clear if it can be captured via span annotations, since each stacked span knows nothing about its siblings. In contrast, the X-Trace API explicitly captured these relationships by means of the pushDown and pushNext operations.
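
To illustrate why the UI needs this information, here is a rough sketch of how a renderer could use an explicit happened-before link to lay out sibling spans even when host clocks disagree. The `Sibling` class, its `after` field, and `layout` are assumptions made up for this example; the sketch also assumes the links form no cycles:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sibling:
    name: str
    start: float       # start time as reported by the host, possibly skewed
    duration: float
    after: Optional[str] = None  # hypothetical: sibling that must finish first


def layout(siblings):
    """Return {name: display_start}, never starting before the predecessor ends."""
    by_name = {s.name: s for s in siblings}
    placed = {}

    def place(s):
        if s.name in placed:
            return placed[s.name]
        start = s.start
        if s.after is not None:
            pred = by_name[s.after]
            # Shift right if the reported start overlaps the predecessor.
            start = max(start, place(pred) + pred.duration)
        placed[s.name] = start
        return start

    for s in siblings:
        place(s)
    return placed


spans = [
    Sibling("hop 1", start=0.0, duration=2.0),
    Sibling("hop 2", start=1.5, duration=2.0, after="hop 1"),  # skewed clock
    Sibling("server", start=5.0, duration=3.0, after="hop 2"),
]
print(layout(spans))
```

Here "hop 2" reports a start that overlaps "hop 1", and the layout moves it to the end of "hop 1". Without the `after` links the UI has no basis for that correction.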

@yurishkuro
Member Author

@bensigelman @adriancole

@bhs
Contributor

bhs commented Dec 3, 2015

Idea A

How about a HappensAfterSpan and/or HappensBeforeSpan tag? The values would be span_ids or similar. To be sufficiently general, that would imply that span tags are 1:many rather than 1:1 OR that we would use span log payloads to express these relationships.

If we reference the X-Trace paper:

image

or, not as an image:

next.parentID ⇐ current.opID
next.opID ⇐ unique()
next.type ⇐ NEXT

Wherever the pushNext would have happened in the X-Trace world, we instead add a HappensBeforeSpan annotation to current.

Idea B

We could instead follow the X-Trace model more directly and create something like a ParentType span annotation that could be set to NEXT instead of the default, DOWN. Something like that.

Thoughts?
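
For comparison, a small sketch of how the two ideas might look as data. Spans are plain dictionaries here, and `new_span`, `mark_happens_before`, and the key names are placeholders invented for the example. Idea A needs a tag that can hold several values; Idea B changes the meaning of the parent link instead:

```python
import itertools
from collections import defaultdict

_ids = itertools.count(1)


def new_span(parent_id=None, parent_type="DOWN"):
    """Create a span record; parent_type defaults to DOWN as in Idea B."""
    return {
        "span_id": next(_ids),
        "parent_id": parent_id,
        "parent_type": parent_type,
        "tags": defaultdict(list),  # 1:many tags, as Idea A would require
    }


def mark_happens_before(current, later):
    """Idea A: record on `current` that it finishes before `later` starts."""
    current["tags"]["HappensBeforeSpan"].append(later["span_id"])


parent = new_span()
hop1 = new_span(parent["span_id"])
hop2 = new_span(parent["span_id"])
server = new_span(parent["span_id"])

# Idea A: siblings keep the same parent, ordering lives in tags.
mark_happens_before(hop1, hop2)
mark_happens_before(hop1, server)

# Idea B: the next span points at its predecessor with ParentType NEXT.
hop2_b = new_span(parent_id=hop1["span_id"], parent_type="NEXT")
```

Note that in Idea B the NEXT span no longer references the original parent directly, so a reader of the trace has to follow the chain back to find it.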

@codefromthecrypt

Meta-comment: on RFCs, let's use real things!

Ex. we learned from zipkin that people rarely understood anything. How about revamping your example to use things tons of people understand.

Ex. instead of client-span -> hop1 -> hop2 -> server

browser -> elastic load balancer -> ha proxy -> tomcat

you can then refer to these in your example. It will help, as you can establish common ground with people who are not used to Span jargon, yet :) Ex. X-Forwarded-For can help guide discussion.

One thing I learned in zipkin is that few know how social company RPC stuff, like finagle or autobahn, works, so using these as examples actually creates cognitive distance rather than shortening it. Favoring the "EC2 crowd" will lower the barrier to entry in discussions like this, from folks who already know zipkin etc. to a wider group of those who were left behind.

/me ends meta comment

for real comment, too distracted to think about a solution deeply right now, except this looks like a quite valid concern.

@yurishkuro
Member Author

@bensigelman
Was thinking a lot about it lately. X-Trace's pushDown/pushNext (I assume we're talking about this paper) doesn't make sense to me as it was presented, e.g. in Fig. 2 they show the top-left node doing both pushDown and pushNext, as if it's doing two different transmissions or encoding two different trace contexts in a single transmission, one per logical layer.

How about a HappensAfterSpan and/or HappensBeforeSpan tag? The values would be span_ids or similar.

I think something like this is doable, although the first span that should be tagged with HappensBeforeSpan cannot know the next span ID. But it does know that it's fully finished before the next span starts. So it can emit an equivalent of the pushNext annotation (or "finish-to-start" for people familiar with MS Project dependencies), and preserve its own span ID in the trace context so that the next sibling can emit HappensAfterSpan=sid, yet still register itself as a child of the original parent span.
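
A rough sketch of that flow, to make it concrete. `TraceContext`, `prev_sibling_id`, `start_sibling`, and `finish_and_propagate` are hypothetical names, not an existing or proposed API. The finished span's ID travels in the context, and the next sibling records it while keeping the original parent:

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_ids = itertools.count(1)


@dataclass
class TraceContext:
    parent_id: int
    prev_sibling_id: Optional[int] = None  # hypothetical extra context field


@dataclass
class Span:
    span_id: int
    parent_id: int
    tags: dict = field(default_factory=dict)


def start_sibling(ctx):
    """Start a span as a child of the original parent, linked to its predecessor."""
    span = Span(span_id=next(_ids), parent_id=ctx.parent_id)
    if ctx.prev_sibling_id is not None:
        span.tags["HappensAfterSpan"] = ctx.prev_sibling_id
    return span


def finish_and_propagate(span, ctx):
    """On finish, pass this span's ID along so the next sibling can reference it."""
    return TraceContext(parent_id=ctx.parent_id, prev_sibling_id=span.span_id)


ctx = TraceContext(parent_id=100)   # 100 stands in for the client span of Service A
hop1 = start_sibling(ctx)
ctx = finish_and_propagate(hop1, ctx)
hop2 = start_sibling(ctx)
ctx = finish_and_propagate(hop2, ctx)
server = start_sibling(ctx)
```

Each span only needs to know about the sibling that came before it, which sidesteps the problem that the first span cannot know the next span's ID.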

So the remaining question is how we want to capture this in the API.

@bhs
Contributor

bhs commented Dec 25, 2015

Re the X-Trace paper: my understanding was that the top-left operation would record just the one piece of metadata, and that the pushNext and pushDown therein would be considered their own "operations" with their own context to log. I.e., the number of contexts in a dapper/zipkin model is not 1:1 with the number of contexts in an equivalent X-Trace model.

In any case, the most important question is the one you end with: how best to represent this in the programmer-facing API? The safest thing (IMO) would be to start with a lower-level API and leave it at that until we have greater evidence around the particular data model. By "lower-level," I mean just some simple function calls that abstract away the particular names we choose for "Happens-Before" tag keys, etc.

While I consider this topic an important one in the long-term, I don't want it to stumble into a lot of complexity that distracts us from the more pressing matter of getting publishable APIs out in go+py+js+java (or whatever else we decide to prioritize early on). Thoughts?

@codefromthecrypt

Some background on zipkin.

This scenario is supported by the shared span model. Ex. in zipkin, multiple endpoints participate in the same span. This allows you to see the server and client on the same line. This also allows you to see any proxies in the same line.

Here's an example:

span [
{time 0, "cs", source},
{time 1, "firewall applied", proxy},
{time 2, "sr", destination},
...
]

A decision to squash proxies is highly subjective aka policy. A presentation layer could be taught to collapse proxies with the same span id via some policy? In zipkin, the "real" destination is annotated as a tag "sa". Using this, you could implement a policy to squash hops between the client and the server. Would something like this not work?
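
A sketch of what such a policy could look like on the example span above, assuming annotations are `(time, value, endpoint)` tuples and the "sa" value is passed in separately. The function name and the tuple layout are made up for illustration, and the sketch assumes the span contains a "cs" annotation:

```python
def squash_proxies(annotations, server_address):
    """Keep only annotations from the client and the real destination ("sa")."""
    # The endpoint that logged "cs" is taken to be the client.
    client = next(endpoint for _, value, endpoint in annotations if value == "cs")
    keep = {client, server_address}
    return [a for a in annotations if a[2] in keep]


span = [
    (0, "cs", "source"),
    (1, "firewall applied", "proxy"),
    (2, "sr", "destination"),
]
print(squash_proxies(span, server_address="destination"))
```

Since this runs in the presentation layer, the stored span is unchanged and a different policy could still show the proxy hop.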

On the happens-before question (relating to clock skew), seems a separate albeit related issue.

@codefromthecrypt

Also, assuming we aren't doing shared spans (which is ok by me), we could still make a type for proxies similar to this. That would also allow the presentation tier to choose to squash them without larger model changes. Thoughts? https://cloud.google.com/cloud-trace/api/reference/rest/v1/projects.traces#SpanKind

@bhs
Contributor

bhs commented Dec 30, 2015

@adriancole I agree completely that RPCs can and probably should be rendered as a single row in a conventional zipkin/dapper-style UI... yet from a data modeling perspective there is still a strong case to be made for multiple spans per RPC. And the SpanKind concept could work well.

PS: I don't think (?) the dapper paper addressed this, but in older versions of stubby (google's RPC subsystem) there were sometimes user-space queuing issues in high-throughput processes... as such, the trace UI showed server time, the full end-to-end client time, but also the queueing delay on both client and server sides which were sometimes significant in terms of the global critical path. I would suggest we model things like those enqueue/dequeue events via Logs in the opentracing data model.
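
As a small illustration of the Logs suggestion, assume a span's logs are `(timestamp, event)` pairs recorded on one host, with "enqueue" and "dequeue" as event names chosen for this example. The queueing delay can then be derived without any cross-host clock alignment:

```python
def queueing_delays(logs):
    """Pair each "enqueue" with the next "dequeue" and return the delays."""
    delays = []
    enqueued_at = None
    for timestamp, event in sorted(logs):
        if event == "enqueue":
            enqueued_at = timestamp
        elif event == "dequeue" and enqueued_at is not None:
            delays.append(timestamp - enqueued_at)
            enqueued_at = None
    return delays


# Timestamps in microseconds, all taken from the server's own clock.
server_span_logs = [
    (1000, "enqueue"),
    (1450, "dequeue"),
    (1460, "handler start"),
]
print(queueing_delays(server_span_logs))
```

Because both events come from the same process, the difference is meaningful even if that host's clock is skewed relative to the client.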

@yurishkuro
Member Author

@adriancole capturing the firewall hop as a Log in the shared span doesn't seem useful due to the clock skew. The only thing it tells us is "yep, we passed through the firewall", as the timestamp cannot be reasoned about without a lot of additional alignment logic. We (at Uber) decided to model proxy/router hops, such as haproxy, as nested spans (per initial post). It makes the Dependencies graph job a bit harder, but not impossible, since it just needs to know that the "haproxy" service is a middleware and treat it as pass-through for the purpose of service-to-service dependency derivation. I haven't got around to implementing it yet, but there will be a patch to zipkin-dependencies.

Agree on the SpanKind, I've already run into needing it when using the OpenTracing API.

I suggest we keep this issue open until we have a good proposal for the happens-before use case, it's the one I primarily had in mind. I think we have a general idea, just need to come up with a concrete proposal. I agree with @bensigelman that it's not a very pressing issue.

@codefromthecrypt

codefromthecrypt commented Dec 30, 2015 via email

@codefromthecrypt

This topic of happens-before is one that circles quite often, and usually Yuri makes comments.

If we are to re-purpose this issue to solve that, we'd be best using the context we've collected here:

Scroll to Frame granularity and Sequencing (aka Local Spans)
https://docs.google.com/document/d/1ixxEs9TvhiGjJObGbRSPhSna3zHdadoUTQIZ5JKgLzU/edit#heading=h.wkls421pevch

@codefromthecrypt

suppose another way to address this is to add a task list (https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments) to this issue with the dependencies before closing. Then, open top-level issues relating to that checklist.

Ex. we've at least sequencing, if not typing (SpanKind), right? Then, once
all the dependencies are solved, we can have a concrete answer to the hide
proxy thing (which I agree is useful), and also have transparency into what
we need to solve.

sg?

@yurishkuro changed the title from "Provide example of reporting "middleware" hops" to "How to record causal relationships / sequencing between sibling spans" on Dec 30, 2015
@SwarnimRaj

@yurishkuro Has this issue been fixed? I ask because I noticed that my middleware authentication and authorization spans seem to finish only at the end of a trace, with subsequent spans shown nested inside them even though they are sibling spans, not child spans.

@yurishkuro
Member Author

it has not been fixed. It should also be moved to the Specification repo.

@yurishkuro transferred this issue from opentracing/opentracing-go on Jun 11, 2019
@richard-fine

Joining this conversation from jaegertracing/jaeger-ui#390 - it looks like a solution is needed to express this kind of sequence/sibling relationship in order for visualizers like Jaeger-UI to reduce staircasing.

The discussion here is quite old - how much is still true, and what needs to happen next? @adriancole mentioned defining a to-do list with the dependencies, but what actually are those dependencies right now?
