Fix fill Datadog trace meta from Trace baggage #8

dnlserrano · 2018-09-28T11:48:44Z

I noticed it while using it to send some metadata to Datadog! 🙊

Am I doing something wrong? Thanks in advance!

sourcelevel-bot · 2018-09-28T11:49:18Z

test/support/api_server_test.exs

+
+  describe "ApiServer.format/3" do
+    test "it fills in Datadog metadata using the trace baggage", %{trace: trace} do
+      span = trace.spans |> Enum.at(0)


Use a function call when a pipeline is only one function long

sourcelevel-bot · 2018-09-28T11:49:19Z

Ebert has finished reviewing this Pull Request and has found:

1 possible new issue (including those that may have been commented here).

You can see more details about this review at https://ebertapp.io/github/spandex-project/spandex_datadog/pulls/8.

GregMefford

Good find, thanks! I actually included the baggage field recently so we could begin to think about how to support it as a part of the OpenTracing spec, but as you found, it doesn’t do anything yet. Have you tried using this “for real” in an application yet? I’m curious what the use-case looks like at the application level. Like how are you putting things into the baggage and propagating it from one service to another?

dnlserrano · 2018-09-28T14:15:14Z

Hey @GregMefford! I tried using it only in the context of a single app, i.e., reporting to Datadog from an app, but not fully tracing it across services (still waiting on some internal work on other services to make sense of the x-datadog-* headers 😅) It looks something like the following:

Note: Taken from the Datadog Trace page.

So the data actually ends up being tied to a specific span, not all of the spans (which is understandable given the non-mandatory way of creating and sending multiple traces with the same trace_id separately). I guess we could eventually propagate it across services using something like x-datadog-baggage, but it would be very brittle since other libs would have to use this headers as well...

I tested it in a real system, locally but using a real Datadog account! The data showed up there. 😄

I guess you can test it out if you've got some Datadog account, or I can too if you share it with me over Slack and I send an arbitrary key/value pair if you want to be sure (understandable).

The way I'm doing it is I'm grabbing the trace, changing the baggage, and updating the trace to use this new (updated) one (as per usual in the spandex core lib). I then finish the trace, and bam! Got my metadata in Datadog! ✨ 🐕

I think we should definitely add this in even if for now it only works for Spandex-emitted traces (i.e., traces sent from Elixir apps using spandex and spandex_datadog).

WDYT? ❤️

GregMefford · 2018-09-29T17:08:56Z

I definitely agree that we should support this feature, I just want to be sure that it's clear what it's for and the tradeoffs that come with it. My understanding is that it should be used pretty sparingly because you don't often need to pass a lot of metadata between services (just directly to the back-end trace collector) and it adds that much more overhead to the HTTP call.

Here are some docs about it: http://opentracing.io/documentation/pages/instrumentation/common-use-cases.html#using-baggage--distributed-context-propagation

zachdaniel · 2018-09-29T17:11:46Z

If you are only trying to support sending arbitrary data, I think that is supported already via the 'meta' key

GregMefford · 2018-09-29T17:16:26Z

Oh right, I meant to mention that but got side-tracked and forgot. I think you'd set something in the tags option on the span if you just want to include some metadata:

https://github.com/spandex-project/spandex/blob/master/lib/span.ex#L43

Tracer.span("some_span_name", tags: [key: "value"]) do
  # ...
end

dnlserrano · 2018-09-29T18:59:24Z

Hey guys! In hindsight, I agree with you both. I should probably have used the concept of Span tags here, indeed. 🙈

We should probably look into other (official) Datadog client libraries to understand how they’re propagating the baggage kv pairs and only then think about adding this feature end-to-end.

Shall I close this PR? 😅

GregMefford · 2018-09-29T20:30:06Z

No, I think it's worth leaving it open and getting it figured out properly, but perhaps not as urgent as you initially thought.

dnlserrano · 2018-10-06T17:23:31Z

Is there a standard way of propagating baggage across services in OpenTracing? I can't find information about the "correct way" of doing it in the OT specification.

I've seen it implemented in:

the Node Jaeger client
- comma-separated values sent in a single header (jaeger-baggage)
the Python Datadog client
- multiple headers (one for each key/value pair in baggage) with same header prefix (ot-baggage-)

I honestly prefer the way Jaeger does it, but since we're talking Datadog here, maybe we should just go ahead and follow their practice of using multiple headers, as they do in the Python lib.

zachdaniel · 2018-10-06T23:07:18Z

I don't know of any set down method, but I'd have to do some more research. Might be easier just to make it a configuration where they provide an mfa, and we can give them an mfa that will do either of those, and they could write their own if they wanted.

dnlserrano · 2018-10-06T23:38:43Z

Sounds good. I’m a strong advocate of sane defaults though.

My rationale is twofold:

Datadog has defined a way of doing it in one of their official clients (Python)
Distributed tracing across different services means we’ll have different programming languages talking the same open tracing language; we could contribute to consistency (following the principle of least astonishment)
- e.g., Python lib must be extracting baggage from HTTP headers in the inverse way they encode baggage in HTTP headers; having to prepare it to every possible client implementation would be very cumbersome (undoable, I’d even argue?)

But again, I might not be seeing the full picture!

zachdaniel · 2018-10-06T23:42:06Z

I agree completely. I would personally be fine with either, and we should definitely provide a default, but knowing that implementations differ we should make sure its easy to plug and play, and make that clear in the documentation. I think that, if we have to choose, we may want to choose the one that is more cumbersome, which would be the csv version (specifically because commas have to be escaped). But honestly I don't mind either way, as long as its pluggable and that is visible in the docs.

dnlserrano · 2018-10-06T23:59:15Z

Totally agree with allowing that sort of flexibility.

I don’t mean to sound pushy, but I’d only vote csv if we were having this discussion in spandex_jaeger.

Datadog clients for Python and Java (just confirmed) both use the multiple headers strategy. I think this goes to show the default in spandex_datadog should probably be the same as what other Datadog clients already do.

Thoughts?

Also hoping to hear from @GregMefford on this.

GregMefford · 2018-10-07T18:57:43Z

I think that spandex_datadog should encode it the way the Datadog-provided libraries for other languages do it, so that if you're "a Datadog shop," your tooling will all work seamlessly. When someone gets around to supporting Jaeger, they'd write the CSV version in their adapter. That's the beauty of the way Spandex has an internal representation and a vendor-specific back-end. Spandex itself just maintains a key-value data structure and the back-end library is responsible for injecting/extracting it the way that vendor wants to do it.

As far as vendor-interoperability, that's just a sad fact right now in the industry that Spandex can't solve, other than supporting multiple back-ends and perhaps having a per-endpoint configuration for outgoing requests (i.e. so that you could receive a trace using one format and propagate it out using a different one). I think other tracing libraries would similarly "just break" if you try to plug them into each other today.

FYI, there is some ongoing work in the industry (linked below) to solve this problem because even the vendors can see that this is the wrong place to differentiate - it's better to interoperate seamlessly on the collection side and differentiate on the visualization side.

https://w3c.github.io/distributed-tracing/report-trace-context.html
https://w3c.github.io/distributed-tracing/report-correlation-context.html

Fix fill Datadog trace meta from Trace baggage

ec7e980

sourcelevel-bot bot reviewed Sep 28, 2018

View reviewed changes

GregMefford reviewed Sep 28, 2018

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix fill Datadog trace meta from Trace baggage #8

Fix fill Datadog trace meta from Trace baggage #8

dnlserrano commented Sep 28, 2018

sourcelevel-bot bot Sep 28, 2018

sourcelevel-bot bot commented Sep 28, 2018

GregMefford left a comment

dnlserrano commented Sep 28, 2018

GregMefford commented Sep 29, 2018

zachdaniel commented Sep 29, 2018

GregMefford commented Sep 29, 2018

dnlserrano commented Sep 29, 2018 •

edited

Loading

GregMefford commented Sep 29, 2018

dnlserrano commented Oct 6, 2018

zachdaniel commented Oct 6, 2018

dnlserrano commented Oct 6, 2018

zachdaniel commented Oct 6, 2018

dnlserrano commented Oct 6, 2018

GregMefford commented Oct 7, 2018

Fix fill Datadog trace meta from Trace baggage #8

Are you sure you want to change the base?

Fix fill Datadog trace meta from Trace baggage #8

Conversation

dnlserrano commented Sep 28, 2018

sourcelevel-bot bot Sep 28, 2018

Choose a reason for hiding this comment

sourcelevel-bot bot commented Sep 28, 2018

GregMefford left a comment

Choose a reason for hiding this comment

dnlserrano commented Sep 28, 2018

GregMefford commented Sep 29, 2018

zachdaniel commented Sep 29, 2018

GregMefford commented Sep 29, 2018

dnlserrano commented Sep 29, 2018 • edited Loading

GregMefford commented Sep 29, 2018

dnlserrano commented Oct 6, 2018

zachdaniel commented Oct 6, 2018

dnlserrano commented Oct 6, 2018

zachdaniel commented Oct 6, 2018

dnlserrano commented Oct 6, 2018

GregMefford commented Oct 7, 2018

dnlserrano commented Sep 29, 2018 •

edited

Loading