Support for tracing asynchronous operations #124

carterkozak · 2019-04-01T15:10:48Z

Added a Tracer API to create a span that is not attached to a
thread.

tracing/src/main/java/com/palantir/tracing/DetachedSpan.java

tracing/src/main/java/com/palantir/tracing/SpanToken.java

carterkozak · 2019-04-01T15:16:05Z

tracing/src/main/java/com/palantir/tracing/Tracer.java

+    /**
+     * Like {@link #startSpan(String, SpanType)}, but does not set or modify tracing thread state.
+     */
+    public static DetachedSpan startDetachedSpan(String operation, SpanType type) {


This might be clearer if we call it startAsyncSpan instead.

However, it's not necessarily asynchronous; it's no different from any other span except that it is not connected to any thread-local state.
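To make the distinction concrete, here is a minimal toy sketch (not the tracing-java implementation) contrasting a thread-bound start with a detached start. `DetachedStartSketch`, `THREAD_SPANS`, and `Handle` are hypothetical names invented for illustration; the only point is that the detached path returns a handle and leaves the thread-local untouched.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model for illustration; not the tracing-java implementation. */
final class DetachedStartSketch {
    // Stand-in for the tracer's per-thread stack of open span names.
    static final ThreadLocal<Deque<String>> THREAD_SPANS =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Thread-bound start: pushes onto the calling thread's stack. */
    static void startSpan(String operation) {
        THREAD_SPANS.get().push(operation);
    }

    /** Detached start: returns a handle and never touches THREAD_SPANS. */
    static Handle startDetachedSpan(String operation) {
        return new Handle(operation, System.nanoTime());
    }

    /** Hypothetical handle that owns its own name and start time. */
    static final class Handle {
        final String operation;
        final long startNanos;

        Handle(String operation, long startNanos) {
            this.operation = operation;
            this.startNanos = startNanos;
        }
    }
}
```

Because the handle carries everything it needs, it can be passed to another thread or a callback without copying any thread state.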

tracing/src/main/java/com/palantir/tracing/Tracer.java

tracing/src/main/java/com/palantir/tracing/DetachedSpan.java

markelliot · 2019-04-01T18:26:38Z

Perhaps an alternative here would be to expose Trace and Trace#createSpan, and to add a method copyOrCreateTrace which returns a new Trace captured from the thread state, but is otherwise 'detached' and still mutable in the way that's required by AsyncTrace.

carterkozak · 2019-04-01T18:33:37Z

I'd prefer not to expose the Trace or OpenSpan objects directly because API consumers shouldn't need access to that internal data to instrument their code. I realize these are already exposed to some extent, but we can avoid further api/implementation coupling while providing both easier and safer utilities.

markelliot · 2019-04-01T22:15:53Z

DetachedSpan seems like as odd a thing to expose as Trace or OpenSpan. I guess the thing I'm poking at is that in cases where we want traces to be detached from threads, we'd logically be implementing exactly the things that we attach to threads. I'd rather we expose the basics than invent other concepts. More generally, what seems appropriate for API surface here seems to follow a pretty arbitrary set of rules -- the initial intent was something extremely limited, and the drift that's grown with the library (from having basically three methods) now makes it hard to hold a reasonable bar.

carterkozak · 2019-04-02T17:39:14Z

Agreed that we're in an odd place, and there's not a lot we can do without either breaking API or creating sprawl. The Trace object doesn't work well across threads; it's meant to represent a single thread's state -- at its core it's a tuple of traceId, sampled, stack<open-span>. The stack maps directly to the call stack; copying this stack across threads destroys this mapping and makes it easy to shoot ourselves in the foot.

If we wanted to focus on safety, the Trace object would take the form {traceId, sampled, currentSpan}, and tracing operations would be encapsulated in a runnable, e.g. Tracer.startSpan("operation", () -> myTracedOperation()); where the Trace stack is replaced by span invocations on the application stack. In this model, the proposed DefaultDetachedSpan would be unnecessary, and the Trace object could make sense. Unfortunately we have taken advantage of too many sharp edges of the current Trace object (e.g. AsyncTracer) to make this change without a major rev, which is likely more trouble than it's worth at the moment.
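A rough sketch of that safer model, under the assumption that a span is scoped to a Runnable and the per-thread state holds only {traceId, sampled, currentSpan}. `ScopedTracerSketch`, `State`, and the placeholder trace id are all hypothetical; this is not how tracing-java is implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a scoped tracer; all names are hypothetical. */
final class ScopedTracerSketch {
    /** Per-thread state without a span stack. */
    static final class State {
        final String traceId;
        final boolean sampled;
        final String currentSpan;

        State(String traceId, boolean sampled, String currentSpan) {
            this.traceId = traceId;
            this.sampled = sampled;
            this.currentSpan = currentSpan;
        }
    }

    private static final ThreadLocal<State> CURRENT = new ThreadLocal<>();

    /** Completed span names in completion order (single-threaded demo only). */
    static final List<String> COMPLETED = new ArrayList<>();

    /** Runs body inside a span; the Java call stack replaces the span stack. */
    static void startSpan(String operation, Runnable body) {
        State parent = CURRENT.get();
        String traceId = parent == null ? "trace-1" : parent.traceId; // placeholder id
        boolean sampled = parent == null || parent.sampled;
        CURRENT.set(new State(traceId, sampled, operation));
        try {
            body.run();
        } finally {
            COMPLETED.add(operation);
            // Restore the parent's state, so a span cannot be left open.
            if (parent == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(parent);
            }
        }
    }

    static String currentSpan() {
        State state = CURRENT.get();
        return state == null ? null : state.currentSpan;
    }
}
```

The `finally` block is what removes the sharp edge: there is no separate completeSpan call for a caller to forget or to invoke on the wrong thread.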

blab: I don't have a great solution.

ellisjoe · 2019-06-20T14:03:26Z

Hey, just jumping in on this now since I'm going to need something like this for dialogue. Has any more thought gone into this issue, or are the comments here up to date and we just need to come to an agreement on some proposal?

carterkozak · 2019-06-20T14:39:46Z

Hey @ellisjoe, a month or so ago I caught up with @markelliot and we discussed a couple models. Mark was planning to put together a sample proposal/implementation that included a concept of span ownership, but he may have been pulled into other things. I'd be happy to chat through possibilities.

ellisjoe · 2019-06-20T16:55:19Z

tracing/src/main/java/com/palantir/tracing/DetachedSpan.java

+     * as the parent instead of thread state.
+     */
+    @MustBeClosed
+    SpanToken attach(String operationName, SpanType type);


What's the difference between calling detach(operation) and then complete() vs attach(operation) and then close()? Is it just that calling attach also overrides my current thread-local state?

attach applies a new span to the current thread, from attach() to close(). This can be done multiple times to create sibling spans with the DetachedSpan as a parent.

DetachedSpan.complete is used to complete the detached span, which is not associated with a thread. There is a danger that complete is never called and we fail to associate the child spans (from attach) with the original parent of this detached span, but I don't think there are good options to avoid that.

This model means that you can never detach or attach an existing span that is already bound to a thread; each time we attach or detach, we mark a new operation. I've found that otherwise it can become difficult to trace where our tracing spans have actually come from.

Oh I see, so you'd start your DetachedSpan, then you could attach() it once your async task starts running, and then go back to using the normal Tracer.startSpan() methods throughout.

Think the thing that seems a bit funky is that you need to close the SpanToken rather than just using Tracer.completeSpan().

That is correct. The reason for the different type is so we can reset the original tracing data on the attached thread. We could ignore that and apply state directly, but it would be confusing to API consumers whether or not the attach creates a new tracing span, and how many spans they are responsible for completing.
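The lifecycle discussed in this thread can be sketched roughly as follows. This is a toy model with hypothetical names (`AttachSketch`, `ToyDetachedSpan`, `Token`), not the DetachedSpan or SpanToken code from this PR; it only illustrates why attach returns a separate closeable type: the token remembers what was on the thread before the attach and puts it back on close().

```java
/** Toy model of attach/close/complete; not the code from this PR. */
final class AttachSketch {
    // Stand-in for the thread-local tracing state (just a span name here).
    static final ThreadLocal<String> CURRENT_SPAN = new ThreadLocal<>();

    static final class ToyDetachedSpan {
        private final String operation;
        private boolean completed;

        ToyDetachedSpan(String operation) {
            this.operation = operation;
        }

        /** Starts a child span on the calling thread, parented by this span. */
        Token attach(String childOperation) {
            String previous = CURRENT_SPAN.get();
            CURRENT_SPAN.set(operation + "/" + childOperation);
            return new Token(previous);
        }

        /** Completes the detached span itself; no thread state is involved. */
        void complete() {
            completed = true;
        }

        boolean isCompleted() {
            return completed;
        }
    }

    /** Closing the token ends the attached span and restores the prior state. */
    static final class Token implements AutoCloseable {
        private final String previous;

        Token(String previous) {
            this.previous = previous;
        }

        @Override
        public void close() {
            if (previous == null) {
                CURRENT_SPAN.remove();
            } else {
                CURRENT_SPAN.set(previous);
            }
        }
    }
}
```

With try-with-resources, the attached child span and the restore of the thread's previous state are tied to one lexical scope, while complete() on the detached parent can happen later and from any thread.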

I think this may have been more obvious to me if the attach() method was on Tracers rather than here. In my head, at least at the moment, Tracers deals with all the threading business, and so I could imagine wanting to call Tracers.attach(detachedSpan) which would then give me a SpanToken to close and reset the thread's trace. And then on DetachedSpan you would just have newSpan() which gives you a new DetachedSpan parented by the current one. That way DetachedSpan has nothing to do with threading.

👍 That sounds good to me.

As a point of reference, that seems to be what these guys are doing as well: https://github.com/open-telemetry/opentelemetry-java/blob/master/api/src/main/java/io/opentelemetry/trace/Tracer.java#L140

Tracers.attach(detachedSpan) would also need an operation name param. It would also require us to pass some additional data between DetachedSpan and Tracer, which isn't currently possible. Right now, the DetachedSpan interface doesn't leak whether or not it is sampled, but that is required in order to attach to a thread.
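A sketch of that alternative shape, to show where the extra coupling comes from. All names here (`TracersAttachSketch`, `Scope`, `ToyDetachedSpan`) are hypothetical. In this toy the static attach can read the span's private fields only because both live in one top-level class; with separate classes, the traceId and sampled flag would need some accessor, which is the leak described above.

```java
/** Toy model of a static attach(detachedSpan, operation); names are hypothetical. */
final class TracersAttachSketch {
    /** Closeable scope whose close() throws no checked exception. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    /** Detached span with no threading logic of its own. */
    static final class ToyDetachedSpan {
        private final String traceId;
        private final boolean sampled;
        private final String name;

        ToyDetachedSpan(String traceId, boolean sampled, String name) {
            this.traceId = traceId;
            this.sampled = sampled;
            this.name = name;
        }

        /** New detached span parented by this one. */
        ToyDetachedSpan newSpan(String operation) {
            return new ToyDetachedSpan(traceId, sampled, name + "/" + operation);
        }
    }

    // Stand-in for thread-local trace state, rendered as a string.
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Static attach: needs the parent's traceId and sampled flag to build thread state. */
    static Scope attach(ToyDetachedSpan parent, String operation) {
        String previous = CURRENT.get();
        CURRENT.set(parent.traceId + ":" + parent.name + "/" + operation
                + (parent.sampled ? ":sampled" : ":unsampled"));
        return () -> {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        };
    }
}
```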

tracing/src/main/java/com/palantir/tracing/DetachedSpan.java

carterkozak · 2019-06-24T16:17:53Z

Holding off fixing this up until #180 has gone in to unblock other work

carterkozak · 2019-06-24T20:30:30Z

Rebased, have to catch a flight shortly, will knock out the rest when I can.

tracing/src/main/java/com/palantir/tracing/DetachedSpan.java

carterkozak · 2019-08-21T19:24:09Z

I've updated the API, there's still some implementation code that I need to move around. I don't like having the DetachedSpan implementations living in Tracer, but I can clean that up once we're happy with the ergonomics.

tracing/src/main/java/com/palantir/tracing/Tracer.java

ferozco · 2019-08-23T20:23:02Z

tracing/src/main/java/com/palantir/tracing/Tracer.java

+        return Optional.empty();
+    }
+
+    private static final class SampledDetachedSpan implements DetachedSpan {


Could we convert DetachedSpan into an abstract class, and inline Sampled/UnsampledDetachedSpan there? It would make the layout consistent with how Trace is currently implemented.

I could go either way on this. The DetachedSpan interface is meant to be relatively straightforward to read, and detached from implementation details because it's an API that will be consumed by developers using this library. Trace is an implementation detail, and isn't really consumed outside of this library.

If we moved these to the DetachedSpan class, we would also need to implement a package private accessor for the raw value of the Tracer trace thread-local, which could be dangerous if folks try to get clever.

ferozco · 2019-08-23T20:28:07Z

tracing/src/main/java/com/palantir/tracing/Tracer.java

+        }
+
+        @Override
+        public DetachedSpan detach(String operation, SpanType type) {


What use case do you envision for this?

Consider a webserver: When a request is received on a non-blocking thread, I'd like to start a detached span representing the entire request. Later, before processing, the request is enqueued to a larger thread pool where we allow work -- I'd like the ability to track time spent enqueued, despite not having attached the span to a thread.
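A minimal sketch of that use case, assuming a toy detached span type rather than the real API. `QueueTimeSketch`, `ToyDetachedSpan`, and `runOnce` are hypothetical names; the single-thread executor stands in for the larger worker pool.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy model of the queue-time use case; not the tracing-java API. */
final class QueueTimeSketch {
    /** Names of completed spans, in completion order. */
    static final List<String> COMPLETED = new CopyOnWriteArrayList<>();

    static final class ToyDetachedSpan {
        private final String name;
        private final long startNanos = System.nanoTime();

        ToyDetachedSpan(String name) {
            this.name = name;
        }

        /** Child detached span; no thread is involved. */
        ToyDetachedSpan detach(String operation) {
            return new ToyDetachedSpan(name + "/" + operation);
        }

        /** Records completion and returns the elapsed time in nanoseconds. */
        long complete() {
            COMPLETED.add(name);
            return System.nanoTime() - startNanos;
        }
    }

    /** Accepts one request, hands it to a worker, and waits for it to finish. */
    static List<String> runOnce() {
        ExecutorService workers = Executors.newSingleThreadExecutor();
        ToyDetachedSpan request = new ToyDetachedSpan("request"); // started on the accepting thread
        ToyDetachedSpan queued = request.detach("queued");        // covers time spent in the queue
        workers.execute(() -> {
            queued.complete(); // the worker picked the task up, so queueing is over
            // ... request processing would happen here ...
            request.complete();
        });
        workers.shutdown();
        try {
            if (!workers.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return COMPLETED;
    }
}
```

The "queued" span starts on the accepting thread and completes on the worker thread without ever being bound to either, which is the behaviour a thread-local span stack cannot express.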


iamdanfox · 2019-09-03T10:25:10Z

Unblocking 👍 I think the next steps are to publish an RC from here and then make sure it looks sensible in conjure-java-runtime.

iamdanfox · 2019-09-03T16:22:47Z

Dan and I are trying this out on conjure-java-runtime's https://github.com/palantir/conjure-java-runtime/compare/ds/use-new-tracing?expand=1, findings so far:

* ~~we want to use CloseableSpan instead of the Tracer#startSpan to get the X-B3-SpanId etc headers onto the wire, but don't have an easy way to get the current header.~~ released as 3.2.0-rc3
* maybe we should use the java assert keyword to make sure people wire up their instrumentation correctly in tests?

* Revert "CloseableSpan exposes ids necessary to set headers" This reverts commit 1f188ac.
* Tracer#getTraceMetadata
* parentSpanId is empty
* nit
* Update tracing/src/main/java/com/palantir/tracing/Tracer.java Co-Authored-By: Carter Kozak <ckozak@ckozak.net>


ferozco · 2019-09-04T18:57:08Z

👍

svc-autorelease · 2019-09-04T19:01:17Z

Released 3.2.0

carterkozak requested review from dansanduleac, robert3005, markelliot and ellisjoe April 1, 2019 15:10

carterkozak requested a review from a team as a code owner April 1, 2019 15:10


carterkozak added the do not merge label Apr 1, 2019


carterkozak requested review from esword and dsd987 April 1, 2019 15:32


This was referenced Apr 18, 2019

[improvement] Correctly nest spans when tracing okhttp requests palantir/conjure-java-runtime#1069

Closed

Set parent spans sensibly when making requests palantir/conjure-java-runtime#1073

Closed

pkoenig10 mentioned this pull request May 1, 2019

[proposal] Allow creating deferred traces without spans #147

Closed

ellisjoe reviewed Jun 20, 2019


ellisjoe reviewed Jun 21, 2019


ellisjoe mentioned this pull request Jun 24, 2019

[WIP] Add client side tracing palantir/dialogue#123

Closed

carterkozak force-pushed the ckozak/detached_tracing_api branch from 35f344c to 1ea1675 Compare June 24, 2019 20:27

carterkozak force-pushed the ckozak/detached_tracing_api branch from e93254f to b45f25d Compare August 21, 2019 19:18


carterkozak added 4 commits August 23, 2019 12:29

s/p/t

a153b7b

Test coverage for existing state

a70ee0d

handle attached span parent id

a9c5f32

Optimize DetachedSpan unsampled path

b9b7d4a

carterkozak force-pushed the ckozak/detached_tracing_api branch from 4a1aae2 to b9b7d4a Compare August 23, 2019 17:05

UnsampledDetachedSpan.detach reuses spans

9a253e9

carterkozak removed the do not merge label Aug 23, 2019

changelog

b3bcd01

carterkozak changed the title ~~[proposal] Support for tracing asynchronous operations~~ Support for tracing asynchronous operations Aug 23, 2019

ferozco reviewed Aug 23, 2019


forozco and others added 4 commits August 29, 2019 11:32

Merge remote-tracking branch 'origin/develop' into ckozak/detached_tr…

837c9b9

…acing_api

test async tracing

d8c16ea

Merge remote-tracking branch 'origin/develop' into ckozak/detached_tr…

d074362

…acing_api

tweaks to ckozak's DetachedSpan & demo usage (#247)

bc20b95

javadoc fixes

4b9e17c

iamdanfox and others added 3 commits September 4, 2019 12:29

CloseableSpan exposes ids necessary to set headers

1f188ac

OkhttpTraceInterceptor2

54fb1bf

dansanduleac mentioned this pull request Sep 4, 2019

Completely revamp tracing spans palantir/conjure-java-runtime#1205

Merged

iamdanfox added 3 commits September 4, 2019 19:35

RenderTracingRule respects $CIRCLE_ARTIFACTS

a6f17ac

Deprecate AsyncTracer

99ee300

Merge remote-tracking branch 'origin/develop' into ckozak/detached_tr…

c436c32

…acing_api

iamdanfox added the autorelease label Sep 4, 2019

iamdanfox added the merge when ready label Sep 4, 2019

iamdanfox merged commit c8ade88 into develop Sep 4, 2019

iamdanfox deleted the ckozak/detached_tracing_api branch September 4, 2019 19:01
