Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update OpenTelemetry Bridge to support OTel Java API 0.11.0 #2054

Closed
wants to merge 2 commits into from

Conversation

tylerbenson
Copy link
Contributor

@tylerbenson tylerbenson commented Nov 5, 2020

See also #1605.

To enable:

System Property: -Ddd.integration.opentelemetry-beta.enabled=true
Environment Variable: DD_INTEGRATION_OPENTELEMETRY_BETA_ENABLED=true

Remove some of the interfaces from the wrapped objects to stop encouraging casting to our interfaces as a way to integrate. (We will need to expose better APIs in the future, depending on what we want to allow.)

Add instrumentation to sync our context with the new otel context. Due to implementation differences, there are still some edge cases where they can get out of sync. (See the test for a demonstration of this.)

(If you skip over the first commit where the classes get moved, it gives a nicer diff.)

@tylerbenson tylerbenson force-pushed the tyler/otel-update branch 2 times, most recently from 4b3b774 to fa5d213 Compare November 6, 2020 16:46
@tylerbenson tylerbenson marked this pull request as ready for review November 6, 2020 18:10
@tylerbenson tylerbenson requested a review from a team as a code owner November 6, 2020 18:10
@tylerbenson tylerbenson changed the title Update OpenTelemetry Bridge to latest version Update OpenTelemetry Bridge to support OTel Java API 0.10.0 Nov 6, 2020
dirName = 'test'
}
}
// Can't have a latest test until a new version is released.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why. Care to elaborate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our latestDepTest has an assertion to verify that the version it is testing isn't the same as what is verified in test to ensure we declare the wildcard properly. There is a new version out now (0.10.1), so I can fix this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that assert is counterproductive. I would have updated the gRPC instrumentation when I had time if it weren't for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I added that, we had cases where we thought we were testing something new but were not. The assertion was an attempt to prevent a regression. Seems silly to run the same tests twice.

@@ -73,6 +73,8 @@

DDId getSpanId();

boolean isRemote();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does isRemote mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#isremote

if the SpanContext was propagated from a remote parent

ie, if the context was the result of a distributed trace propagation. (extract call)

Copy link
Contributor

@randomanderson randomanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nits but looks correct from a "upgrade to latest version" perspective. I assumed the previous version was correct

@Advice.OnMethodExit(suppress = Throwable.class)
public static void updateScope(
@Advice.Argument(0) Context context, @Advice.Return(readOnly = false) Scope scope) {
Span span = Span.fromContext(context);
Copy link
Contributor

@dougqh dougqh Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like some more explanation as to what is happening here.

I don't really understand the need for WrappedScope.
I'd expect the OTelScope to be a wrapper around an AgentScope -- and the fact that they are closed together suggests they have the same lifecycle.

}

public AgentSpan.Context toAgentSpanContext(final SpanContext spanContext) {
// Currently assuming the span context was created above so it should contain an existing
Copy link
Contributor

@dougqh dougqh Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is public, this is worrying assumption. Should this method just be private/package visible instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make it package visible, but I don't think that will avoid the problem. Someone can still call SpanContext.create and pass it into SpanBuilder.setParent and it will result in this problem. Fixing this would require significant changes to the core.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please describe the ramifications of calling SpanContext.create and SpanBuilder.setParent?
In general, I don't want to assume that significant changes to the core aren't necessary without fully understanding the problem first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated to remove this assumption.


@Override
public Span recordException(Throwable throwable, Attributes attributes) {
delegate.addThrowable(throwable);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this attach the Attributes as well? Or are those Attributes of the Throwable?
If the later, a comment would help here.

Copy link
Contributor

@dougqh dougqh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have few concerns mostly around WrappedScope & performance.

public static void beforeBuild(@Advice.This OpenTelemetry.Builder builder) {
ContextStore<SpanContext, AgentSpan.Context> spanContextStore =
InstrumentationContext.get(SpanContext.class, AgentSpan.Context.class);
builder.setTracerProvider(new OtelTracerProvider(spanContextStore));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How deeply does OTelTracerProvider & OTelContextPropagators get checked by muzzle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see both classes enumerated in the printReferences which suggests they are covered sufficiently. Is there something specific you're looking for?

Copy link
Contributor

@dougqh dougqh Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to know how deeply we check that OTelTracerProvider and OtelContextPropagators fit the latest OTel API. For instance, do we check that OTelTracerProvider implements all the necessary methods.

Otherwise, I think this has holes where we'll activate when we shouldn't. I think that is what happened recently with servlet-3.1. Mostly, I'm just curious, and I'm fine with addressing that issue in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Muzzle is validating those classes right now, but we have known issues with the depth that muzzle validates. I don't think that is specific to this PR. I suggest we schedule a separate effort for improving muzzle.

}

public AgentSpan.Context toAgentSpanContext(final SpanContext spanContext) {
// Currently assuming the span context was created above so it should contain an existing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please describe the ramifications of calling SpanContext.create and SpanBuilder.setParent?
In general, I don't want to assume that significant changes to the core aren't necessary without fully understanding the problem first.

@dougqh
Copy link
Contributor

dougqh commented Nov 13, 2020

The more I think about this -- the more I think this isn't quite the right approach.

I think if we're going to have a bridge for OTel. We should have a package similar to ddtrace-ot.
Such a ddtrace-otel should have minimal dependencies and shouldn't require any instrumentation.

I think the instrumentation based solution is good for those using the full agent, but using the full agent should be optional.
Since I think we want both, I think large parts of this PR are fine, but I think we need to rethink the adapters.

In my opinion, the adapters should not and cannot rely on instrumentation.
I think that's for the better anyway. As is, the instrumentation tries to keep two data structures in sync, I think that approach is fraught with peril and best avoided.

I think that means we'll have to push OpenTel's more generic Context concept into our core.
While it would be nice if this weren't necessary, I think it is largely inevitable that our core will need to support a union of the capabilities of the APIs that we support externally.

Added an optional context field onto our Scope seems fairly straight-forwarded. Although, I'd prefer that we don't actually have our Scope depend on OTel directly. The tricky part appears to be deciding how Context is propagated.

I think for the bridge alone we can wait on deciding, since OTel users are expected to propagate Context explicitly. However when integrated with our full agent, we'll need to think about how we lazily materialize a Context when the Scope is created by our instrumentation or through our OpenTracing bridge.

@tylerbenson
Copy link
Contributor Author

I'm going to put this back in draft until we make progress on the context issue.

@tylerbenson tylerbenson marked this pull request as draft November 19, 2020 17:59
@tylerbenson tylerbenson changed the base branch from master to tyler/muzzle-super December 2, 2020 21:00
@tylerbenson tylerbenson changed the title Update OpenTelemetry Bridge to support OTel Java API 0.10.0 Update OpenTelemetry Bridge to support OTel Java API 0.11.0 Dec 2, 2020
Base automatically changed from tyler/muzzle-super to master December 3, 2020 21:05
Remove some of the interfaces from the wrapped objects to stop encouraging casting to our interfaces as a way to integrate.
Add instrumentation to sync our context with the new otel context.  Due to implementation differences, there are still some edge cases where they can get out of sync.
@tylerbenson tylerbenson deleted the tyler/otel-update branch May 25, 2021 14:36
@tylerbenson
Copy link
Contributor Author

Going to close this until we get around to the prerequisite scope manager changes.

@dajulia3
Copy link

I'm considering using datadog vs splunk as an opentelemetry endpoint. I'm curious to know-- is this actively being worked on? I'd love to be able to use my existing opentelemetry instrumentation with DataDog (without needing to run a separate opentelemetry collector).

@tylerbenson
Copy link
Contributor Author

@dajulia3 Would you mind filing a support ticket (support@datadoghq.com) so we can get more details on your use case and figure out a good path forward?

Thanks!

@dajulia3
Copy link

dajulia3 commented Jun 3, 2021

@tylerbenson sure thing!

@reify-marcio-faria
Copy link

Hey @tylerbenson, as the Opentracing library is deprecated, what are the plans to support, so that we can use custom instrumentation with all other goodness of datadog continuous profiling and the like?

There is also this related issue

@bantonsson
Copy link
Contributor

@reify-marcio-faria Adding support for OpenTelemetry is definitely on our roadmap, but I can't give you any solid dates. In the meantime it is still possible to use the OpenTracing API for custom instrumentation, just because it's deprecated doesn't mean that the artifacts will disappear from maven central, and if you switch to OpenTelemetry later, there is a compatibility layer available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants