Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Netty instrumentation considered harmful #2496

Open
iNikem opened this issue Mar 4, 2021 · 1 comment
Open

Netty instrumentation considered harmful #2496

iNikem opened this issue Mar 4, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@iNikem
Copy link
Contributor

iNikem commented Mar 4, 2021

First, take a look at a pair of HttpClientRequestTracingHandler/HttpClientResponseTracingHandler from out Netty client instrumentation.

Second, think what will happen when the following code executes:

Channel ch = ...
try(Scope scope = startNewSpanAndScope()) {
  ch.writeAndFlush(...)
}
try(Scope scope = startNewSpanAndScope()) {
  ch.writeAndFlush(...)
}

We have two outgoing requests on the same channel (request pipelining). Each one of them will install its own context as channel attribute (see HttpClientRequestTracingHandler around line 38). The second one will overwrite the first one. Thus HttpClientResponseTracingHandler will try to close second span twice and the first span will remain opened.

Netty does not do any request-response correlation. This is left for a higher level API's or protocols implementations. Which forces me to think that we should drop Netty instrumentation altogether and replace it with library-specific higher level API instrumentation.

Yes, this seems like a huge amount of work. But think about this. The purpose of Netty is to be an async low-level library. This means that, most probably, higher level http clients which use Netty are also async. Which means that we anyway have to write an integration code for them to transfer span context from user code, which initiates the http request, all the way down to Netty channel. So the amount of work needed for properly supporting Netty-based http clients does not actually change.

@iNikem iNikem added the bug Something isn't working label Mar 4, 2021
@anuraaga
Copy link
Contributor

anuraaga commented Mar 5, 2021

I suggested a similar idea last week too after seeing your struggles :) I generally do agree with the idea of not enabling, or not enabling by default, Netty instrumentation - anecdotally while I'd generally use library instrumentation, if I was to use the agent I'd disable netty and concurrent since they're hard to reason about and I'd want to handle context propagation myself. I guess the issue is whether there are actually still use cases handled well by the instrumentation, e.g. if a user does directly use Netty in a synchronous way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants