Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing Context not logged when replacing ReactorNettyHttpTracing with standard HttpClient/HttpServer integration #2850

Closed
grassehh opened this issue Jul 8, 2023 · 28 comments
Assignees
Labels
for/stackoverflow Questions are best asked on SO or Gitter

Comments

@grassehh
Copy link

grassehh commented Jul 8, 2023

Describe the Bug

Spring Boot 2 with Spring Sleuth -> Spring Boot 3 with Micrometer migration
Hello, I'm aware that as explained in the Javadoc, ReactorNettyHttpTracing will be deprecated in next major version.
As a consequence, I am trying to find an alternative.
As explained here and on the official documentation, it is said that one need to use the built-in metrics of Netty Reactor.

However in a Spring Boot 3 environment with Micrometer, I am not sure of how to do this, but it seems that it is not working correctly.

Steps to Reproduce

  • Clone this project
  • Checkout issues/reactor-netty branch
  • Start the app and call the server as explained in the README

You will see that in the webClient log, there is no baggage logged in the MDC:

-MDC Context: [] - -2023-07-12T19:52:24.119+02:00 -TRACE 42957 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"local","type":"request","correlation":"896af375872f07c2","protocol":"HTTP/1.1","remote":"google.com/216.58.214.174:443","method":"GET","uri":"https://google.com/","host":"google.com","path":"/","scheme":"https","port":"443","headers":{"accept":["*/*"],"accept-encoding":["gzip"],"b3":["64aee8571dbceb4af15d6f787e96756f-4bba3d9ce759afef-1"],"host":["google.com"],"mybaggagecontroller":["myBaggageControllerValue"],"mybaggagefilter":["myBaggageFilterValue"],"user-agent":["ReactorNetty/1.1.8"]}}

However if you edit ClientConfiguration.httpClientCustomizer method by replacing this code:

it.metrics(true) { uriTagValue: String ->
        uriTagValue //Not sure what to do here ?
}

with this code:

reactorNettyHttpTracing.decorateHttpClient(it)

and do the same test, you will see that all the baggages are now correctly logged (traceId, spanId, myBaggageFitler, myBaggageController):

-MDC Context: [traceId=64aee916ac416098ec3c881a8c596d93, spanId=ba1148154721de18, myBaggageController=myBaggageControllerValue, myBaggageFilter=myBaggageFilterValue] - -2023-07-12T19:55:34.836+02:00 -TRACE 43190 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"local","type":"request","correlation":"c16054ff4fc614e6","protocol":"HTTP/1.1","remote":"google.com/216.58.214.174:443","method":"GET","uri":"https://google.com/","host":"google.com","path":"/","scheme":"https","port":"443","headers":{"accept":["*/*"],"accept-encoding":["gzip"],"b3":["64aee916ac416098ec3c881a8c596d93-7ce6c146db5e8b18-1"],"host":["google.com"],"mybaggagecontroller":["myBaggageControllerValue"],"mybaggagefilter":["myBaggageFilterValue"],"user-agent":["ReactorNetty/1.1.8"],"X-B3-ParentSpanId":["7ce6c146db5e8b18"],"X-B3-Sampled":["1"],"X-B3-SpanId":["ba1148154721de18"],"X-B3-TraceId":["64aee916ac416098ec3c881a8c596d93"]}}

Expected result

The baggages in tracing context (traceId, spanId and baggages in scope) should be logged in both Server/Client request/responses when using the built-in HttpClient.metricsand HttpService.metricsmethods instead of the deprecated ReactorNettyHttpTracing class.

Thanks for the help !

@grassehh grassehh added status/need-triage A new issue that still need to be evaluated as a whole type/bug A general bug labels Jul 8, 2023
@violetagg violetagg self-assigned this Jul 10, 2023
@violetagg violetagg removed the status/need-triage A new issue that still need to be evaluated as a whole label Jul 10, 2023
@violetagg
Copy link
Member

violetagg commented Jul 10, 2023

@grassehh Please provide reproducible example.

@violetagg violetagg added the for/user-attention This issue needs user attention (feedback, rework, etc...) label Jul 10, 2023
@grassehh
Copy link
Author

@grassehh Please provide reproducible example.

Hello, I updated the whole description and provided a sample. Thanks.

@violetagg
Copy link
Member

violetagg commented Jul 13, 2023

@grassehh Can you try this?

@Configuration
class ClientConfiguration {
    @Bean
    fun httpClientCustomizer() = ReactorNettyHttpClientMapper {
       it.doOnRequest { request, connection ->
           val handler: LogbookClientHandler? = connection.channel().pipeline().get(LogbookClientHandler::class.java)
           if (handler != null) {
               connection.replaceHandler("LogbookClientHandler", TracingChannelDuplexHandler(handler))
           }
       }
    }

    @Bean
    fun webClientCustomizer(clientHttpConnector: ClientHttpConnector) = WebClientCustomizer {
        it.clientConnector(clientHttpConnector)
    }

    @Bean
    fun webClient(webClientBuilder: WebClient.Builder) = webClientBuilder.build()

    class TracingChannelDuplexHandler(private val delegate: ChannelDuplexHandler): ChannelDuplexHandler() {

        override fun channelRead(ctx: ChannelHandlerContext, msg: Any) {
            ContextSnapshot.setAllThreadLocalsFrom(ctx.channel()).use { scope ->
                delegate.channelRead(ctx, msg)
            }
        }

        override fun write(ctx: ChannelHandlerContext?, msg: Any?, promise: ChannelPromise?) {
            ContextSnapshot.setAllThreadLocalsFrom(ctx!!.channel()).use { scope ->
                delegate.write(ctx, msg, promise)
            }
        }
    }
}

This is described in the reference documentation in the section for custom ChannelHandlers
https://projectreactor.io/docs/netty/release/reference/index.html#_access_current_observation_4

@grassehh
Copy link
Author

grassehh commented Jul 13, 2023

Thanks for the answer. I confirm the solution works. I have updated the branch consequently. I have slightly simplified the client customizer by naming the method logbookNettyClientCustomizerso that it overrides Logbook's default customizer, without having to replace the handler.

However I have few more questions/remarks about it:

  • setAllThreadLocalsFrom method is deprecated so I used ContextSnapshoFactory.setThreadLocalsFrom instead
  • Do you think TracingChannelDuplexHandler could be something worth adding into Spring Boot 3, and the decoration of the logbook handlers into Logbook project ?
  • I tried using the same mecanism for the HttpServer, but it does not seem to work. Could it be an issue with the LogbookServerHandler?

The server customizer looks like this :

@Bean
fun logbookNettyServerCustomizer(logbook: Logbook, contextSnapshotFactory: ContextSnapshotFactory) =
    NettyServerCustomizer { server ->
        server.doOnConnection { connection: Connection ->
            connection.addHandlerLast(
                TracingChannelDuplexHandler(
                    LogbookServerHandler(logbook),
                    contextSnapshotFactory
                )
            )
        }
    }

and when calling the app, the request/response logs have empty MDC. Example of request log:

-MDC Context: [] - -2023-07-13T12:59:31.181+02:00 -TRACE 66724 --- [ctor-http-nio-3] org.zalando.logbook.Logbook              : {"origin":"remote","type":"request","correlation":"ad43f64b644c7839","protocol":"HTTP/1.1","remote":"/127.0.0.1:60065","method":"GET","uri":"http://localhost:8080/test","host":"localhost","path":"/test","scheme":"http","port":"8080","headers":{"Accept":["*/*"],"Authorization":["XXX"],"Host":["localhost:8080"],"User-Agent":["curl/7.88.1"]}}

Again, the sample has been updated you can check it out.

@violetagg
Copy link
Member

setAllThreadLocalsFrom method is deprecated so I used ContextSnapshoFactory.setThreadLocalsFrom instead

The ContextSnapshoFactory.setThreadLocalsFrom is newer functionality, if you use context propagation version that supports it then use what is recommended as replacement for setAllThreadLocalsFrom

Do you think TracingChannelDuplexHandler could be something worth adding into Spring Boot 3

Decorating Netty events with TracingChannelDuplexHandler may affect the performance, because of that in the example I implemented ONLY those that are needed by Logbook. Other solutions might need to implement other events, so IMO there isn't something by default that can be added to Spring Boot.

and the decoration of the logbook handlers into Logbook project ?

Context Propagation library is generic one, it does not depend on any specific project so it is up to Logbook project whether they want to use it.

I tried using the same mecanism for the HttpServer, but it does not seem to work. Could it be an issue with the LogbookServerHandler?

As you can see for the client we don't need anything specific from Reactor Netty, just a custom Netty ChannelHandler.
This is not the case for the server. The issue is the following:

request -> Reactor Netty -> Spring framework creates Observation

response -> Spring framework closes Observation -> Reactor Netty

As you can see everything that is NOT in the scope of the framework cannot see the Observation. This is the case with Logbook.

In this case you will need Reactor Netty <-> Micrometer integration.

I need to check for a fix in Spring and I'll try to come up with a customiser for the server.

@violetagg violetagg removed the for/user-attention This issue needs user attention (feedback, rework, etc...) label Jul 13, 2023
@grassehh
Copy link
Author

So if I understand correctly and sum up, this means that it requires Logbook to implement Reactor Netty <-> Micrometer integration in order to:

  • Decorate both LogbookClientHandler and LogbookServerHandler with a handler similar to you TracingChannelDuplexHandler
  • Be part of the observation scope

@violetagg
Copy link
Member

violetagg commented Jul 13, 2023

So if I understand correctly and sum up, this means that it requires Logbook to implement Reactor Netty <-> Micrometer integration in order to:

  • Decorate both LogbookClientHandler and LogbookServerHandler with a handler similar to you TracingChannelDuplexHandler
  • Be part of the observation scope

mmm not exactly ...
May be if Logbook wants to take part of the Spring Framework server instrumentation then it needs something else than Netty ChannelHandler ... I don't have deep knowledge in Logbook though ...

@violetagg
Copy link
Member

@grassehh This should be working for the server:

@Configuration
class ServerConfiguration {

    @Bean
    fun logbookNettyServerCustomizer(logbook: Logbook, contextSnapshotFactory: ContextSnapshotFactory) =
        NettyServerCustomizer { server ->
            server.doOnConnection { connection: Connection ->
                connection.addHandlerLast(
                    TracingChannelDuplexHandler(
                        LogbookServerHandler(logbook),
                        contextSnapshotFactory
                    )
                )
            }
                .metrics(true) { uriTagValue: String ->
                    uriTagValue
                }
        }


    @Configuration
    internal class Config {
        @Autowired
        var observationRegistry: ObservationRegistry? = null

        @PostConstruct
        fun setup() {
            ObservationThreadLocalAccessor.getInstance().observationRegistry = observationRegistry!!
            Metrics.observationRegistry(observationRegistry!!)
        }
    }
}

For the uriTagValue

.metrics(true) { uriTagValue: String ->
    uriTagValue
}

The idea is to provide a templating if possible e.g. if the uri is /user/id/123456 you can decide to template it like this /user/id/{n}. See for more information the reference documentation https://projectreactor.io/docs/netty/release/reference/index.html#_metrics_4

In order to avoid a memory and CPU overhead of the enabled metrics, it is important to convert the real URIs to templated
 URIs when possible. Without a conversion to a template-like form, each distinct URI leads to the creation of a distinct tag,
 which takes a lot of memory for the metrics.

@violetagg violetagg added for/stackoverflow Questions are best asked on SO or Gitter and removed type/bug A general bug labels Jul 14, 2023
@grassehh
Copy link
Author

grassehh commented Jul 17, 2023

I updated the branch with your suggested server implementation.
The traceId and spanId are now correctly logged in both server and client calls.
However the custom baggages myBaggageFilter and myBaggageController are not logged.
Interested fact is that commenting the reactor hook call Hooks.enableAutomaticContextPropagation() makes the custom baggages appear in the server logs, but disappear from the client logs.

@violetagg
Copy link
Member

I updated the branch with your suggested server implementation.
The traceId and spanId are now correctly logged in both server and client calls.
However the custom baggages myBaggageFilter and myBaggageController are not logged.
Interested fact is than commenting the reactor hook call Hooks.enableAutomaticContextPropagation() makes the custom baggages appear in the server logs, but disappear from the client logs.

@chemicL @marcingrzejszczak PTAL

@marcingrzejszczak
Copy link
Contributor

@grassehh can you please change the project to Java? Let's try fixing one problem at a time and adding Kotlin makes things more complicated. Thank you

@violetagg violetagg added the for/user-attention This issue needs user attention (feedback, rework, etc...) label Jul 21, 2023
@violetagg
Copy link
Member

@grassehh Closing this issue. If you can provide the requested example in Java, showing the issue, we can reopen it.

@violetagg violetagg removed the for/user-attention This issue needs user attention (feedback, rework, etc...) label Aug 9, 2023
@grassehh
Copy link
Author

Hello @violetagg @marcingrzejszczak
I have pushed a reproducible Java sample on the issues/reactor-netty-java branch.

As you can see, the custom baggages are still not logged in the server request/response logs:

-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=76088eeb28e25736] - -2023-08-12T13:21:40.021+02:00 -TRACE 30009 --- [ctor-http-nio-3] org.zalando.logbook.Logbook              : {"origin":"remote","type":"request","correlation":"c1cd060e88cd982a","protocol":"HTTP/1.1","remote":"/127.0.0.1:55665","method":"GET","uri":"http://localhost:8080/test","host":"localhost","path":"/test","scheme":"http","port":"8080","headers":{"Accept":["*/*"],"Authorization":["XXX"],"Host":["localhost:8080"],"User-Agent":["curl/8.1.2"]}}
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=a5644ccf373a5063, myBaggageController=6df0fe48-d37d-4ff0-ade9-92cf2bd1e187, myBaggageFilter=b61464b0-3c63-4932-9b74-3aa0f461a2f7] - -2023-08-12T13:21:40.099+02:00 -DEBUG 30009 --- [     parallel-1] c.g.app.controller.SampleController      : Log from inside the controller. This should contain the baggage called myBaggageController
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=9fac80fd257c7ce6, myBaggageController=6df0fe48-d37d-4ff0-ade9-92cf2bd1e187, myBaggageFilter=b61464b0-3c63-4932-9b74-3aa0f461a2f7] - -2023-08-12T13:21:40.281+02:00 -TRACE 30009 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"local","type":"request","correlation":"849bf9fd8dc5f3c8","protocol":"HTTP/1.1","remote":"google.com/142.250.179.110:443","method":"GET","uri":"https://google.com/","host":"google.com","path":"/","scheme":"https","port":"443","headers":{"accept":["*/*"],"accept-encoding":["gzip"],"b3":["64d76b4342b618d976088eeb28e25736-9fac80fd257c7ce6-1"],"host":["google.com"],"mybaggagecontroller":["6df0fe48-d37d-4ff0-ade9-92cf2bd1e187"],"mybaggagefilter":["b61464b0-3c63-4932-9b74-3aa0f461a2f7"],"user-agent":["ReactorNetty/1.1.8"]}}
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=9fac80fd257c7ce6, myBaggageController=6df0fe48-d37d-4ff0-ade9-92cf2bd1e187, myBaggageFilter=b61464b0-3c63-4932-9b74-3aa0f461a2f7] - -2023-08-12T13:21:40.312+02:00 -TRACE 30009 --- [ctor-http-nio-4] org.zalando.logbook.Logbook              : {"origin":"remote","type":"response","correlation":"849bf9fd8dc5f3c8","duration":31,"protocol":"HTTP/1.1","status":301,"headers":{"Alt-Svc":["h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000"],"Cache-Control":["private, max-age=2592000"],"Content-Length":["220"],"Content-Security-Policy-Report-Only":["object-src 'none';base-uri 'self';script-src 'nonce-qINUne2hFIUMqhU2jOciHw' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp"],"Content-Type":["text/html; charset=UTF-8"],"Date":["Sat, 12 Aug 2023 11:21:40 GMT"],"Expires":["Sat, 12 Aug 2023 11:21:40 GMT"],"Location":["https://www.google.com/"],"P3P":["CP=\"This is not a P3P policy! See g.co/p3phelp for more info.\""],"Server":["gws"],"Set-Cookie":["CONSENT=PENDING+224; expires=Mon, 11-Aug-2025 11:21:40 GMT; path=/; domain=.google.com; Secure"],"X-Frame-Options":["SAMEORIGIN"],"X-XSS-Protection":["0"]},"body":"<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<TITLE>301 Moved</TITLE></HEAD><BODY>\n<H1>301 Moved</H1>\nThe document has moved\n<A HREF=\"https://www.google.com/\">here</A>.\r\n</BODY></HTML>\r\n"}
-MDC Context: [traceId=64d76b4342b618d976088eeb28e25736, spanId=76088eeb28e25736] - -2023-08-12T13:21:40.317+02:00 -TRACE 30009 --- [ctor-http-nio-3] org.zalando.logbook.Logbook              : {"origin":"local","type":"response","correlation":"c1cd060e88cd982a","duration":336,"protocol":"HTTP/1.1","status":200,"headers":{"Cache-Control":["no-cache, no-store, max-age=0, must-revalidate"],"content-length":["0"],"Expires":["0"],"Pragma":["no-cache"],"Referrer-Policy":["no-referrer"],"X-Content-Type-Options":["nosniff"],"X-Frame-Options":["DENY"],"X-XSS-Protection":["0"]}}

@violetagg violetagg reopened this Aug 12, 2023
@violetagg violetagg added for/stackoverflow Questions are best asked on SO or Gitter and removed for/stackoverflow Questions are best asked on SO or Gitter labels Aug 12, 2023
@violetagg
Copy link
Member

@marcingrzejszczak PTAL

@violetagg
Copy link
Member

violetagg commented Sep 5, 2023

@grassehh I would suggest to open an issue either in Spring or Micrometer. I don't know how Reactor Netty can help with custom baggages. However if the issue appears to be in Reactor Netty, we can always reopen the issue.

@marcingrzejszczak
Copy link
Contributor

Hey, I've updated Tracing version 1.1.x and 1.2.x with fixes related to baggage recently. Since this week is my Spring Cloud week I can look into checking this next week

@grassehh
Copy link
Author

grassehh commented Sep 5, 2023

Thanks. Should I still open an issue on Spring or somewhere else ? Or can I wait for your feedback next week here ?

@marcingrzejszczak
Copy link
Contributor

I mean you can bump tracing to the latest 1.1.x snapshot and check it for yourself or wait for a week for me to check it

@grassehh
Copy link
Author

I mean you can bump tracing to the latest 1.1.x snapshot and check it for yourself or wait for a week for me to check it

Hello, I have upgraded the dependencies on the issues/reactor-netty-java branch:
spring-boot 3.1.3
logbook 3.4.0
micrometer-tracing-bom 1.1.5

Baggage still are not logged in the Server request/response logs (only traceId and spanId)

@grassehh
Copy link
Author

grassehh commented Sep 15, 2023

Have you used the baggage API properly? Here you can find an example https://github.com/micrometer-metrics/tracing/blob/main/micrometer-tracing-bridges/micrometer-tracing-bridge-brave/src/test/java/io/micrometer/tracing/brave/bridge/BaggageTests.java#L180-L210

The way I do it (you can check in the issues/reactor-netty-java branch of my sample project) is using the Tracer.createBaggageInScope()method as the Tracer.createBaggage() is deprectated.
I have tried to configure the tracing like in your example with these two lines without success:

ContextRegistry.getInstance().registerThreadLocalAccessor(new ObservationAwareSpanThreadLocalAccessor(tracer));
Hooks.enableAutomaticContextPropagation();

So far like I said, the issue is that the custom bagage (created through Tracer.createBaggageInScope() method), do not show up in the server request/response logs (logged by the LogbookServerHandlerclass). They do show in the controller code and in the WebClientcalls though (logged by LogbookClientHandler).

This used to work well with Spring Sleuth, and it no longer works in Spring Boot 3 w/ Micrometer Tracing, since I replaced the deprecated ReactorNettyHttpTracing class with the standard HttpServer handler mecanism as suggested by violetagg.

@grassehh
Copy link
Author

Hi, @marcingrzejszczak @violetagg.
After some debugging, I noticed that SimpleObservationRegistry.getCurrentObservation() returns null inside the TracingChannelDuplexHandler.write() call. Not sure if this is normal ? Maybe the server handler is not correctly configured ?

The call stack is as follow:

write:41, TracingChannelDuplexHandler (com.grassehh.app.configuration)
setThreadLocalsFrom:109, DefaultContextSnapshotFactory (io.micrometer.context)
setAllThreadLocalsFrom:125, DefaultContextSnapshotFactory (io.micrometer.context)
setThreadLocal:101, DefaultContextSnapshot (io.micrometer.context)
getValue:32, ObservationThreadLocalAccessor (io.micrometer.observation.contextpropagation)
getValue:109, ObservationThreadLocalAccessor (io.micrometer.observation.contextpropagation)
getCurrentObservation:38, SimpleObservationRegistry (io.micrometer.observation)

@Ricard-Kollcaku
Copy link

the solution with TracingChannelDuplexHandler worked fine in spring boot 3.1* but upgrading it to 3.2.0 broke the solution, now the traceid in webclient is different from others any solution ?

@violetagg violetagg reopened this Dec 7, 2023
@violetagg
Copy link
Member

@Ricard-Kollcaku Please provide reproducible example.

@grassehh
Copy link
Author

grassehh commented Dec 8, 2023

the solution with TracingChannelDuplexHandler worked fine in spring boot 3.1* but upgrading it to 3.2.0 broke the solution, now the traceid in webclient is different from others any solution ?

@violetagg I've updated my issue with the same regression observed (reproducer here on branch issues/micrometer-tracing)

@violetagg
Copy link
Member

@grassehh @Ricard-Kollcaku Can you please test with Spring Framework 6.1.2-SNAPSHOT?

@grassehh
Copy link
Author

grassehh commented Dec 11, 2023

@grassehh @Ricard-Kollcaku Can you please test with Spring Framework 6.1.2-SNAPSHOT?

Confirm this works for me with 6.1.2-SNAPSHOT ! Thanks you !
Any ETA for the 6.1.2 version in Spring Boot ?

@violetagg violetagg removed the for/user-attention This issue needs user attention (feedback, rework, etc...) label Dec 12, 2023
@violetagg
Copy link
Member

violetagg commented Dec 12, 2023

@Ricard-Kollcaku I'm closing this one. Reactor Netty version for Spring Boot 3.x versions is one and the same - 1.1.x.
Please try the Spring Framework 6.1.2-SNAPSHOT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
for/stackoverflow Questions are best asked on SO or Gitter
Projects
None yet
Development

No branches or pull requests

4 participants