ThreadLocals not cleared with Hooks.enableAutomaticContextPropagation(), same TraceId for every request #363
Comments
Without this change we're clearing the scopes map when spans end. The problem is that the scoping hierarchy gets broken: even though OTLA works well (scopes are properly opened and closed), the spans are not. With this change we're not removing the scopes upon clearing of a TraceContext. Fixes gh-363
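To make the commit message above more concrete, here is a toy model in plain Java of how wiping scope bookkeeping when a span ends can leave a stale "current span" on the thread. It is only one possible reading of the description, not the actual Micrometer Tracing code: `ScopeHierarchyDemo`, `openScope`, `closeScope`, `endSpan` and `simulate` are all hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; none of these types or methods are Micrometer Tracing classes.
final class ScopeHierarchyDemo {

    // Stands in for the tracer's "current span" thread-local.
    private static final ThreadLocal<String> CURRENT_SPAN = new ThreadLocal<>();

    // Stack of "span that was current before a scope opened" (null at the root).
    private static final ThreadLocal<List<String>> PREVIOUS =
            ThreadLocal.withInitial(ArrayList::new);

    static void openScope(String span) {
        PREVIOUS.get().add(CURRENT_SPAN.get());
        CURRENT_SPAN.set(span);
    }

    static void closeScope() {
        List<String> stack = PREVIOUS.get();
        if (stack.isEmpty()) {
            // Bookkeeping is gone: nothing to restore, so the current span stays as it is.
            return;
        }
        String previous = stack.remove(stack.size() - 1);
        if (previous == null) {
            CURRENT_SPAN.remove();
        } else {
            CURRENT_SPAN.set(previous);
        }
    }

    // Models "ending a span"; optionally wipes the scope bookkeeping as a side effect.
    static void endSpan(boolean wipeScopes) {
        if (wipeScopes) {
            PREVIOUS.get().clear();
        }
    }

    // Returns whatever is left in the thread-local after all scopes are closed.
    static String simulate(boolean wipeScopesOnSpanEnd) {
        CURRENT_SPAN.remove();
        PREVIOUS.get().clear();

        openScope("parent");
        openScope("child");
        endSpan(wipeScopesOnSpanEnd); // child span ends while its scope is still open
        closeScope();                 // close child scope
        closeScope();                 // close parent scope
        return CURRENT_SPAN.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(true));  // prints "child": stale value left on the thread
        System.out.println(simulate(false)); // prints "null": hierarchy unwound cleanly
    }
}
```

In this model every scope is opened and closed exactly once in both runs, yet only the run that keeps the bookkeeping ends with an empty thread-local.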
Fixed via 5a750bb
Looking at the commit that resolved this, it doesn't appear there's anything preventing a regression like this from sneaking back in.
Yes, we should definitely do that. I've opened #397 to track it. Pull requests welcome.
I think I am also seeing this, but with Spring Cloud Gateway and the latest Micrometer Tracing 1.1.6: #410
@vpavic #421 is my proposed test for this case. @shakuzen @jonatan-ivanov I tested it against 1.1.5 and it fails due to the scope invalidation following an Observation stop signal.
Thanks for the ping @chemicL. The test case in your PR is focused on the Reactor scenario, and while this issue also is, the root cause behind it really isn't tied to Reactor and it affected a broader set of users (we're not using Reactor in any of our projects). Not a critique of your PR by any means, just an observation that the tests covering this should probably also include something that's not specific to a certain runtime.
@vpavic understood :) The interesting thing about it is the unique setup which Reactor creates with the support for context propagation that we introduced. It was hard to replicate without Reactor in the equation, yet not impossible. Have you experienced this bug without Reactor? I'd argue here that using Reactor is both a means to an end (creating an environment in which we can reliably replicate the issue) and a validation that where the bug has surfaced is also covered by the tests. If you have an idea that avoids using Reactor in the test for this feature, I think it would be valuable to add.
I originally opened this issue in reactor: reactor/reactor-core#3584, but it turned out that this is an issue in micrometer-tracing instead.
Worked in: 1.1.4
Broken in: 1.1.5, 1.1.6-SNAPSHOT, 1.2.0-SNAPSHOT
Also related: #356
Expected Behavior
Automatic context propagation (enabled with Hooks.enableAutomaticContextPropagation()) should clear the thread-locals afterwards.
Actual Behavior
After the first request handled by a particular thread, every subsequent request handled by that thread will reuse the same traceId.
Steps to Reproduce
Demo project: reproducer.zip
Run it and issue a few requests to localhost:8080. The output will be something like this:
With reactor.netty.ioWorkerCount=1:
With reactor.netty.ioWorkerCount=2:
With reactor.netty.ioWorkerCount=3:
As we can see, requests handled by a thread will always reuse the same traceId, indicating that thread-locals populated by automatic context propagation are not cleared/reset after each request.
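The symptom described here (one traceId per worker thread rather than one per request) is what any uncleared thread-local produces on a reused thread. The following plain-JDK sketch shows that general effect only. It does not use Reactor, Netty or Micrometer Tracing, and `ThreadLocalLeakDemo`, `handleRequest` and `run` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo, plain JDK only: not Micrometer Tracing or Reactor code.
final class ThreadLocalLeakDemo {

    // Stands in for the tracer's per-thread "current trace".
    private static final ThreadLocal<String> CURRENT_TRACE_ID = new ThreadLocal<>();

    // Handles one "request": continues the trace already on the thread,
    // or starts a new one if the thread-local is empty.
    static String handleRequest(boolean clearAfterwards) {
        String traceId = CURRENT_TRACE_ID.get();
        if (traceId == null) {
            traceId = UUID.randomUUID().toString();
            CURRENT_TRACE_ID.set(traceId);
        }
        try {
            return traceId;
        } finally {
            if (clearAfterwards) {
                CURRENT_TRACE_ID.remove();
            }
        }
    }

    // Runs the given number of requests one after another on a single worker
    // thread and returns the trace id each request observed.
    static List<String> run(int requests, boolean clearAfterwards) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                Callable<String> task = () -> handleRequest(clearAfterwards);
                ids.add(worker.submit(task).get());
            }
            return ids;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("not cleared: " + run(3, false)); // same id three times
        System.out.println("cleared:     " + run(3, true));  // three different ids
    }
}
```

With a single worker thread, the run that never clears the thread-local reports the same id for every request, which mirrors the reactor.netty.ioWorkerCount=1 case in the report.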
Without automatic context propagation, tracer.currentSpan() returns null.
Your Environment