Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix logging integration #500

Merged
merged 4 commits into from
Mar 5, 2019

Conversation

eyalkoren
Copy link
Contributor

@eyalkoren eyalkoren commented Feb 25, 2019

Fixes #499 and also enables proper logging integration for async spans/transactions as well.
In the sake of simplification, I added a span.id in all cases (even from transactions), so ID will be duplicated in logs created when the active context is a transaction. This means that log lines will be consistent for all types, so aggregations done for span.id will include transaction-related events as well. We need to decide whether this is a good thing (supports aggregations for all spans AND transactions easily) or a bad thing (doesn't support aggregations for spans AND NOT transactions).

@codecov-io
Copy link

codecov-io commented Feb 25, 2019

Codecov Report

Merging #500 into master will decrease coverage by 1.32%.
The diff coverage is 73.68%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #500      +/-   ##
============================================
- Coverage     66.01%   64.68%   -1.33%     
  Complexity       68       68              
============================================
  Files           174      168       -6     
  Lines          7176     6425     -751     
  Branches        868      748     -120     
============================================
- Hits           4737     4156     -581     
+ Misses         2151     2007     -144     
+ Partials        288      262      -26
Impacted Files Coverage Δ Complexity Δ
...lastic/apm/agent/impl/context/AbstractContext.java 77.77% <ø> (ø) 0 <0> (ø) ⬇️
...apm/agent/impl/transaction/TraceContextHolder.java 55.26% <ø> (-7.49%) 0 <0> (ø)
...astic/apm/agent/impl/transaction/TraceContext.java 89.2% <0%> (+0.1%) 0 <0> (ø) ⬇️
...lastic/apm/agent/servlet/AsyncInstrumentation.java 64.86% <0%> (+1.7%) 0 <0> (ø) ⬇️
.../apm/agent/concurrent/ExecutorInstrumentation.java 43.75% <0%> (ø) 0 <0> (ø) ⬇️
...ent/report/serialize/MetricRegistrySerializer.java 86% <100%> (ø) 0 <0> (ø) ⬇️
...gent/impl/async/ContextInScopeRunnableWrapper.java 69.23% <100%> (ø) 0 <0> (ø) ⬇️
...gent/impl/async/ContextInScopeCallableWrapper.java 68% <100%> (ø) 0 <0> (ø) ⬇️
...astic/apm/agent/impl/transaction/AbstractSpan.java 90.56% <100%> (-3.34%) 0 <0> (ø)
...m/agent/impl/async/SpanInScopeRunnableWrapper.java 64.28% <100%> (ø) 0 <0> (ø) ⬇️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ae57423...6fc8580. Read the comment docs.

public void onDeactivate() throws Throwable {
if (config != null && config.isLogCorrelationEnabled()) {
MethodHandle remove = mdcRemoveMethodHandleCache.get(Thread.currentThread().getContextClassLoader());
public void onDeactivate(TraceContextHolder<?> context) throws Throwable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't add the TraceContextHolder here as it might already be ended and recycled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is called from TraceContextHolder#deactivate, when the this instance is still the active one on the thread stack, so I it is very valid to for checking this state. You are right however that I shouldn't observe it, so I shouldn't call isSampled for example.
So, I will add to documentation that I can only use the reference but not observe and certainly not update its state

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to transfer on the TraceContextHolder instance hash code...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right however that I shouldn't observe it, so I shouldn't call isSampled for example.

By providing the context as an argument, we are more likely to make mistakes as that. We should only add the argument if the need for it actually arises. Even then it might more sense to just have the identity hash as a parameter.

MethodHandle remove = mdcRemoveMethodHandleCache.get(Thread.currentThread().getContextClassLoader());
public void onDeactivate(TraceContextHolder<?> context) throws Throwable {
if (config != null && config.isLogCorrelationEnabled() && context.isSampled()) {
String formerSpanId = spanIdStack.get().poll();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of maintaining another stack here, couldn't we just get tracer.getActive() and call isSampled() and traceContext.getId().toString() on that instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This contradicts what you mentioned in the above comment :)
It may be already ended and recycled, so not allowed to observe at this point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant was to get the parent of the context which is being deactivated. You reversed the order in the PR so that it deactivates after calling the listeners. I think we should make it more obvious from the method name if the deactivation happens before or after. For example we should call it either beforeDeactivate or afterDeactivate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I see.
To me, the right way to undo/reverse A -> B -> C -> D is !D -> !C -> !B -> !A. So, for this matter, if A is TraceContextHolder#activate and B is ActivationListener#onActivate, It seemed to make sense to reverse the order.
What if I change ActivationListener#onActivate to beforeActivate and ActivationListener#onDeactivate to afterDeactivate? Then I can remove the additional stack as you suggest and have consistent code in both methods:

    public void beforeActivate(TraceContextHolder<?> context) throws Throwable {
        ...
        if (tracer != null) {
            if (tracer.getActive() == null) {
                put.invokeExact(TRACE_ID, traceContext.getTraceId().toString());
                put.invokeExact(TRANSACTION_ID, traceContext.getTransactionId().toString());
            }
            put.invokeExact(SPAN_ID, traceContext.getId().toString());
        }
    }

    public void afterDeactivate(TraceContextHolder<?> context) throws Throwable {
        ...
        if (tracer != null) {
            TraceContextHolder active = tracer.getActive();
            if (active == null) {
                remove.invokeExact(TRACE_ID);
                remove.invokeExact(TRANSACTION_ID);
            } else if (active.isSampled()) {
                put.invokeExact(SPAN_ID, active.getTraceContext().getId().toString());
            }
        }
    }

@eyalkoren
Copy link
Contributor Author

We decided to:

  1. log info even for unsampled events
  2. log the span ID for transactions

@eyalkoren eyalkoren merged commit 715f6db into elastic:master Mar 5, 2019
@eyalkoren eyalkoren deleted the fix-logging-instrumentation branch March 5, 2019 13:20
@SylvainJuge SylvainJuge added bug Bugs and removed type: bug labels Feb 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No MDC Data in logs with enable_log_correlation=true after a RestTemplate call
6 participants