Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix instrumentation for Kafka Streams 2.6+ #2951

Merged
merged 2 commits into from
Jul 29, 2021

Conversation

tylerbenson
Copy link
Contributor

The method signature of what we're instrumenting changed in 2.6 from process() to process(long). By removing the no arg constraint, we now instrument either method.

I've also extended the test to verify the latest version of the library. In order to do so, I had to split the tests to resolve a compilation issue.

Inspired by open-telemetry/opentelemetry-java-instrumentation#3438. Thanks @GuillaumeWaignier!

The method signature of what we're instrumenting changed in 2.6 from `process()` to `process(long)`.  By removing the no arg constraint, we instrument either method.

I've also extended the test to verify the latest version of the library.
@richardstartin
Copy link
Member

What do the checkpoint timelines look like?

@tylerbenson
Copy link
Contributor Author

Here's output from running locally. Not quite sure how to interpret it though:

Activity checkpoints by thread ordered by time
Test worker:                                                                                                   |-startSpan/1-|-suspend/1-|----------|-----------|-------------|-----------|-------------|-------------|-----------|-----------|----------|-----------|----------|-----------|-------------|-----------|
kafka-producer-network-thread | producer-1:                                                                    |-------------|-----------|-resume/1-|-endSpan/1-|-------------|-----------|-------------|-------------|-----------|-----------|----------|-----------|----------|-----------|-------------|-----------|
test-application-278e57c6-a8cf-4522-8406-0cb56bebd042-StreamThread-1:                                          |-------------|-----------|----------|-----------|-startSpan/2-|-endSpan/2-|-startSpan/3-|-startSpan/4-|-suspend/4-|-endSpan/3-|----------|-----------|----------|-----------|-------------|-----------|
kafka-producer-network-thread | test-application-278e57c6-a8cf-4522-8406-0cb56bebd042-StreamThread-1-producer: |-------------|-----------|----------|-----------|-------------|-----------|-------------|-------------|-----------|-----------|-resume/4-|-endSpan/4-|-resume/3-|-endTask/3-|-------------|-----------|
-C-1:                                                                                                          |-------------|-----------|----------|-----------|-------------|-----------|-------------|-------------|-----------|-----------|----------|-----------|----------|-----------|-startSpan/5-|-endSpan/5-|

@richardstartin
Copy link
Member

richardstartin commented Jul 22, 2021

Span 3 should have been suspended on test-application-278e57c6-a8cf-4522-8406-0cb56bebd042-StreamThread-1 before being resumed on kafka-producer-network-thread | test-application-278e57c6-a8cf-4522-8406-0cb56bebd042-StreamThread-1-producer but this isn't a new problem.

@@ -103,7 +102,7 @@ public StopInstrumentation() {
@Override
public void adviceTransformations(AdviceTransformation transformation) {
transformation.applyAdvice(
isMethod().and(isPublic()).and(named("process")).and(takesArguments(0)),
isMethod().and(isPublic()).and(named("process")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question? Is this how we want to be handling this?

I'm concerned this opens us to overmatching if process is overloaded at a later time.
I think we'd be better off having a match for each signature that we want to instrument.

I'm interested to hear other opinions. I'm also not going to block this PR for this reason alone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There hasn't been much change to this method - previously it took no-arguments, now it takes a single long. Looking at the class and how this method is used I think it's unlikely that this method will be overloaded, but of course not impossible.

In other instrumentations we've generally gone for looser method matching if it's an internal method of an implementation class. If the advice was simple then I wouldn't be concerned - but this is one case where we're grabbing activeSpan / activeScope and trusting it is the right thing to finish/close. So if we did ever match overloaded (or 'bridge' methods) then this would risk closing both the scope we wanted to close as well as any surrounding scope...

So I would be happier to see more of an explicit match here - or alternatively an extra check in StopSpanAdvice that we're closing the right span/scope, even just a check of the type (note there is work planned to address this separately by adding methods to the scope manager, but that work still needs to be scheduled)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you prefer if I changed it to an or that checks for either number of args?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with that since we're going to fix the StopSpanAdvice soon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've updated the matcher.

@tylerbenson tylerbenson merged commit ee5187d into master Jul 29, 2021
@tylerbenson tylerbenson deleted the tyler/kafka-streams-latest branch July 29, 2021 17:04
@github-actions github-actions bot added this to the 0.84.0 milestone Jul 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants