Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pulsar apm plugin #3476

Merged
merged 32 commits into from
Sep 25, 2019
Merged

Add pulsar apm plugin #3476

merged 32 commits into from
Sep 25, 2019

Conversation

codelipenghui
Copy link
Contributor

@codelipenghui codelipenghui commented Sep 17, 2019

Please answer these questions before submitting pull request


New feature or improvement

  • Describe the details and related test reports.

Add pulsar apm plugin and unit tests, integration test will add soon.

Local tests result:

image

As the image shows, created 1 producer and 3 consumers with different subscription name(test-1, test-2, test-3).

Local integration test result:

[INFO] ---------------------------Here is the test report context -------------------------------------

Test Report - 2019-09-24 11:30

Cases List

Case description Status Cases
Pulsar 7 passed. 0 failed click me

Pulsar

Version Passed Failed
2.3.1 ✔️
2.3.0 ✔️
2.4.0 ✔️
2.4.1 ✔️
2.3.2 ✔️
2.2.1 ✔️
2.2.0 ✔️

@kezhenxu94 kezhenxu94 added agent Language agent related. feature New feature plugin Plugin for agent or collector. Be used to extend the capabilities of default implementor. labels Sep 17, 2019
@wu-sheng
Copy link
Member

@dmsolr Feel free to join.

@codelipenghui codelipenghui changed the title [WIP]Add pulsar apm plugin Add pulsar apm plugin Sep 18, 2019
@wu-sheng
Copy link
Member

W/ light4j plugin merged, you need to update the component definition because the ID has been used.

@wu-sheng
Copy link
Member

After you fix the conflicts and pass tests locally, please post your local test report here.

@codelipenghui
Copy link
Contributor Author

@wu-sheng I have fix the conflicts and past the local test screenshot to the description of this PR, please help take a look the test result screenshot.

@Override
public void handleMethodException(EnhancedInstance objInst, Method method, Object[] allArguments,
Class<?>[] argumentsTypes, Throwable t) {
ContextManager.activeSpan().errorOccurred().log(t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that Span starts after the method, so we will meet NPE when an exception happens before the Span created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should move create active span logic to the beforeMethod, by the way, shall we need to catch the IllegalStateException here, because if no active span here will throw the IllegalStateException and i can't find a public method to check activeSpanStack is or not empty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you dose not get my points.

In usually, We need not catch the IllegalStateException and need not check the size of the stack. But we have to make sure that the started and stopped of Span must present in pairs in an interceptor. In other word, which starts a Span must stop it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, We suggest doing like below.

if (ContextManager.isActive()) {
    ContextManager.activeSpan().errorOccurred().log(t);
}

The key point, we have to recheck by the same conditions as a span start when we wanna stop the span.

@Override
public Object afterMethod(EnhancedInstance objInst, Method method, Object[] allArguments, Class<?>[] argumentsTypes,
Object ret) throws Throwable {
ContextManager.stopSpan();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to ensure SPAN has been created will be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, to catch the IllegalStateException or is their any method to check the span is created

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if logically you could generate the span created in Pulsar plugin, it is OK to just stop span directly.

Because even we do isActive check, you could just check the span created by other plugins.

MethodInterceptResult result) throws Throwable {
SendCallbackEnhanceRequiredInfo requiredInfo = (SendCallbackEnhanceRequiredInfo) objInst.getSkyWalkingDynamicField();
if (null != requiredInfo.getContextSnapshot()) {
AbstractSpan activeSpan = ContextManager.createLocalSpan(OPERATION_NAME);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with Pulsar. In my experience, the LocalSpan can not always refer to ExitSpan, or it will miss directly sometimes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the ExitSpan is closed and LocalSpan is still working, it may happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In pulsar, i think the span of send callback is always a LocalSpan because in the exit span can guarantee that the callback is complete or complete with exception.

Here is the source of pulsar producer: https://github.com/apache/pulsar/blob/master/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java#L294

So, If I don't get it wrong, the SendCallback should create a LocalSpan. If there are any mistakes, please point them out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it. But I am thinking of another thing. In a scenario, SendCallBack.sendComplete is block till its parent span closed. At this moment, we call the ContextManager.continued(requiredInfo.getContextSnapshot()) maybe throw an exception.
Maybe we can use Asyn API.(I am not sure.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am missing the context here. Could you show me the thread workflow for your concern?

@codelipenghui
Copy link
Contributor Author

@dmsolr Thanks for your review, i have move the span creation of consumer to beforeMethod. Please help take a look again.

@wu-sheng
Copy link
Member

@codelipenghui Look like the CI and e2e failure caused by limited resource, es image can't be started in time. I will check this later today. You could focus on your local integration tests only for now.

@codelipenghui
Copy link
Contributor Author

@wu-sheng Ok, thanks. I'm doing the local integration tests now.

@codelipenghui
Copy link
Contributor Author

@wu-sheng I have paste the local integration test result to the description of this PR. Please help review the pulsar scenario integration.

@wu-sheng
Copy link
Member

@ascrutae Please run tests on our INFRA

@SkyWalkingRobot
Copy link

Here is the test report and validate logs

@wu-sheng
Copy link
Member

@codelipenghui Tests are 1 passed, 5 failure. Please check the log, we are running the cases parallel, is that some static port in your case?

@codelipenghui
Copy link
Contributor Author

@wu-sheng Currently, pulsar broker using static port, is that the error cause?

@wu-sheng
Copy link
Member

Currently, pulsar broker using static port, is that the error cause?

You should bind port by using docker compose. This is the cause.

@codelipenghui
Copy link
Contributor Author

codelipenghui commented Sep 24, 2019

Sorry, i'm lost here. I have bind port by using:

expose:
      - "8082"
ports:
      - "{SERVER_OUTPUT_PORT}:8082"

Pulsar broker and pulsar scenario is in one container, why pulsar broker also need to bind a port?

@SkyWalkingRobot
Copy link

Here is the test report and validate logs

Copy link
Member

@wu-sheng wu-sheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, including positive test results.

@dmsolr @kezhenxu94 Please recheck anf confirm

Copy link
Member

@kezhenxu94 kezhenxu94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dmsolr
Copy link
Member

dmsolr commented Sep 25, 2019

LGTM

@wu-sheng wu-sheng added this to the 6.5.0 milestone Sep 25, 2019
@wu-sheng
Copy link
Member

Merging

@wu-sheng wu-sheng merged commit a874251 into apache:master Sep 25, 2019
@codelipenghui
Copy link
Contributor Author

@wu-sheng @dmsolr @kezhenxu94 Thanks for your help, very nice experience in Skywalking community.

@codelipenghui codelipenghui deleted the pulsar-plugin branch September 25, 2019 06:00
@wu-sheng
Copy link
Member

Welcome and look forward your storage proposal.

@wu-sheng
Copy link
Member

wu-sheng commented Dec 2, 2019

@codelipenghui The plugin test immigration is on going in these two months, a lot of the plugins have moved into PR related test process. #3583

Could any of you move the pulsar plugin test into the main repo too? The original plugin test env has been shut down. Moving into the main repo could make the pulsar plugin checked in every PR.

@codelipenghui
Copy link
Contributor Author

@wu-sheng Thanks for your reminder, will move the pulsar plugin test soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent Language agent related. feature New feature plugin Plugin for agent or collector. Be used to extend the capabilities of default implementor.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants