feat: support opentelemetry for grpc tracing #7539

Merged
merged 6 commits into from
Apr 26, 2022

Conversation

yeya24
Contributor

@yeya24 yeya24 commented Oct 24, 2021

Fixes #4972

This PR adds basic gRPC-level tracing using otelgrpc, sending traces to an OpenTelemetry Collector.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • Optional. My organization is added to USERS.md.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).

@yeya24 yeya24 force-pushed the support-otelgrpc branch 2 times, most recently from 19cc92c to 5d7c1ab Compare October 24, 2021 19:05
@yeya24
Contributor Author

yeya24 commented Oct 24, 2021

Looks like it is not working due to the grpc version upgrade. Will try to resolve it.

@yeya24 yeya24 marked this pull request as draft October 25, 2021 02:48
@yeya24 yeya24 marked this pull request as ready for review October 25, 2021 08:20
@yeya24
Contributor Author

yeya24 commented Oct 25, 2021

I can see gRPC server tracing is working now. There are still some problems with the protobuf version; once I fix them I will mark the PR as ready.

image

@yeya24 yeya24 marked this pull request as draft October 25, 2021 08:26
cmd/util/trace.go Outdated
@yeya24
Contributor Author

yeya24 commented Feb 12, 2022

Closing this for now; hopefully we can come back to it once the gRPC version is newer.

@yeya24 yeya24 closed this Feb 12, 2022
@leoluz leoluz mentioned this pull request Mar 18, 2022
10 tasks
@leoluz
Collaborator

leoluz commented Apr 18, 2022

Reopening this now that we have updated the gRPC runtime library to the latest version (v1.45.0) in ArgoCD.

@leoluz leoluz reopened this Apr 18, 2022
Collaborator

@leoluz leoluz left a comment

Please revert some of the changes (I left a comment on each of them) before rebasing, to avoid dealing with unnecessary conflicts.

cmd/argocd-server/commands/argocd_server.go Outdated
cmd/util/trace.go Outdated
cmd/util/trace.go Outdated
go.mod Outdated
go.mod Outdated
pkg/apiclient/grpcproxy.go Outdated
cmd/util/trace.go Outdated
cmd/argocd-server/commands/argocd_server.go Outdated
)

func InitTracer(serviceName, otlpAddress string) (trace.TracerProvider, func(), error) {
ctx := context.Background()
Collaborator

We should probably receive the context from the main server and build the resource with it.

Contributor Author

Updated.

Collaborator

Sorry but it seems that this is still not using the context from the server.
This is the context that I was referring to:

ctx, cancel := context.WithCancel(ctx)

Contributor Author

Sorry, I missed that. One question here: why do we need the for loop? Can I move the ctx out of the loop? I don't want to put the init tracer function in the loop.

Collaborator

@leoluz leoluz Apr 20, 2022

The for-loop is mainly there to automatically restart the server if it stops for any reason. You cannot move the ctx out of the for-loop. Note that argocd.Run(...) is a blocking call. Ideally the tracer should be initialized inside the Run() function, just as is done for the Prometheus metrics server.

Contributor Author

Thanks for the detailed reply. Then I think this is fixed.

Collaborator

Sorry, but you can't move the ctx out of the for-loop. A new context needs to be built on every iteration, and the InitTracer call needs to happen somewhere inside this loop.

Ben Ye added 2 commits April 19, 2022 11:03
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
@yeya24 yeya24 force-pushed the support-otelgrpc branch from 5d7c1ab to c9c86b8 Compare April 19, 2022 18:53
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
@yeya24 yeya24 marked this pull request as ready for review April 19, 2022 19:10
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
@codecov

codecov bot commented Apr 19, 2022

Codecov Report

Merging #7539 (26f3cf4) into master (c7ff388) will decrease coverage by 0.13%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #7539      +/-   ##
==========================================
- Coverage   45.63%   45.50%   -0.14%     
==========================================
  Files         217      219       +2     
  Lines       25696    25895     +199     
==========================================
+ Hits        11727    11783      +56     
- Misses      12337    12473     +136     
- Partials     1632     1639       +7     
| Impacted Files | Coverage Δ |
|---|---|
| server/server.go | 54.23% <100.00%> (-1.31%) ⬇️ |
| applicationset/services/scm_provider/github.go | 63.52% <0.00%> (-17.65%) ⬇️ |
| util/cert/cert.go | 82.40% <0.00%> (-2.35%) ⬇️ |
| util/settings/settings.go | 48.10% <0.00%> (ø) |
| server/rbacpolicy/rbacpolicy.go | 82.35% <0.00%> (ø) |
| cmd/argocd/commands/admin/settings_rbac.go | 23.24% <0.00%> (ø) |
| ...is/applicationset/v1alpha1/applicationset_types.go | 47.22% <0.00%> (ø) |
| server/application/websocket.go | 8.33% <0.00%> (ø) |
| server/application/terminal.go | 3.60% <0.00%> (ø) |

... and 4 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c7ff388...26f3cf4. Read the comment docs.

Signed-off-by: Ben Ye <ben.ye@bytedance.com>
Comment on lines 167 to 168
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
Collaborator

This can't be moved outside this loop.

Contributor Author

If this cannot be moved, then the InitTracer call should be inside the loop as well, because it needs the server context?

Collaborator

Correct. A new context needs to be built on every loop iteration and the InitTracer call needs to happen somewhere inside this loop.

Contributor Author

This seems a little bit weird to me. But anyway I updated the code.

Signed-off-by: Ben Ye <ben.ye@bytedance.com>
@yeya24
Contributor Author

yeya24 commented Apr 25, 2022

Hello @leoluz, is there any action item left for this PR?

@leoluz
Collaborator

leoluz commented Apr 25, 2022

> Hello @leoluz, is there any action item left for this PR?

@yeya24 Added one more comment.
Otherwise LGTM!

Collaborator

@leoluz leoluz left a comment

LGTM

@leoluz leoluz merged commit 09e5b60 into argoproj:master Apr 26, 2022
@yeya24 yeya24 deleted the support-otelgrpc branch April 27, 2022 04:39
@leoluz leoluz added this to the v2.4 milestone Apr 28, 2022
Development

Successfully merging this pull request may close these issues.

Add OpenTelemetry tracing integration
4 participants