Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom APM Tracing #17947

Merged
merged 6 commits into from
Oct 18, 2022
Merged

Custom APM Tracing #17947

merged 6 commits into from
Oct 18, 2022

Conversation

jdpgrailsdev
Copy link
Contributor

What

  • Improve runtime performance visibility via APM tracing

How

  • Add custom APM traces to Temporal workflows in airbyte-workers
  • Add custom APM traces to Temporal activities in airbyte-workers
  • Add custom APM traces to scheduled tasks in airbyte-cron

Recommended reading order

  1. ApmTraceUtils.java
  2. *WorkflowImpl.java
  3. *ActivityImpl.java
  4. Temporal.java
  5. DefinitionsUpdater.java
  6. WorkspaceCleaner.java

Tests

  • Unit tests added for ApmTraceUtils.java
  • Project builds locally without issue
  • Project runs locally without APM agent present
  • Deployed to dev environment and configured with APM agent to confirm that traces are generated

N.B. There will be follow-on PR's to enable in cloud and to enable APM tracing in airbyte-container-orchestrator.

@github-actions github-actions bot added area/platform issues related to the platform area/worker Related to worker labels Oct 13, 2022
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 14:13 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 14:32 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 14:52 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 16:06 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 16:09 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 16:22 Inactive
Copy link
Contributor

@gosusnp gosusnp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tracing part is fairly manual, I wonder if we should plan for some abstractions to ease it up.
Also, part of it is a bit redundant in a way, I am thinking mostly of having the trace annotations then extra calls from the function to add the tags.

* @param tags A map of tags to be added to the currently active span.
* @param tagPrefix The prefix to be added to each custom tag name.
*/
public static void addTagsToTrace(final Span span, final Map<String, Object> tags, final String tagPrefix) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can it be null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benmoriceau In the case that the agent is not present, OpenTracing uses a no-op tracer that returns a null span. Therefore, this check is here for the OSS case where no APM agent has been provided.

@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 16:57 Inactive
@jdpgrailsdev
Copy link
Contributor Author

The tracing part is fairly manual, I wonder if we should plan for some abstractions to ease it up. Also, part of it is a bit redundant in a way, I am thinking mostly of having the trace annotations then extra calls from the function to add the tags.

@gosusnp Unfortunately, this is just how custom tracing works. The options are either annotation-based or to manually create a Span via the Tracer if you want a smaller scoped area that is traced than a full method. The tag bit cannot be applied via the annotation, so that has to be done manually as well.

Comment on lines +19 to +24
public static final String TAG_FORMAT = "airbyte.%s.%s";

/**
* Standard prefix for tags added to spans.
*/
public static final String TAG_PREFIX = "metadata";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these need to be public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colesnodgrass Yes, as it is a class, not an interface. The classes that use this are not in the same package.

/**
* Collection of constants for APM tracing of Temporal activities and workflows.
*/
public final class TemporalTraceConstants {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants seem to fall into two separate categories; operation name and tag key. I'm wondering if we should separate these into two different classes or maybe two different inner classes within this TemporalTraceConstants.

This would have two benefits, the constants are more closely organized within their specific type (tag key vs operation name) and it would help to reduce the verbosity of the current constants while still keeping them within the same file.

TemporalTraceConstants.CONNECTION_ID_TAG_KEY -> TemporalTraceConstants.TagKey.CONNECTION_ID

and would be referenced by `TagKey.CONNECTION_ID` where used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought, we could flip the naming convention here so it becomes TAG_KEY_* instead of *_TAG_KEY to help logically keep all these similar constants grouped together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colesnodgrass I don't feel strongly about any of this, so happy to re-organize. I am basically trying to avoid string duplication. That being said, they are all being used for tracing/APM purposes. I would tend to lean towards reversing the name order than having a separate class. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colesnodgrass I went against my statement above and created a Tags class within the constants class.

@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 18:09 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 19:59 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 13, 2022 20:25 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 17, 2022 13:10 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 17, 2022 14:40 Inactive
@jdpgrailsdev jdpgrailsdev force-pushed the jonathan/workers-apm-trace branch from 2e85c3e to 3369c49 Compare October 18, 2022 16:00
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 18, 2022 16:02 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 18, 2022 16:17 Inactive
@jdpgrailsdev jdpgrailsdev temporarily deployed to more-secrets October 18, 2022 16:23 Inactive
@jdpgrailsdev jdpgrailsdev merged commit b4c16bf into master Oct 18, 2022
@jdpgrailsdev jdpgrailsdev deleted the jonathan/workers-apm-trace branch October 18, 2022 17:14
letiescanciano added a commit that referenced this pull request Oct 19, 2022
* master: (304 commits)
  Bump helm chart version reference to 0.40.27 (#18152)
  Bump helm chart version reference to 0.40.26 (#18094)
  Update deployment.yaml (#18151)
  Publishes Postgres, MySQL, MSSQL source with changes from #18041 (#18086)
  Fix minor DBT Cloud Errors. (#18147)
  Sentry Integration : Stop reporting all non system-error error types. (#18133)
  Docs: Fix backoff stategy docs (#18143)
  🐛 Destination GCS: Fix error logs to log 'Gcs' rather than 'AWS' (#17901)
  Add openAPI spec for Connector Builder Server (#17535)
  Alex/mvp UI for dbt cloud integration (#18095)
  increased timeout for sat tests (#18128)
  Bmoric/remove dep connector worker (#17977)
  `recordsRead` should be a long (#18123)
  doc_update_oath_issue_gsc (#17967)
  🎉 Source Zendesk Chat: engagements data fix infinity looping + gradlew format (#18121)
  🐛 Source Zendesk Chat: engagements data fix infinity looping (#17745)
  Custom APM Tracing (#17947)
  11679 BigQuery-Denormalized Destination: improve code coverage (#17827)
  increased timeout for sat tests (#18114)
  docs: clarify language (#18090)
  ...
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
* Add APM tracing to airbyte-workers

* Add custom traces to airbyte-cron

* Formatting

* Refactor tag constants

* Handle potential null object

* Handle null tag values
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform issues related to the platform area/worker Related to worker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants