[exporter/signalfx] Fix correlation timeout bug #9101

Merged 9 commits on Apr 12, 2022

@crobert-1 crobert-1 commented Apr 6, 2022

Description:
The SignalFx Exporter supports correlating metrics with traces: when a trace arrives from a given host, its dimensions (such as service and environment) can be used to query the metrics associated with that trace. These correlations are supposed to time out once stale_service_timeout has elapsed, but they never did. This fix makes sure correlations time out, so metrics are no longer linked to stale trace data.
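
To illustrate the intended behavior, here is a minimal sketch of timeout-based expiry. It is not the exporter's actual code: `correlationTracker`, `observe`, `purgeStale`, `active`, `exampleTimeout`, and `exampleStart` are hypothetical names, and the `timeout` field only plays the role of stale_service_timeout.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// Hypothetical values used only by this example.
const exampleTimeout = 5 * time.Minute

var exampleStart = time.Unix(0, 0)

// correlationTracker is a hypothetical stand-in for the exporter's internal
// bookkeeping; it is NOT the real SignalFx exporter type.
type correlationTracker struct {
	mu       sync.Mutex
	timeout  time.Duration        // plays the role of stale_service_timeout
	lastSeen map[string]time.Time // service name -> time of most recent trace
}

func newCorrelationTracker(timeout time.Duration) *correlationTracker {
	return &correlationTracker{timeout: timeout, lastSeen: map[string]time.Time{}}
}

// observe records that a trace for the given service was seen at now.
func (c *correlationTracker) observe(service string, now time.Time) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.lastSeen[service] = now
}

// purgeStale drops every service whose most recent trace is older than the
// timeout and returns the removed names in sorted order.
func (c *correlationTracker) purgeStale(now time.Time) []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	var removed []string
	for service, seen := range c.lastSeen {
		if now.Sub(seen) > c.timeout {
			delete(c.lastSeen, service)
			removed = append(removed, service)
		}
	}
	sort.Strings(removed)
	return removed
}

// active reports whether the service still has a live correlation.
func (c *correlationTracker) active(service string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.lastSeen[service]
	return ok
}

func main() {
	tracker := newCorrelationTracker(exampleTimeout)
	tracker.observe("checkout", exampleStart)
	tracker.observe("cart", exampleStart.Add(4*time.Minute))

	// Six minutes in, "checkout" is stale (6m > 5m) but "cart" is not (2m).
	removed := tracker.purgeStale(exampleStart.Add(6 * time.Minute))
	fmt.Println(removed, tracker.active("cart")) // prints "[checkout] true"
}
```

Passing `now` in explicitly, rather than calling `time.Now()` inside the tracker, is a choice made for this sketch so the expiry logic can be exercised deterministically.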

Testing:
Manual testing shows the fix working as expected, but as far as I can tell it is currently not possible to write automated tests for this functionality. Such tests would need to create both metrics and traces and send them to the SignalFx backend, and at that level the correlation values could not be checked properly because they are not accessible outside the local package.

@crobert-1 crobert-1 requested a review from a team April 6, 2022 18:26

@pmcollins pmcollins left a comment

LGTM, just a couple of nits. Thanks for fixing this!

Review comment on internal/coreinternal/timeutils/ticker_helper.go (outdated, resolved)
@crobert-1

Failing tests are unrelated to my changes

@crobert-1 crobert-1 changed the title from "Fix SignalFx Exporter correlation timeout bug" to "[exporter/signalfx] Fix correlation timeout bug" Apr 8, 2022
The SignalFx Exporter supports correlating metrics with
traces: when a trace arrives from a given host, its
dimensions (such as service and environment) can be used to
query the metrics associated with that trace. These
correlations are supposed to time out once
stale_service_timeout has elapsed, but they never did. This
fix makes sure correlations time out, so metrics are no
longer linked to stale trace data.
- Add public type comment
- Move implementing checker right under struct declaration
- Add spaces between functions
- Fix failing PR checks: a delay in the tests failed in
automation; add license block in test file.
GitHub automation was failing the unit tests because a data
race was detected. The solution was to add a mutex around
checking and setting the variable in question.
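
For readers unfamiliar with this kind of fix, the following is a rough sketch of the pattern, not the code from this PR. `tickCounter`, `runTicker`, `waitForTicks`, and `exampleInterval` are hypothetical names. A ticker goroutine writes a value while another goroutine reads it, and a `sync.Mutex` guards both sides so that `go test -race` has nothing to report.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// exampleInterval is a hypothetical tick period used only by this example.
const exampleInterval = time.Millisecond

// tickCounter is a hypothetical shared variable: one goroutine sets it,
// another checks it, so both accesses go through the mutex.
type tickCounter struct {
	mu    sync.Mutex
	ticks int
}

// inc is the "setting" side, called from the ticker goroutine.
func (t *tickCounter) inc() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.ticks++
}

// get is the "checking" side, called from the test or main goroutine.
func (t *tickCounter) get() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.ticks
}

// runTicker calls onTick every interval until stop is closed. The returned
// channel is closed once the goroutine has exited.
func runTicker(interval time.Duration, stop <-chan struct{}, onTick func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				onTick()
			case <-stop:
				return
			}
		}
	}()
	return done
}

// waitForTicks polls until at least n ticks have been recorded.
func waitForTicks(c *tickCounter, n int) {
	for c.get() < n {
		time.Sleep(exampleInterval)
	}
}

func main() {
	c := &tickCounter{}
	stop := make(chan struct{})
	done := runTicker(exampleInterval, stop, c.inc)

	waitForTicks(c, 3) // reads race with writes unless both hold the mutex
	close(stop)
	<-done // the goroutine has exited, so no further writes can happen

	fmt.Println("saw at least 3 ticks:", c.get() >= 3) // prints "saw at least 3 ticks: true"
}
```

Without the mutex, the unsynchronized read in `get` and write in `inc` would be a data race that the Go race detector flags, even when the test appears to pass locally.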
@crobert-1 crobert-1 force-pushed the sfx_export_correlate_fix branch from 6e7f664 to 4016993 Compare April 11, 2022 18:06

@dmitryax dmitryax left a comment

LGTM

@dmitryax dmitryax added the ready to merge Code review completed; ready to merge by maintainers label Apr 11, 2022
5 participants