Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[connector/servicegraphconnector] Adds a new config option allowing an extra label for virtual node identification #32826

Merged
merged 26 commits into from
Jun 20, 2024

Conversation

t00mas
Copy link
Contributor

@t00mas t00mas commented May 2, 2024

Description:

  • A new configuration option enable_virtual_node_label is added. When enabled, it will add an extra label to the metrics, virtual_node, indicating if the client or the server was the missing span.

Link to tracking Issue: 31889

Testing:

  • Adding tests with github.com/open-telemetry/opentelemetry-collector-contrib/pkg/golden.
  • Testing the addition of virtual_node labels to the metrics, as well as connection_type: virtual_node (as I think it was untested before).
  • Also testing extra dimensions regarding client/server attributes.

Documentation:

  • Added the new configuration option to the README

Note to reviewers

I have a suspicion that connection_type: virtual_node was broken. There was a race condition on store.Expire so the expired edges were completing after the metrics were built. I have added a p.store.Expire() just at the start of flushMetrics to allow these edges to contribute to the metrics. Please advise if you see something wrong with this approach.

@t00mas t00mas requested a review from jpkrohling as a code owner May 2, 2024 15:22
@t00mas t00mas requested a review from a team May 2, 2024 15:22
Copy link

linux-foundation-easycla bot commented May 2, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

Comment on lines +418 to +397
if p.config.VirtualNodeExtraLabel {
dimensions = addExtraLabel(dimensions, virtualNodeLabel, string(e.VirtualNodeLabel))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand we only want to add this label if the edge has a virtual node, right? We should check that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be - it's an easy addition, we probably just change to:

if e.ConnectionType == store.VirtualNode && p.config.VirtualNodeExtraLabel {

however, is it better to not have it, or to have virtual_node: with an empty value when this option is enabled?

I'm leaning towards having the label anyway with an empty value, but not 100% convinced of that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be confusing, but I don't have a strong opinion on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK then, I'll opt for leaving it with an empty value sometimes but always present. We can review this if it's an issue.

@@ -215,6 +216,8 @@ func (p *serviceGraphConnector) metricFlushLoop(flushInterval time.Duration) {
}

func (p *serviceGraphConnector) flushMetrics(ctx context.Context) error {
p.store.Expire()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edges are already expired in a different goroutine, why force expiration here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I saw, but without it (or because it's a different goroutine), the shutdown and metrics collection will happen before the edge expire&complete, so metrics from pending edges are not collected (they won't be added to .reqTotal for example).

Do you know if there's a better way to test this?

PS: This is unrelated to the addition of the new label, I realized when I was adding a test for connection_type: virtual_node.

Copy link
Contributor Author

@t00mas t00mas May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK fixed, I overlooked a minor detail about the metric flush loop, so test has to allow it. No more double-expire here.

@t00mas t00mas force-pushed the extra-labels-for-service-graphs branch from b878b35 to 1ad0509 Compare May 13, 2024 13:11
@t00mas t00mas requested a review from mapno May 13, 2024 13:19
Copy link
Contributor

@mapno mapno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did something change?

@t00mas
Copy link
Contributor Author

t00mas commented May 14, 2024

Did something change?

Nope, just fixed a test signature after a rebase.

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than using latest versions for things from contrib, this LGTM.

@@ -3,6 +3,8 @@ module github.com/open-telemetry/opentelemetry-collector-contrib/connector/servi
go 1.21.0

require (
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/golden v0.99.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry about this, but this needs a bump

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, updated

@github-actions github-actions bot added the cmd/otelcontribcol otelcontribcol command label May 16, 2024
@jpkrohling jpkrohling force-pushed the extra-labels-for-service-graphs branch from 4c76439 to 58520f9 Compare May 21, 2024 19:38
@jpkrohling jpkrohling force-pushed the extra-labels-for-service-graphs branch from 58520f9 to cff100e Compare May 21, 2024 19:40
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
@jpkrohling jpkrohling merged commit e362c5d into open-telemetry:main Jun 20, 2024
154 checks passed
@github-actions github-actions bot added this to the next release milestone Jun 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants