
[exporter/loadbalancing] Documents a workaround for load balancing on other attributes #37494

Conversation

swar8080
Contributor

Description

When using the load balancing exporter to produce span metrics, we wanted to load balance on an attribute other than service.name, because service.name alone doesn't spread our load very well. Figured it might be useful to document the workaround for others.

Looks like #33660 is tracking the enhancement that would make this workaround unnecessary.
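For illustration, here's a rough sketch of the kind of config involved. This is not the PR's actual example: the routing attribute (the hypothetical k8s.pod.name), the resolver hostname, and the idea of copying that attribute into service.name so that routing_key: service hashes on it are all assumptions.

```yaml
# First-layer collector (sketch, not this PR's actual example).
# Assumption: the attribute we want to balance on (k8s.pod.name here) is
# copied into service.name, so the loadbalancing exporter's
# routing_key: service effectively balances on it.
receivers:
  otlp:
    protocols:
      grpc:

processors:
  transform:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          # Overwrites service.name; copy the original value to another
          # attribute first if it is still needed downstream.
          - set(attributes["service.name"], attributes["k8s.pod.name"])

exporters:
  loadbalancing:
    routing_key: service
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: second-layer-collectors.example.internal  # placeholder

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform]
      exporters: [loadbalancing]
```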

Testing

We use a similar set-up in production. We also used www.otelbin.io to validate the example configuration.

@jpkrohling
Member

Can't you load-balance based on the streamID?

@swar8080
Contributor Author

swar8080 commented Feb 8, 2025

Can't you load-balance based on the streamID?

Hi @jpkrohling, our use case is generating cumulative-temporality span metrics. We use the load balancing exporter to avoid violating the single-writer principle, since it gets all of a service/pod's spans onto a single collector before we generate its span metrics.

I wasn't actually aware of streamID, but since it's only available for metrics, maybe something like this could be worth documenting instead:

  1. First layer of collectors are randomly assigned spans and generate delta temporality span metrics
  2. Load balancing of span metrics is done by streamID
  3. Second layer of collectors uses deltatocumulativeprocessor

Sounds like that alternative would also spread the load more evenly. A rough sketch of that topology is below.
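For illustration only (not part of this PR, and see the follow-up comment below for why the deltatocumulative step turns out to be problematic), the first layer might look roughly like this; the second layer would simply run the deltatocumulative processor in an ordinary metrics pipeline. The resolver hostname is a placeholder.

```yaml
# First-layer collector (sketch): spans arrive with no particular affinity,
# delta span metrics are generated locally, and the resulting metric
# streams are load balanced by streamID.
receivers:
  otlp:
    protocols:
      grpc:

connectors:
  spanmetrics:
    aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA

exporters:
  loadbalancing:
    routing_key: streamID
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: second-layer-collectors.example.internal  # placeholder

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [loadbalancing]
```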

@swar8080
Contributor Author

swar8080 commented Feb 9, 2025

Actually, deltatocumulativeprocessor would cause problems, since the timestamps of the delta span metrics would be out of order.

So I think span metric generation has to happen after load balancing a resource's spans to the same collector.

I could also add this example config to the spanmetricsconnector README if you think that's a better fit. This took us a couple of iterations to get right when we were less familiar with metric temporality, so I think documenting it somewhere would help others trying to scale their span metric setups.
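For readers following along, a rough sketch of what the second layer ends up looking like under that approach: spans are already routed per resource by the first layer's loadbalancing exporter, so cumulative span metrics can be generated locally. This is illustrative rather than the PR's actual example; the exporter and its endpoint are placeholders.

```yaml
# Second-layer collector (sketch): all spans for a given resource land here,
# so cumulative span metrics can be generated without violating the
# single-writer principle.
receivers:
  otlp:
    protocols:
      grpc:

connectors:
  spanmetrics:
    aggregation_temporality: AGGREGATION_TEMPORALITY_CUMULATIVE

exporters:
  prometheusremotewrite:
    endpoint: https://metrics.example.internal/api/v1/write  # placeholder

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Add a trace backend exporter here as well if spans should be kept.
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheusremotewrite]
```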

@jpkrohling
Member

This took us a couple iterations to get right when we were less familiar with metric temporality

If you could write the docs in a way that would have made your lives easier, that'd be wonderful!


github-actions bot commented Mar 1, 2025

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added the Stale label on Mar 1, 2025
@atoulme
Contributor

atoulme commented Mar 16, 2025

Given that #33660 is now resolved, this workaround is likely no longer required. I am going to close this PR. Please reopen if I am incorrect and more work is required here.

atoulme closed this on Mar 16, 2025