Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change default otlp exporter GRPC load balancer to round robin #10319

Merged
merged 9 commits into from
Jul 2, 2024

Conversation

taniyourstruly
Copy link
Contributor

@taniyourstruly taniyourstruly commented Jun 4, 2024

Description

Updates the default pick_first load balancer to round_robin, which allows for better resource allocation and less chances of throttling data being sent to addresses.

Link to tracking issue

Fixes #10298 (has full context of this PR)

Testing

Edited tests to allow round_robin load balancing.

@taniyourstruly taniyourstruly marked this pull request as ready for review June 5, 2024 16:26
@taniyourstruly taniyourstruly requested review from a team and jpkrohling June 5, 2024 16:26
Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As much as I agree with having round_robin as the default, we try to stick with the default values from the underlying libraries.

@jaronoff97
Copy link
Contributor

@jpkrohling can you explain the reasoning behind that for this specific scenario? Per Tani's issue: pick_first is never the right choice:

Also, pick_first does no actual load balancing (link), and instead just tries each address from the name resolver and connects to the one that works.

gRPC can't change their default because of legacy clients they need to support / the edge cases are much larger. Given that the collector serves to send data to an arbitrary backend, and most destinations being sent to would benefit from more evenly balanced load, I'm not sure why we would stick with a choice that balances load poorly. I would argue that today we are not setting a reasonable default for users.

@jpkrohling
Copy link
Member

The base for the reasoning is that we try to keep our config with the value that would cause the least surprise to our users, and that's typically the default we have from the underlying library. Another example that I prefer a different default is for the Min TLS version, which is (IIRC) 1.2 from Go, but we could have it set to 1.3 by default.

That said, I'm open to changing the default value for this, deviating from the underlying library, if we have OKs from maintainers (cc @open-telemetry/collector-maintainers ).

@codeboten
Copy link
Contributor

As much as I agree with having round_robin as the default, we try to stick with the default values from the underlying libraries.

I would be ok with changing the default and documenting the reasoning for it in the components affected by this change. I think it's desirable for end users that the collector provides defaults that make their experience better. As per the research done in #10298, I think this qualifies.

Copy link

codecov bot commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.39%. Comparing base (3996c58) to head (a8ed955).
Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10319      +/-   ##
==========================================
- Coverage   92.55%   92.39%   -0.16%     
==========================================
  Files         387      387              
  Lines       18284    18313      +29     
==========================================
- Hits        16922    16921       -1     
- Misses       1017     1046      +29     
- Partials      345      346       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I wanted to settle which strategy we want to make the default: @tsloughter suggested random instead. Once we settle that, we are ready to change the default value.

.chloggen/load-balancer-round-robin.yaml Outdated Show resolved Hide resolved
.chloggen/load-balancer-round-robin.yaml Outdated Show resolved Hide resolved
config/configgrpc/configgrpc.go Outdated Show resolved Hide resolved
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the configgrpc README with the default and maybe a link to the issue that drove the change

.chloggen/load-balancer-round-robin.yaml Show resolved Hide resolved
taniyourstruly and others added 2 commits June 12, 2024 12:07
Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this change, and for persisting in pursuing what you considered the right thing despite my initial reaction.

@codeboten codeboten merged commit 6ad6b86 into open-telemetry:main Jul 2, 2024
49 checks passed
@github-actions github-actions bot added this to the next release milestone Jul 2, 2024
@taniyourstruly taniyourstruly deleted the config-grpc branch July 2, 2024 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change default otlp exporter GRPC load balancer to round robin
4 participants