better RTT and lower cloud costs by preferencing lower latency members #15918
Comments
Alright, so I forked etcd and switched over the default client lb config to use This works, and the ordering specified in So yeah, I feel a new lb configuration is required and we can't rely on pick first + configuration changes to ordering of members.
I would prefer to delegate the load balancing algorithm implementation to grpc, as we are already planning to migrate to it #15145
Ah nice, are you referring to this? https://github.com/grpc/grpc/blob/master/doc/grpc_xds_features.md
No, I was just saying that in the best case the etcd project should not implement load balancing, just provide a sane default and allow users to configure grpc themselves (pass flags via options).
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
What would you like to be added?
Problem statement
etcd clients (such as kube-apiserver) will use round robin to select a member to connect to:
etcd/client/v3/internal/resolver/resolver.go
Lines 42 to 44 in 7e161d5
This load balancing configuration has a downside in an HA cloud environment where cross-zone traffic is metered: the apiserver may connect to a member in another zone, which in turn replicates to members in other zones.
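For readers unfamiliar with how a gRPC client is told which policy to use, here is a minimal sketch that builds a gRPC service config document naming a load balancing policy. It assumes the policy is chosen through the service config's `loadBalancingConfig` field; the helper name `serviceConfig` is hypothetical and is not taken from the etcd code base.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serviceConfig (hypothetical helper) builds a gRPC service config JSON
// document that selects a single load balancing policy by name,
// for example "round_robin" or "pick_first".
func serviceConfig(policy string) (string, error) {
	cfg := map[string]any{
		"loadBalancingConfig": []map[string]any{
			{policy: map[string]any{}}, // policy with an empty config object
		},
	}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Print the config for the two policies discussed in this issue.
	for _, p := range []string{"round_robin", "pick_first"} {
		s, err := serviceConfig(p)
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```

Neither of these two policies takes zone or latency into account on its own, which is the gap this issue is about.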
The other load balancing configurations are not suitable either. For example, pick-first connects to each member serially and, as the name implies, picks the first one that connects. This would require ordering the endpoints, which could be done if the order of the `--etcd-servers` flag in `kube-apiserver` were retained (will need to test).
Why is this needed?
I believe a new load balancing configuration which prioritises members with lowest latency is a sensible default option for etcd.
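To make the idea concrete, here is a rough sketch of what "prioritise the lowest latency member" could look like on the client side. It is not an etcd or gRPC API: `prober`, `tcpDialProbe`, `fakeProbe`, and `orderByLatency` are hypothetical names, the endpoints are made up, and timing a TCP handshake is only an approximation of RTT. The resulting order could then be fed to something like pick-first.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sort"
	"time"
)

// prober measures how long it takes to reach one endpoint.
type prober func(endpoint string) (time.Duration, error)

// tcpDialProbe approximates RTT by timing a TCP handshake to the endpoint.
func tcpDialProbe(timeout time.Duration) prober {
	return func(endpoint string) (time.Duration, error) {
		start := time.Now()
		conn, err := net.DialTimeout("tcp", endpoint, timeout)
		if err != nil {
			return 0, err
		}
		defer conn.Close()
		return time.Since(start), nil
	}
}

// fakeProbe returns fixed latencies in milliseconds, for demonstration only.
// Endpoints missing from the map are reported as unreachable.
func fakeProbe(ms map[string]int) prober {
	return func(endpoint string) (time.Duration, error) {
		v, ok := ms[endpoint]
		if !ok {
			return 0, errors.New("unreachable: " + endpoint)
		}
		return time.Duration(v) * time.Millisecond, nil
	}
}

// orderByLatency probes every endpoint once and returns the reachable ones,
// fastest first. Unreachable endpoints are dropped.
func orderByLatency(endpoints []string, probe prober) []string {
	type sample struct {
		endpoint string
		rtt      time.Duration
	}
	samples := make([]sample, 0, len(endpoints))
	for _, ep := range endpoints {
		rtt, err := probe(ep)
		if err != nil {
			continue
		}
		samples = append(samples, sample{ep, rtt})
	}
	sort.SliceStable(samples, func(i, j int) bool {
		return samples[i].rtt < samples[j].rtt
	})
	out := make([]string, len(samples))
	for i, s := range samples {
		out[i] = s.endpoint
	}
	return out
}

func main() {
	// Hypothetical endpoints, one per zone; zone-c is treated as unreachable.
	endpoints := []string{"zone-a:2379", "zone-b:2379", "zone-c:2379"}
	probe := fakeProbe(map[string]int{"zone-a:2379": 12, "zone-b:2379": 1})
	fmt.Println(orderByLatency(endpoints, probe))
}
```

Swapping `fakeProbe(...)` for `tcpDialProbe(500*time.Millisecond)` would measure real endpoints. A single probe is noisy, so a real implementation would likely need repeated sampling and re-evaluation over time.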
This screenshot shows a graph of RTT between each zone as I trialed the pick-first configuration. You can see that the 9 lines (one for each relationship between zones) drop to 3 (as there are only 3 zones and traffic is no longer leaving the zone), plus the large reduction in RTT from the apiserver to etcd.
apiservers in a 3 member setup (1 member and 1 client in each zone)
Note this is just some rough napkin maths and I could be way off; regardless, I believe this feature would be beneficial.
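As a sanity check on the 9 and 3 zone relationships mentioned above, here is a small sketch of the counting. It is not the original calculation from this issue. It assumes one member and one client per zone, and that without zone affinity each request is equally likely to land on any member; the function names are hypothetical.

```go
package main

import "fmt"

// zonePairs counts the directed (client zone, member zone) relationships that
// carry traffic. With zone-local selection each client only talks to the
// member in its own zone; otherwise every client may talk to every member.
func zonePairs(zones int, zoneLocal bool) int {
	if zoneLocal {
		return zones
	}
	return zones * zones
}

// crossZoneFraction is the share of client requests that leave the client's
// zone, assuming one member per zone and uniform selection across members.
func crossZoneFraction(zones int) float64 {
	if zones <= 0 {
		return 0
	}
	return float64(zones-1) / float64(zones)
}

func main() {
	const zones = 3
	fmt.Printf("relationships without zone affinity: %d\n", zonePairs(zones, false))
	fmt.Printf("relationships with zone-local selection: %d\n", zonePairs(zones, true))
	fmt.Printf("share of client requests crossing zones: %.2f\n", crossZoneFraction(zones))
}
```

This only covers client-to-member traffic. Replication between members still crosses zones regardless of which member the client picks.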