Upstream Connection Rebalancing #176

Open
andydunstall opened this issue Nov 12, 2024 · 1 comment

Comments

@andydunstall
Owner

Say you have a cluster with 3 nodes, where each node has 1000 upstream connections. If those nodes become overloaded, you may increase the number of replicas (either manually or with autoscaling).

Currently, if you add 3 more nodes, you'll end up with 3 nodes having 1000 upstream connections and 3 nodes with 0 upstream connections. Therefore Piko should rebalance upstream connections.

As Piko is designed to be hosted behind a load balancer, if a node drops the connection to an upstream service, that service will reconnect to a random node. Therefore, when nodes find they have far more connections than the cluster average, they can gradually shed connections to upstreams, which will then reconnect to a random node, rebalancing the cluster.

In the above example, the average number of connections across the 6 nodes is 500, but the first three nodes have 1000 connections each. The threshold and shed rate can be configurable, such as shedding if a node has 20% more connections than the cluster average, and shedding 0.5% of connections every second.
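As a rough sketch of the shedding decision described above (not Piko's implementation), the function below works out how many connections a node might drop each second. The function name, its parameters, and the idea that each node knows every other node's connection count are assumptions for illustration. It also assumes the shed rate is a fraction of the node's own connections.

```python
import math


def connections_to_shed(local_conns, cluster_conns, threshold=0.2, shed_rate=0.005):
    """Return how many upstream connections this node should drop in the next second.

    cluster_conns lists the connection count of every node, including this one.
    """
    if not cluster_conns:
        return 0
    average = sum(cluster_conns) / len(cluster_conns)
    # Only shed once this node exceeds the cluster average by the threshold.
    limit = average * (1 + threshold)
    if local_conns <= limit:
        return 0
    # Assumption: the shed rate is a fraction of this node's own connections.
    per_second = math.ceil(local_conns * shed_rate)
    # Never shed past the limit, so the node stops once it is back in range.
    excess = local_conns - math.floor(limit)
    return min(per_second, excess)


cluster = [1000, 1000, 1000, 0, 0, 0]
print(connections_to_shed(1000, cluster))  # 5
print(connections_to_shed(0, cluster))  # 0
```

With the 6 node example, the average is 500 and the limit is 600, so each of the original nodes would drop 5 connections per second until it falls to 600, while the new nodes drop none.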

@andydunstall
Owner Author

andydunstall commented Dec 23, 2024

So this can include server configuration:

  • --rebalance.threshold: Threshold to start rebalancing, where each node will rebalance if its number of upstream connections exceeds the cluster average by the threshold (probably with a default of 0 to disable rebalancing)
    • This should also avoid shedding if a node has very few connections (such as if there are only 5 connections in the cluster, don't rebalance)
  • --rebalance.shed-rate: Rate at which each node should shed connections if it exceeds the threshold (default 0.005 meaning shed 0.5% of connections per second?)
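To make the proposed semantics concrete, here is a minimal sketch of how the two flags could map onto a config object. It is written in Python purely for illustration and is not Piko's code. `RebalanceConfig`, `parse_config`, and the `min_conns` guard (with its value of 100) are hypothetical; the issue only names `--rebalance.threshold` and `--rebalance.shed-rate`.

```python
import argparse
from dataclasses import dataclass


@dataclass
class RebalanceConfig:
    threshold: float = 0.0  # 0 disables rebalancing
    shed_rate: float = 0.005  # shed 0.5% of connections per second
    min_conns: int = 100  # hypothetical guard; not a flag named in the issue

    def should_shed(self, local_conns: int, cluster_average: float) -> bool:
        # A threshold of 0 means rebalancing is turned off.
        if self.threshold <= 0:
            return False
        # Skip nodes with very few connections to avoid pointless churn.
        if local_conns < self.min_conns:
            return False
        return local_conns > cluster_average * (1 + self.threshold)


def parse_config(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--rebalance.threshold", dest="threshold",
                        type=float, default=0.0)
    parser.add_argument("--rebalance.shed-rate", dest="shed_rate",
                        type=float, default=0.005)
    args = parser.parse_args(argv)
    return RebalanceConfig(threshold=args.threshold, shed_rate=args.shed_rate)


config = parse_config(["--rebalance.threshold", "0.2"])
print(config.should_shed(1000, 500))  # True
print(config.should_shed(5, 1))  # False
```

Whether the minimum should apply per node or to the cluster total is left open here; the sketch checks the node's own count only because that is the simplest option.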
