
Load balancing pool for HTTP/1.1 and HTTP/2 #3505

Open
jrudolph opened this issue Oct 7, 2020 · 6 comments
Assignees: jrudolph
Labels: 3 - in progress (Someone is working on this ticket), t:client (Issues related to the HTTP Client), t:core (Issues related to the akka-http-core module)

Comments

@jrudolph
Member

jrudolph commented Oct 7, 2020

Background Information

Load balancing clients are a de-facto standard in data center environments. Compared to load balancers for external traffic, which often enters the infrastructure through a central point, client-side load balancing inside the data center operates under different preconditions and offers different advantages:

  • clients have access to internal infrastructure information (number of backend servers, addresses)
  • clients have low latency to backend servers
  • clients and servers can make more assumptions about each other

One main property is that load balancing logic is distributed to the clients (with all the advantages and disadvantages it brings).

(In managed environments, the load balancing logic might be implemented inside a service mesh, in which case a client will not need its own balancing logic.)

Some references:

Akka HTTP Implementation Ideas

One main question is where to put load balancing logic in the Akka HTTP client implementation stack:

  • it could be part of the pool (for HTTP/1.1: different slots connect to different endpoints and work is distributed by assigning requests to slots; for HTTP/2: ?)
  • it could be a component on top of the pool (one pool for each backend with a component on top for routing requests)

I currently favor the second option, mainly because of separation of concerns: the pool handles the connection lifecycle and slot management, while the load balancing component handles the distribution of work (a rough sketch follows after the open questions below). Open questions:

  • One potential challenge might be that, since lifecycle management is abstracted away, the load balancing component might be missing information it needs for distributing work (e.g. when a backend goes away and all connections break, the load balancer might only notice later).
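
As a rough illustration of the second option, here is a minimal sketch of a balancing layer that sits on top of the existing pools. It assumes a static endpoint list and plain round-robin; `RoundRobinClient` is a hypothetical name, not an existing Akka HTTP API:

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.concurrent.Future
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{ HttpRequest, HttpResponse }

// Hypothetical sketch: the host connection pools keep handling connection
// lifecycle and slot management; this layer only decides which backend a
// request goes to (here: plain round-robin over a static endpoint list).
final class RoundRobinClient(endpoints: Vector[(String, Int)])(implicit system: ActorSystem) {
  private val counter = new AtomicLong(0L)

  def singleRequest(request: HttpRequest): Future[HttpResponse] = {
    val (host, port) = endpoints((counter.getAndIncrement() % endpoints.size).toInt)
    // Rewriting the authority lets Http().singleRequest route the request
    // through the cached host connection pool for the chosen backend.
    Http().singleRequest(request.withUri(request.uri.withHost(host).withPort(port)))
  }
}
```

A real implementation would also need to react to pool-level failures, which is exactly the open question above about lifecycle information being abstracted away.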
@jrudolph jrudolph added 1 - triaged Tickets that are safe to pick up for contributing in terms of likeliness of being accepted t:client Issues related to the HTTP Client t:core Issues related to the akka-http-core module labels Oct 7, 2020
@Marcus-Rosti

@jrudolph I've seen this issue on scale-ups: my service will still only send requests to the original set. The client calls 5 backend servers, the service comes under more load and scales to 6, but the original 5 are the only ones that get traffic.

When I 'kill' one of the services, it rebalances across all the nodes.

@Marcus-Rosti

This is in akka-grpc ^^^ (re: @raboof)

@jrudolph
Member Author

> @jrudolph I've seen this issue on scale-ups: my service will still only send requests to the original set. The client calls 5 backend servers, the service comes under more load and scales to 6, but the original 5 are the only ones that get traffic.
>
> When I 'kill' one of the services, it rebalances across all the nodes.

Interesting. In a way, the behavior makes sense: you don't want to query the set of backend servers for every request, so you need some kind of trigger for re-querying what the current set of backend servers is. You could do it regularly or wait for some event like a server going down.

So far akka-grpc uses the grpc-java client, so that's something we can only fix here once we have an akka-http client backend for akka-grpc.
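
For the "query regularly" variant mentioned above, a minimal sketch of a periodic refresh trigger could look like the following. It assumes akka-discovery is available and configured with a discovery method; `PeriodicResolution` and the logging call are purely illustrative:

```scala
import scala.concurrent.duration._
import akka.actor.ActorSystem
import akka.discovery.{ Discovery, Lookup }

// Hypothetical sketch: refresh the backend set on a fixed schedule instead of
// waiting for a broken connection to trigger re-resolution.
object PeriodicResolution {
  def start(serviceName: String)(implicit system: ActorSystem): Unit = {
    import system.dispatcher
    val discovery = Discovery(system).discovery
    system.scheduler.scheduleWithFixedDelay(0.seconds, 30.seconds) { () =>
      discovery.lookup(Lookup(serviceName), 3.seconds).foreach { resolved =>
        // Hand the fresh address list to the balancing component here.
        system.log.info("Resolved {} endpoints for {}", resolved.addresses.size, serviceName)
      }
    }
  }
}
```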

@Marcus-Rosti

The way we've solved it is to pass in a name resolver (https://grpc.github.io/grpc-java/javadoc/io/grpc/NameResolver.html) that continually polls the Kubernetes API for the IPs of pods that come up or down and replaces them reactively. But like you said, that backend service is NOT designed for that.

I was trying to figure out a way to use the akka-management Kubernetes module to do it, but it has the same problem: it only queries when a pod goes away. The other idea I had was calling https://github.com/akka/akka-grpc/blob/88252782b64809d3d44d7510f2e648c13aa5aa96/runtime/src/main/scala/akka/grpc/internal/AkkaDiscoveryNameResolver.scala#L36, but as far as I can tell this isn't used anywhere that I can interact with it.

Anyway, I don't know what solution works best in a library-management sense, but it's something I've been thinking about.

@ignasi35
Member

> it could be a component on top of the pool (one pool for each backend with a component on top for routing requests)

+1 to using this approach.

Supporting client-side load balancing opens the door to a huge list of requirements and customizations. A pluggable layer on top of the pool(s) that allows users to implement routing logic sounds like the best way forward.

Take, for example, the options introduced in gRPC, where clients may (1) consume load reports from the server (on a side channel), or even (2) defer the routing decisions to an external component (aka lookaside LB). I am not saying we should implement any of this, just that we should provide the infrastructure for people to support them.
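
To make the idea concrete, the pluggable layer could be as small as a single routing abstraction that different strategies implement; the names below (`EndpointPicker`, `StickySessionPicker`, the `x-session-id` header) are made up for illustration:

```scala
import akka.http.scaladsl.model.HttpRequest

// Hypothetical shape of the pluggable routing layer: the pools handle
// connections, user-provided logic only picks a backend per request.
final case class Endpoint(host: String, port: Int)

trait EndpointPicker {
  def pick(request: HttpRequest, endpoints: Vector[Endpoint]): Endpoint
}

// Example strategy: pin requests to a backend by hashing a session header.
// Load-report-based or lookaside-LB strategies would implement the same trait.
object StickySessionPicker extends EndpointPicker {
  def pick(request: HttpRequest, endpoints: Vector[Endpoint]): Endpoint = {
    val key = request.headers.find(_.is("x-session-id")).map(_.value).getOrElse("")
    endpoints((key.hashCode & Int.MaxValue) % endpoints.size)
  }
}
```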

@ignasi35
Member

(continuing..., should have been a single comment)
Another consideration is whether such a component should bring circuit-breaking out of the box or not.

There are two options:

  • add a circuit breaker per remote server
  • protect the whole remote service behind a single circuit breaker

The options above are not either-or, though. In any case, if we were to add such a feature to the new client-side component, I think it should be part of a reference implementation and not the pluggable layer (a rough sketch of the per-server variant follows below).
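
For the first option, a hedged sketch of per-server circuit breaking on top of Akka's existing `akka.pattern.CircuitBreaker` could look like this (the class name and the breaker settings are illustrative, not a proposed API):

```scala
import scala.concurrent.Future
import scala.concurrent.duration._
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{ HttpRequest, HttpResponse }
import akka.pattern.CircuitBreaker

// Hypothetical sketch of "a circuit breaker per remote server": each backend
// gets its own breaker, so one failing server does not open the breaker
// for the whole remote service.
final class PerServerBreakers(endpoints: Vector[(String, Int)])(implicit system: ActorSystem) {
  private val breakers: Map[(String, Int), CircuitBreaker] =
    endpoints.map { ep =>
      ep -> CircuitBreaker(system.scheduler, maxFailures = 5, callTimeout = 5.seconds, resetTimeout = 30.seconds)
    }.toMap

  def request(endpoint: (String, Int), req: HttpRequest): Future[HttpResponse] = {
    val (host, port) = endpoint
    breakers(endpoint).withCircuitBreaker(
      Http().singleRequest(req.withUri(req.uri.withHost(host).withPort(port)))
    )
  }
}
```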

Summing up, I think we should have a component that, given a request, passes it to the appropriate pool following some externally-plugged logic (sticky session, load balancing, ...). Separately, we should provide a single implementation, or some basic implementations, for the initial use cases.

@jrudolph jrudolph self-assigned this Nov 3, 2020
@jrudolph jrudolph added 3 - in progress Someone is working on this ticket and removed 1 - triaged Tickets that are safe to pick up for contributing in terms of likeliness of being accepted labels Nov 3, 2020
@raboof raboof mentioned this issue Nov 18, 2020