Add gRPC MaxConnectionAge & MaxConnectionAgeGrace Options #512
What this PR does / why we need it:
Add gRPC `MaxConnectionAge` & `MaxConnectionAgeGrace` options. This avoids "split-brain" of long-lived gRPC stream connections when performing hash-based load balancing in a Yorkie cluster.
For more information, follow: envoyproxy/envoy#26459
gRPC suggests using `MAX_CONNECTION_AGE` to avoid the long-lived RPC issue with load balancing (https://youtu.be/Naonb2XD_2Q?t=136), but the stream is not actually closed even when `MAX_CONNECTION_AGE` is configured. This is because after `MAX_CONNECTION_AGE` elapses, the server sends GoAway to the client, but GoAway is not a signal to close the connection instantly. It only tells the client not to send additional requests to the server (grpc/grpc-java#8770). Therefore, `MAX_CONNECTION_AGE_GRACE` is also introduced to forcibly close the stream.
So this is how it works:
1. The connection stays open for up to `maxConnectionAge`.
2. After `maxConnectionAge`, a GoAway frame is sent to the client to gracefully close the stream. But the stream is not closed at this point.
3. After a further `MaxConnectionAgeGrace`, the stream is forcibly closed.

Reference: https://pkg.go.dev/google.golang.org/grpc/keepalive
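
The sequence above corresponds to two fields of `keepalive.ServerParameters` in grpc-go. The following is a minimal sketch of wiring them into a server, not the code from this PR: the durations and the listen address are illustrative assumptions, and no services are registered.

```go
package main

import (
	"log"
	"net"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// Illustrative values only (assumption); the PR description does not
	// state which durations Yorkie uses.
	params := keepalive.ServerParameters{
		// After roughly this long, the server sends GoAway to the client.
		// grpc-go applies a random jitter of +/-10% to this value.
		MaxConnectionAge: 50 * time.Second,
		// Additional time after MaxConnectionAge before the connection
		// is forcibly closed, ending any streams still open on it.
		MaxConnectionAgeGrace: 10 * time.Second,
	}

	srv := grpc.NewServer(grpc.KeepaliveParams(params))

	// "localhost:0" asks the OS for any free port (assumption for the sketch).
	lis, err := net.Listen("tcp", "localhost:0")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}
	log.Printf("listening on %s", lis.Addr())

	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```

With only `MaxConnectionAge` set, a long-lived stream may outlive the GoAway, since the grace period defaults to unlimited; setting `MaxConnectionAgeGrace` bounds how long it can survive.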
Hence, `maxConnectionAge` and `maxConnectionAgeGrace` options are added as the gRPC server's keepalive option parameters.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
While researching long-lived connections, graceful close, etc., I found a good explanation of why `close_connections_on_host_set_change` (gracefully draining connections) does not work for long-lived gRPC connections. It is because Envoy sends an HTTP/2 GoAway frame in the draining sequence, but GoAway does not close the connection instantly. Therefore, no connections are closed after draining, and the `close_connections_on_host_set_change` option does not work either.
References: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/draining
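
To make the GoAway behavior concrete, here is a small toy model in Go. It is a simplified sketch, not Envoy or gRPC code: the `conn` type, its states, and its methods are all hypothetical names introduced for illustration. It only shows that a GoAway blocks new streams while existing streams keep running until something forcibly closes the connection.

```go
package main

import "errors"

// connState is a hypothetical three-state model of an HTTP/2 connection.
type connState int

const (
	active   connState = iota // accepts new streams
	draining                  // GoAway sent: no new streams, existing ones continue
	closed                    // connection torn down, all streams ended
)

// errRefused is returned when a stream is opened on a non-active connection.
var errRefused = errors.New("new streams refused")

// conn tracks the connection state and the number of open streams.
type conn struct {
	state   connState
	streams int
}

// OpenStream starts a new stream; it only succeeds while the connection is active.
func (c *conn) OpenStream() error {
	if c.state != active {
		return errRefused
	}
	c.streams++
	return nil
}

// GoAway models graceful draining: it blocks new streams but leaves
// already-open streams untouched.
func (c *conn) GoAway() {
	if c.state == active {
		c.state = draining
	}
}

// ForceClose models a forced close (for example, the end of a grace period):
// every remaining stream is ended.
func (c *conn) ForceClose() {
	c.state = closed
	c.streams = 0
}
```

In this model, calling `GoAway()` on a connection with one open stream leaves `streams` at 1, which mirrors why draining alone does not end a long-lived gRPC stream; only `ForceClose()` brings it to 0.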
Also, I found an interesting issue related to our problem with closing long-lived connections: grpc/grpc#26703.
Maybe it would be useful to introduce `force_close_connections_on_host_set_change` in Envoy: envoyproxy/envoy#26459
Does this PR introduce a user-facing change?:
Additional documentation:
Checklist: