With our use of gRPC Java behind Google Compute Engine (GCE) L3 load balancers (Network Load Balancers), we seem to be hitting issues similar to the ones we had with gRPC in Go:
grpc/grpc-go#536
Basically, Google L3 load balancers silently drop long-lived TCP connections once they have been idle for 600 seconds.
We were able to work around the issue in Go by specifying a custom Dialer:
```go
// WithKeepAliveDialer returns a DialOption that dials with TCP keep-alives enabled.
func WithKeepAliveDialer() grpc.DialOption {
	return grpc.WithDialer(func(addr string, timeout time.Duration) (net.Conn, error) {
		// KeepAlive sets the period between keep-alive probes on the connection.
		d := net.Dialer{Timeout: timeout, KeepAlive: *flagGrpcClientKeepAliveDuration}
		return d.Dial("tcp", addr)
	})
}
```
However, there seems to be no way of overriding the keep-alive period for NettyClientTransport. We know it's possible to set the keep-alive period in the kernel of the machines, but it's a bit of a stretch to expect user-code programmers to know about that.
Can we either:
- have the ability to specify the TCP keep-alive period when creating a channel, or
- get documentation around it, especially on how it can cause hard-to-debug problems on GCE?
cc @ejona86 since he seems to have had opinions about it in #737