Add KeepAlive support #1648

@mwitkow

With our use of gRPC Java behind Google Compute Engine (GCE) L3 load balancers (Network Load Balancers), we seem to be hitting issues similar to the ones we had with gRPC in Go:
grpc/grpc-go#536

Basically, Google L3 load balancers silently drop long-lived TCP connections after 600 seconds.

In Go, we were able to work around the issue by specifying a custom Dialer:

func WithKeepAliveDialer() grpc.DialOption {
    return grpc.WithDialer(func(addr string, timeout time.Duration) (net.Conn, error) {
        d := net.Dialer{Timeout: timeout, KeepAlive: *flagGrpcClientKeepAliveDuration}
        return d.Dial("tcp", addr)
    })
}

In Java, however, there seems to be no way of overriding the keepalive period for NettyClientTransport. We know it's possible to set the keepalive period in the kernel of the machines, but it's a bit of a stretch to expect application programmers to know about that.

Can we either:

  • have the ability to specify the TCP keepalive period when creating a channel, or
  • have documentation around it, especially how it can cause hard-to-debug problems on GCE?

cc @ejona86 since he seems to have had opinions about it in #737
