Add timeout to connection.Connect()
If connection.Connect() gets stuck it will block infinitely, never
reporting an error to the library's consumer.

We have received reports of this happening even when a _valid_ address
is passed in node-driver-registrar because of a race condition involving
pod restarts. Specifically, there is an unlucky sequence of events when
restarting both the CSI Driver and node-driver-registrar, where the
node-driver-registrar will attempt to Connect() to the CSI Driver, but
will get stuck doing so on an old file descriptor from the previously
running CSI Driver (and thus, get stuck infinitely).

There is no mechanism to pass a connection timeout to Connect, so this
commit adds a reasonably long default timeout so these cases will
eventually return an error rather than getting stuck infinitely.

Signed-off-by: Connor Catlett <conncatl@amazon.com>
ConnorJC3 committed May 10, 2023
1 parent 597d128 commit 55c9d4c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion connection/connection.go
@@ -70,7 +70,7 @@ func SetMaxGRPCLogLength(characterCount int) {
 // For other connections, the default behavior from gRPC is used and
 // loss of connection is not detected reliably.
 func Connect(address string, metricsManager metrics.CSIMetricsManager, options ...Option) (*grpc.ClientConn, error) {
-	return connect(address, metricsManager, []grpc.DialOption{}, options)
+	return connect(address, metricsManager, []grpc.DialOption{grpc.WithTimeout(time.Second * 30)}, options)
 }

 // Option is the type of all optional parameters for Connect.