[vtadmin] Update vtctld dialer to validate connectivity #9915
Conversation
// Even if the client connection does not shut down cleanly, we don't want to block
// Dial from discovering a new vtctld. This makes VTAdmin's dialer more resilient,
// but, as a caveat, it _can_ potentially leak improperly-closed gRPC connections.
log.Errorf("error closing possibly-stale connection before re-dialing: %v", err)
I added more context about the leaked connections thing in the PR description.
looking good, a couple things to address (you'll also need to add labels to the PR)
go/vt/vtadmin/vtctldclient/proxy.go
Outdated
	}
}

log.Infof("Discovering vtctld to dial...\n")
Do we want to keep these, or were they just to help test/debug?
I also added a bunch more logging in the Dial function... I've found the added verbosity to be useful, but let me know if I took it too far (lol).
Up to you!
I removed most of the additional log statements.
It looks like there was an actual regression in a couple of the unit tests. Will fix.
Two things to fix the regressions:
All tests are passing now. @ajm188 no rush, but would you mind taking another quick look at 0ae0c1a...afe15d9 before this is merged? I'd like to confirm there aren't any weird gotchas with using a `var` for the default connectivity timeout.

As an aside + mostly a note to self, I noticed a bunch of gRPC reconnect logs in the failed test output that I thought may be related to (or worsened by) this change, but it happens on the main branch too:
I think this is another example of the thing I mentioned in the PR description, given gRPC's internal retry mechanism + our current dial options... I can take a stab at fixing that over the next few weeks in a separate branch by making that dial option configurable. (Open to other suggestions, of course.)
go/vt/vtadmin/vtctldclient/config.go
Outdated
}

var defaultConnectivityTimeout = 2 * time.Second
For your question around potential gotchas, this should be fine. You can also make this a `const`, since I can't think of a reason we would ever modify the default at runtime.
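Spelled out, the suggested tweak is just the following (illustrative, not a patch from this PR):

```go
// 2 * time.Second is a constant expression, so this compiles fine as a const
// and the default can no longer be reassigned at runtime.
const defaultConnectivityTimeout = 2 * time.Second
```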
Sure, I can change that. What's another 4 hours of CI runs between friends?
	return listener, server, err
}

func TestDial(t *testing.T) {
what you're seeing running just this test in isolation is super interesting, do you mind filing an issue and i can dig into it? i'm not really sure what's going on, but it could be a "macs suck at local networking" issue, or something not completely correct in the code, but it's hard to say (and i don't think we should block this PR, which I still maintain is strictly an improvement over the current, uh, Situation)
Done: #9943
Thanks for taking a look. I'm super interested in what you find! I spent way too long looking at gRPC internals for this PR 😭
Description
This fixes #9422: prior to this fix, vtadmin-api will hang on to a cached gRPC connection to a vtctld even after the gRPC channel is shut down, and any subsequent vtadmin-api request that queries a vtctld (e.g., `/api/schemas`) will fail.

This change updates VTAdmin's vtctld proxy to be "self healing" when its gRPC connection is lost. The `Dial` function, which is called prior to any vtctld request, will now wait until its cached connection is `READY`. If this check fails (for example: the connection is in a `SHUTDOWN` state as mentioned above, or we exceed `grpc-connectivity-timeout`'s worth of waiting), then the proxy will discover a different vtctld in that cluster and attempt to establish (and cache) a new gRPC connection.

I also added a bunch more logging in the `Dial` function... I've found the added verbosity to be useful, but let me know if I took it too far (lol).

FWIW, we've had this branch running across Slack's Vitess deployments for the past few months and it's worked well. And thank you @ajm188 for writing most of this with me back in... uh... last July. :')
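Roughly, the "wait until READY" idea looks something like the sketch below, using grpc-go's connectivity API. This is a simplified illustration rather than the exact code in `proxy.go`; the `waitForReady` helper name and the timeout handling are assumptions.

```go
package vtctldclient

import (
	"context"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/connectivity"
)

// waitForReady blocks until the cached connection is READY, the channel reaches
// SHUTDOWN, or the timeout (e.g. grpc-connectivity-timeout) elapses. It returns
// true only if the connection became READY.
func waitForReady(conn *grpc.ClientConn, timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	for {
		switch state := conn.GetState(); state {
		case connectivity.Ready:
			return true
		case connectivity.Shutdown:
			// A SHUTDOWN channel never recovers; the caller must redial.
			return false
		default:
			// Block until the state changes or the timeout expires.
			if !conn.WaitForStateChange(ctx, state) {
				return false
			}
		}
	}
}
```

When a check like this fails, the proxy falls through to the discover-then-dial path described above.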
Reproduction steps
Some of this is noted in #9422, but I'll note it here for posterity anyway. (It's an interesting example of doing mildly nontrivial stuff with VTAdmin + the local example. 🤷)
Parameterize the vtctld-up.sh script and VTAdmin's discovery.json file to make it a lil easier to run a second vtctld
Start up a local cluster as usual, which will start up a single vtctld on http://localhost:15999: `./101_initial_cluster.sh`

Start a second vtctld on http://localhost:16999: `VTCTLD_GRPC_PORT=16999 VTCTLD_WEB_PORT=16000 ./scripts/vtctld-up.sh`

Start up VTAdmin. (The usual way is `./scripts/vtadmin-up.sh`, which will also start vtadmin-web.)

At this point, we can double check that VTAdmin can "discover" both vtctlds. (Scare quotes since "discovery", in this case, is simply reading from that discovery.json file.)

Now, since VTAdmin lazy-initializes its vtctld connections, we need to trigger a request that traverses the "discover -> dial -> cache" codepath: `curl "http://localhost:14200/api/schemas"`

Examine VTAdmin's `proxy.go` logs to see which of the two local vtctlds it discovered + dialed; in this case, it's the vtctld on http://localhost:16999.

Now, we are going to kill this vtctld. 😈
For the sake of illustration, let's take a brief diversion into 🐛 bug territory 🐛 and see what happens on the `main` branch after we kill the vtctld (without the WaitForReady fix, but keeping all the logging): the curl command will eventually time out, since the request never completes. `curl "http://localhost:14200/api/schemas"`

And now for the fix! We can observe that VTAdmin is able to detect the `SHUTDOWN` connection and negotiate a new one:

The above logs are especially interesting because they point out a shortcoming of (and a later enhancement for) this change. On VTAdmin's first attempt to discover a vtctld, it rediscovers the one we just killed on http://localhost:16999. This is because we're using static file discovery, and so the vtctld is never removed from that discovery.json file after we `kill` it, so VTAdmin has a 50% chance of rediscovering it again. And, as mentioned below, `Dial` does not retry in this case (although one could imagine it doing so with an exponential backoff or similar), so the request fails:

The rest of the logs result from a subsequent `curl "http://localhost:14200/api/schemas"` which (you guessed it!) redials, rediscovers, and re-establishes a gRPC connection to the remaining, healthy vtctld on http://localhost:15999.

A note on rejected alternatives
Using the connectivity API to introspect our gRPC connections is a little cumbersome and possibly error prone. (I have been known to write bugs and... gestures at next section on leaked connections.)
Ideally, the grpc-go library would handle this for us, and theoretically it can; however, my understanding is that we'd use the Resolver interface as a service discovery integration point and then run something like a lookaside load balancer. This has its advantages (round-robin discovery, being "officially supported")... but it would also be a Whole Thing to rewrite our service discovery layer.
Another approach that was shared with me is initializing healthchecks on the connection you get back from Dial: https://github.com/grpc/grpc/blob/master/doc/health-checking.md. I haven't investigated this one (to be candid, since this branch works) but I'll note it here for posterity!
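For anyone curious about that alternative, here's a hedged sketch of what enabling grpc-go's client-side health checking looks like, based on the linked doc rather than anything in this PR (the address and empty service name are placeholders):

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	_ "google.golang.org/grpc/health" // registers the client-side health checking function
)

func main() {
	// healthCheckConfig tells grpc-go to treat a subchannel as READY only once the
	// server's Health service reports SERVING. The default pick_first balancer
	// skips health checking, hence the round_robin loadBalancingConfig.
	conn, err := grpc.Dial(
		"localhost:15999", // placeholder vtctld address
		grpc.WithInsecure(),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}], "healthCheckConfig": {"serviceName": ""}}`),
	)
	if err != nil {
		log.Fatalf("grpc.Dial failed: %v", err)
	}
	defer conn.Close()
}
```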
A note on leaked connections
This PR updates proxy.go to functionally ignore errors from closing the gRPC connection. There are definitely some... undesirable interactions between this retry logic and gRPC's internal retry logic.
When the gRPC connection is lost, _even if we call `Close`_, gRPC's internal mechanisms will continue to retry for ~4 seconds. During this time, as far as I can tell, the connectivity API will show the connection flapping between `CONNECTING` and `TRANSIENT_FAILURE`.

During this period, any VTAdmin request that traverses the `Dial` codepath will first fail to transition, and then all subsequent calls will fail, since `Dial` (prior to this branch) will return early on that "error closing possibly stale connection" error, until gRPC's internal retry times out:

We can also see evidence of these retries in the logs as soon as the vtctld is killed:
I did a bunch of digging into this a few months ago (which is part of the reason this branch has taken me forever 😭) and realized that we can likely disable gRPC's internal retries with some incantation of `grpc.DialOption`s. I remember from a past conversation with @ajm188 that configuring dial opts was more complicated than I anticipated, and... well, I'd propose addressing that in a separate PR. :')

My understanding of the "worst case" scenario, as noted, is that VTAdmin can possibly leak gRPC connections that are improperly closed. In most cases, I think (hope?) these connections would terminate themselves once their retry timeout is up. There is a chance, though, that the once-dead vtctld comes back while gRPC is internally retrying, even if VTAdmin's proxy has since established a connection to a different vtctld.
FWIW, we've been running this change in our environment for several months without any issues. And this change is an enhancement given that the current behaviour is to simply fail forever until the vtadmin process is restarted. :')
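For reference, the `grpc.DialOption` "incantation" mentioned above would most likely involve tuning the connection backoff. A hedged sketch follows; the durations are made up and none of this is part of this PR:

```go
package main

import (
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/backoff"
)

func main() {
	// WithConnectParams controls how aggressively gRPC retries the underlying
	// connection after it drops, which (as far as I understand) is the mechanism
	// behind the internal reconnect attempts described above.
	conn, err := grpc.Dial(
		"localhost:15999", // placeholder vtctld address
		grpc.WithInsecure(),
		grpc.WithConnectParams(grpc.ConnectParams{
			Backoff: backoff.Config{
				BaseDelay:  100 * time.Millisecond,
				Multiplier: 1.6,
				Jitter:     0.2,
				MaxDelay:   time.Second, // cap reconnect backoff well below the 120s default
			},
			MinConnectTimeout: 2 * time.Second,
		}),
	)
	if err != nil {
		log.Fatalf("grpc.Dial failed: %v", err)
	}
	defer conn.Close()
}
```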
Related Issue(s)
Closes #9422
Checklist
Deployment Notes
This PR introduces `grpc-connectivity-timeout`, a new per-cluster config option that sets the maximum wait time to establish a gRPC connection between VTAdmin and the vtctld it queries in that region. The default value is 2 seconds.