Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add failover test cases #13737

Merged
merged 1 commit into from
Feb 25, 2022
Merged

Add failover test cases #13737

merged 1 commit into from
Feb 25, 2022

Conversation

ahrtr
Copy link
Member

@ahrtr ahrtr commented Feb 24, 2022

Fix 13707

Add failover test to cover the following two scenarios,

  • client failover when the first server listed goes unavailable after the client has established a connection
  • client failover when the first server listed is unavailable when the client is first created

cc @ptabor @serathius @spzala

@codecov-commenter
Copy link

codecov-commenter commented Feb 24, 2022

Codecov Report

Merging #13737 (3fc787b) into main (a7106dc) will decrease coverage by 0.14%.
The diff coverage is n/a.

❗ Current head 3fc787b differs from pull request most recent head 833b0e9. Consider uploading reports for the commit 833b0e9 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #13737      +/-   ##
==========================================
- Coverage   72.78%   72.63%   -0.15%     
==========================================
  Files         467      467              
  Lines       38287    38287              
==========================================
- Hits        27867    27811      -56     
- Misses       8617     8669      +52     
- Partials     1803     1807       +4     
Flag Coverage Δ
all 72.63% <ø> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
server/proxy/grpcproxy/register.go 69.76% <0.00%> (-9.31%) ⬇️
client/v3/namespace/watch.go 87.87% <0.00%> (-6.07%) ⬇️
server/storage/mvcc/watchable_store.go 84.42% <0.00%> (-5.44%) ⬇️
server/etcdserver/api/v3rpc/lease.go 77.21% <0.00%> (-5.07%) ⬇️
raft/rafttest/node.go 95.00% <0.00%> (-5.00%) ⬇️
server/storage/schema/cindex.go 95.34% <0.00%> (-4.66%) ⬇️
client/v3/leasing/cache.go 87.77% <0.00%> (-3.89%) ⬇️
api/etcdserverpb/raft_internal_stringer.go 76.78% <0.00%> (-3.58%) ⬇️
server/etcdserver/api/v3rpc/util.go 70.96% <0.00%> (-3.23%) ⬇️
server/etcdserver/api/v3rpc/member.go 93.54% <0.00%> (-3.23%) ⬇️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7106dc...833b0e9. Read the comment docs.

Copy link
Contributor

@ptabor ptabor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you.

@ahrtr ahrtr force-pushed the add_failover_test2 branch 3 times, most recently from ce9db35 to 833b0e9 Compare February 25, 2022 00:09
@ahrtr
Copy link
Member Author

ahrtr commented Feb 25, 2022

Resolved all the comments. We can ignore the semaphoreci failure, which was caused by the docker hub rate limitation . cc @serathius @ptabor @spzala

@ahrtr ahrtr force-pushed the add_failover_test2 branch from 833b0e9 to 9b6681f Compare February 25, 2022 01:55
@ahrtr
Copy link
Member Author

ahrtr commented Feb 25, 2022

All test are green now after retriggering the pipeline.

@ptabor ptabor merged commit f4f5ae7 into etcd-io:main Feb 25, 2022

// Launch an etcd cluster with 3 members
t.Logf("Launching an etcd cluster with 3 members [%s]", tc.name)
clus := integration2.NewCluster(t, &integration2.ClusterConfig{Size: 3, ClientTLS: &integration2.TestTLSInfo})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this TLS config match what is expected in production w.r.t. validity of serving certificates and distinct hostnames (or is it using localhost / identical loopback IPs for all members)? I have memories of the etcd client not handling TLS validation on failover correctly (e.g. #10911) and wanted to make sure we're actually exercising the scenario

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good point, thanks for pointing out this.

It's using localhost & 127.0.0.1 (see below). The e2e & integration tests are supposed to be automatically executed in pipeline, so all etcd members are always running on the same host?

Probably we need to think about how to run & test etcd in an environment which is close to production? cc @ptabor @serathius @spzala

        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier: 
                F4:B1:45:B9:C4:18:A5:74:3B:5A:C9:21:10:60:4F:6F:5F:EF:12:64
            X509v3 Authority Key Identifier: 
                keyid:29:26:2B:F4:ED:D9:18:F5:60:74:CE:00:7A:C0:E6:FF:1C:C8:BC:A0

            X509v3 Subject Alternative Name: 
                DNS:localhost, IP Address:127.0.0.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Add tests for correct TLS verification against multi-server etcd cluster
4 participants