Merge pull request #10730 from jingyih/learner_part3
*: support raft learner in etcd - part 3
xiang90 authored May 29, 2019
2 parents bdcecd1 + 23511d2 commit 77e1c37
Showing 17 changed files with 804 additions and 137 deletions.
70 changes: 70 additions & 0 deletions Documentation/op-guide/runtime-configuration.md
@@ -123,6 +123,46 @@ The new member will run as a part of the cluster and immediately begin catching

If adding multiple members, the best practice is to configure a single member at a time and verify that it starts correctly before adding more new members. If adding a new member to a 1-node cluster, the cluster cannot make progress before the new member starts, because it needs two members as a majority to agree on the consensus. This behavior happens only between the time `etcdctl member add` informs the cluster about the new member and the time the new member successfully establishes a connection to the existing one.

#### Add a new member as learner

Starting from v3.4, etcd supports adding a new member as a learner, i.e. a non-voting member.
The motivation and design can be found in the [design doc](https://etcd.readthedocs.io/en/latest/server-learner.html).
To make the process of adding a new member safer, and to reduce cluster downtime while the new member catches up,
it is recommended to add the new member to the cluster as a learner until it is in sync.
This can be described as a three-step process (a minimal Go client sketch follows the list):

* Add the new member as a learner via the [gRPC members API][member-api-grpc] or the `etcdctl member add --learner` command.

* Start the new member with the new cluster configuration, including a list of the updated members (existing members + the new member).
This step is exactly the same as before.

* Promote the newly added learner to a voting member via the [gRPC members API][member-api-grpc] or the `etcdctl member promote` command.
The etcd server validates the promote request to ensure operational safety:
the learner can be promoted to a voting member only after its raft log has caught up to the leader's.
If the learner has not caught up, the member promote request will fail
(see the [error cases when promoting a member] section for more details).
In this case, the user should wait and retry later.
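
The same three steps can be driven programmatically through the Go client's members API. Below is a minimal sketch, not part of this change: the endpoint and peer URLs are placeholders, and the retry loop matches the same error keywords that the integration tests in this commit check for.

```go
package main

import (
	"context"
	"log"
	"strings"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	// Connect to any existing member of the cluster (placeholder endpoint).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://10.0.1.10:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Step 1: add the new member as a learner (non-voting).
	addResp, err := cli.MemberAddAsLearner(context.Background(), []string{"http://10.0.1.13:2380"})
	if err != nil {
		log.Fatalf("failed to add learner: %v", err)
	}
	learnerID := addResp.Member.ID

	// Step 2 happens out of band: start the new etcd process with
	// ETCD_INITIAL_CLUSTER_STATE=existing and the updated member list.

	// Step 3: promote the learner once its raft log has caught up with the
	// leader's, retrying while the server reports it is not yet in sync.
	for {
		_, err := cli.MemberPromote(context.Background(), learnerID)
		if err == nil {
			break // the learner is now a voting member
		}
		if !strings.Contains(err.Error(), "learner member which is in sync with leader") {
			log.Fatalf("member promote failed: %v", err)
		}
		time.Sleep(500 * time.Millisecond)
	}
}
```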

In v3.4, the etcd server limits the number of learners a cluster can have to one. The main consideration is to limit the
extra workload on the leader caused by propagating data from the leader to the learner.
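
Because of this limit, it can be useful to check the current learner count before adding one. A minimal sketch (the `countLearners` helper is illustrative, not an etcd API; it assumes the `context` and `go.etcd.io/etcd/clientv3` imports and a connected client):

```go
// countLearners returns the number of learner members currently in the cluster.
func countLearners(cli *clientv3.Client) (int, error) {
	resp, err := cli.MemberList(context.Background())
	if err != nil {
		return 0, err
	}
	learners := 0
	for _, m := range resp.Members {
		if m.IsLearner {
			learners++
		}
	}
	return learners, nil
}
```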

Use `etcdctl member add` with the `--learner` flag to add the new member to the cluster as a learner.

```sh
$ etcdctl member add infra3 --peer-urls=http://10.0.1.13:2380 --learner
Member 9bf1b35fc7761a23 added to cluster a7ef944b95711739

ETCD_NAME="infra3"
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
```

After the new etcd process is started for the newly added learner member, use `etcdctl member promote` to promote the learner to a voting member.
```
$ etcdctl member promote 9bf1b35fc7761a23
Member 9bf1b35fc7761a23 promoted in cluster a7ef944b95711739
```

#### Error cases when adding members

In the following case, a new host is not included in the list of enumerated nodes. If this is a new cluster, the node must be added to the list of initial cluster members.
@@ -153,6 +193,35 @@ etcd: this member has been permanently removed from the cluster. Exiting.
exit 1
```

#### Error cases when adding a learner member

A learner cannot be added to the cluster if the cluster already has one learner (v3.4).
```
$ etcdctl member add infra4 --peer-urls=http://10.0.1.14:2380 --learner
Error: etcdserver: too many learner members in cluster
```

#### Error cases when promoting a learner member

A learner can only be promoted to a voting member if it is in sync with the leader.
```
$ etcdctl member promote 9bf1b35fc7761a23
Error: etcdserver: can only promote a learner member which is in sync with leader
```

Promoting a member that is not a learner will fail.
```
$ etcdctl member promote 9bf1b35fc7761a23
Error: etcdserver: can only promote a learner member
```

Promoting a member that does not exist in the cluster will fail.
```
$ etcdctl member promote 12345abcde
Error: etcdserver: member not found
```

### Strict reconfiguration check mode (`-strict-reconfig-check`)

As described above, the best practice for adding new members is to configure a single member at a time and verify that it starts correctly before adding more new members. This step-by-step approach is very important because if a newly added member is not configured correctly (for example, the peer URLs are incorrect), the cluster can lose quorum. The quorum loss happens because the newly added member is counted in the quorum even if that member is not reachable from the other existing members. Quorum loss might also happen if there is a connectivity issue or there are operational issues.
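
To see why a single misconfigured member can cost quorum, recall that quorum is a majority of all configured members, reachable or not. A quick standalone illustration of the arithmetic (not etcd code):

```go
package main

import "fmt"

// quorum returns the majority size for a cluster of n configured members.
func quorum(n int) int { return n/2 + 1 }

func main() {
	fmt.Println(quorum(3)) // 2: a healthy 3-member cluster tolerates 1 failure
	// Adding a misconfigured, unreachable 4th member raises the quorum to 3,
	// leaving the 3 reachable members with no failure tolerance at all.
	fmt.Println(quorum(4)) // 3
}
```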
@@ -173,3 +242,4 @@ It is enabled by default.
[member migration]: ../v2/admin_guide.md#member-migration
[remove member]: #remove-a-member
[runtime-reconf]: runtime-reconf-design.md
[error cases when promoting a member]: #error-cases-when-promoting-a-learner-member
94 changes: 83 additions & 11 deletions clientv3/integration/cluster_test.go
@@ -19,6 +19,7 @@ import (
"reflect"
"strings"
"testing"
"time"

"go.etcd.io/etcd/integration"
"go.etcd.io/etcd/pkg/testutil"
@@ -214,14 +215,19 @@ func TestMemberAddForLearner(t *testing.T) {
}
}

- func TestMemberPromoteForLearner(t *testing.T) {
- // TODO test not ready learner promotion.
+ func TestMemberPromote(t *testing.T) {
defer testutil.AfterTest(t)

clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 3})
defer clus.Terminate(t)
- // TODO change the random client to client that talk to leader directly.
- capi := clus.RandClient()

+ // a member promote request can be sent to any server in the cluster;
+ // the request will be auto-forwarded to the leader on the server side.
+ // This test explicitly exercises the server-side forwarding by
+ // sending the request to a follower.
+ leaderIdx := clus.WaitLeader(t)
+ followerIdx := (leaderIdx + 1) % 3
+ capi := clus.Client(followerIdx)

urls := []string{"http://127.0.0.1:1234"}
memberAddResp, err := capi.MemberAddAsLearner(context.Background(), urls)
@@ -244,18 +250,84 @@ func TestMemberPromoteForLearner(t *testing.T) {
t.Fatalf("Added 1 learner node to cluster, got %d", numberOfLearners)
}

- memberPromoteResp, err := capi.MemberPromote(context.Background(), learnerID)
- if err != nil {
- t.Fatalf("failed to promote member: %v", err)
+ // the learner is not started yet, so the learner progress check is expected to fail;
+ // as a result, the member promote request will fail.
+ _, err = capi.MemberPromote(context.Background(), learnerID)
+ expectedErrKeywords := "can only promote a learner member which is in sync with leader"
+ if err == nil {
+ t.Fatalf("expecting promote not ready learner to fail, got no error")
}
+ if !strings.Contains(err.Error(), expectedErrKeywords) {
+ t.Fatalf("expecting error to contain %s, got %s", expectedErrKeywords, err.Error())
+ }

// create and launch learner member based on the response of V3 Member Add API.
// (the response has information on peer urls of the existing members in cluster)
learnerMember := clus.MustNewMember(t, memberAddResp)
clus.Members = append(clus.Members, learnerMember)
if err := learnerMember.Launch(); err != nil {
t.Fatal(err)
}

// retry until the promote succeeds or the overall timeout fires
timeout := time.After(5 * time.Second)
for {
select {
case <-time.After(500 * time.Millisecond):
case <-timeout:
// t.Fatalf ends the test goroutine; a bare break here would only exit the select
t.Fatalf("failed all attempts to promote learner member, last error: %v", err)
}

_, err = capi.MemberPromote(context.Background(), learnerID)
// successfully promoted learner
if err == nil {
break
}
// if member promote fails due to learner not ready, retry.
// otherwise fails the test.
if !strings.Contains(err.Error(), expectedErrKeywords) {
t.Fatalf("unexpected error when promoting learner member: %v", err)
}
}
}

- numberOfLearners = 0
- for _, m := range memberPromoteResp.Members {
+ // TestMaxLearnerInCluster verifies that the maximum number of learners allowed in a cluster is 1
+ func TestMaxLearnerInCluster(t *testing.T) {
+ defer testutil.AfterTest(t)
+
+ // 1. start with a cluster with 3 voting members and 0 learner members
+ clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 3})
+ defer clus.Terminate(t)
+
+ // 2. adding a learner member should succeed
+ resp1, err := clus.Client(0).MemberAddAsLearner(context.Background(), []string{"http://127.0.0.1:1234"})
+ if err != nil {
+ t.Fatalf("failed to add learner member %v", err)
+ }
+ numberOfLearners := 0
+ for _, m := range resp1.Members {
if m.IsLearner {
numberOfLearners++
}
}
- if numberOfLearners != 0 {
- t.Errorf("learner promoted, expect 0 learner, got %d", numberOfLearners)
+ if numberOfLearners != 1 {
+ t.Fatalf("Added 1 learner node to cluster, got %d", numberOfLearners)
}

// 3. cluster has 3 voting members and 1 learner; adding another learner should fail
_, err = clus.Client(0).MemberAddAsLearner(context.Background(), []string{"http://127.0.0.1:2345"})
if err == nil {
t.Fatalf("expect member add to fail, got no error")
}
expectedErrKeywords := "too many learner members in cluster"
if !strings.Contains(err.Error(), expectedErrKeywords) {
t.Fatalf("expecting error to contain %s, got %s", expectedErrKeywords, err.Error())
}

// 4. cluster has 3 voting members and 1 learner; adding a voting member should succeed
_, err = clus.Client(0).MemberAdd(context.Background(), []string{"http://127.0.0.1:3456"})
if err != nil {
t.Errorf("failed to add member %v", err)
}
}
51 changes: 48 additions & 3 deletions clientv3/integration/kv_test.go
@@ -1011,9 +1011,8 @@ func TestKVForLearner(t *testing.T) {
}
defer cli.Close()

- // TODO: expose servers's ReadyNotify() in test and use it instead.
- // waiting for learner member to catch up applying the config change entries in raft log.
- time.Sleep(3 * time.Second)
+ // wait until the learner member is ready
+ <-clus.Members[3].ReadyNotify()

tests := []struct {
op clientv3.Op
@@ -1051,3 +1050,49 @@ func TestKVForLearner(t *testing.T) {
}
}
}

// TestBalancerSupportLearner verifies that the balancer's retry and failover mechanism supports a cluster with a learner member
func TestBalancerSupportLearner(t *testing.T) {
defer testutil.AfterTest(t)

clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 3})
defer clus.Terminate(t)

// we have to add and launch the learner member after the initial cluster is created, because
// bootstrapping a cluster with a learner member is not supported.
clus.AddAndLaunchLearnerMember(t)

learners, err := clus.GetLearnerMembers()
if err != nil {
t.Fatalf("failed to get the learner members in cluster: %v", err)
}
if len(learners) != 1 {
t.Fatalf("added 1 learner to cluster, got %d", len(learners))
}

// clus.Members[3] is the newly added learner member, which was appended to clus.Members
learnerEp := clus.Members[3].GRPCAddr()
cfg := clientv3.Config{
Endpoints: []string{learnerEp},
DialTimeout: 5 * time.Second,
DialOptions: []grpc.DialOption{grpc.WithBlock()},
}
cli, err := clientv3.New(cfg)
if err != nil {
t.Fatalf("failed to create clientv3: %v", err)
}
defer cli.Close()

// wait until learner member is ready
<-clus.Members[3].ReadyNotify()

if _, err := cli.Get(context.Background(), "foo"); err == nil {
t.Fatalf("expect Get request to learner to fail, got no error")
}

eps := []string{learnerEp, clus.Members[0].GRPCAddr()}
cli.SetEndpoints(eps...)
if _, err := cli.Get(context.Background(), "foo"); err != nil {
t.Errorf("expect no error (balancer should retry when request to learner fails), got error: %v", err)
}
}
