Use a single set of Progresses for ProgressSet. #108

Hoverbear · 2018-08-25T06:42:16Z

The motivation for this PR may be unclear until you consider future work. But, shortly, this is partially in preparation for Joint Consensus.

This is a step to allowing us to use different configurations (eg voter/learner topology) on top of the same Progress states. This will mean we can have ProgressSet be aware of its peer set changing, and able to have calls like self.prs().voters() or self.prs().learner_nodes() always remain accurate.

Rationale

Currently we store Progress data, which are of a non-trivial size, in two FxHashMaps in ProgressSet. They are holding voters and learners respectively. During add_node(), add_learner(), remove_node(), and promote_learner() we currently check sometimes multiple HashMaps, and at times need to move Progress values between maps. We don't need to do this.

Typical use cases for ProgressSet include iterating over these peers (such as self.prs().voters()), fetching individual peers (such as self.prs().voters().get(id)), checking membership (self.prs().voters().contains_key()), removing nodes (either role), adding voters, adding learners, promoting learners, or removing learners.

Scanning the entire set is done by chaining iterators over the two hashmaps already, so storing them in a single HashMap has no penalty. Returning Iterators where possible instead of realized structures means we can possibly save on allocations (you can do self.prs().voters().any(_) faster now).

Scanning over a specific subset (eg voter, learner) is slightly slower since the iterator returned is internally filtering out non-voter/non-learners. Since most node sets are fairly small, and the check is a simple bool, this performance change is not dramatic. (This can be possibly optimized, but I'm not sure it's worth the complexity).

Fetching individual nodes was optimized since it is always a lookup now, not possibly two lookups (in the case of getting a node without the specifier).

Removing nodes should also be faster (single lookup).

Promoting Learners now is just changing a boolean (but this will likely change later for Joint Consensus).

Future Work

In the next PR my thought is to introduce poll and quorum logic to ProgressSet, instead of being in Raft (Raft can still proxy to them). Why? When we enter a joint consensus (#101 ) state we no longer can just check if # of votes >= quorum. The cluster needs to account for the votes of nodes in two separate configurations. So the check for elections is # of votes >= quorum or (# votes_old >= quorum_old) && (# votes_new >= quorum_new) depending on the state of the system.

I'd like to follow with a PR to have ProgressSet hold a ConfState (a vec of voters, and a
vec of learners in a protobuf), and this FxHashMap<u64, Progress> that ProgressSet already
holds. This will allow us to in the future use an enum inside the
ProgressSet:

enum Configuration {
    Stable(ConfState),
    Joint { old: ConfState, new: ConfState },
}

It may be wise to use a different datatype for the ConfStates than the protobuf though. This needs to be explored. Perhaps a FxHashMap<u64, Weak<Progress>> pointing back into the actual Progress, so we can just follow the pointer.

This will allow us to quickly pass lists of IDs to callers and compose iterators of progress with chains, mapping into the HashMap. (If we do this, the Weak/Rc idea above is more compelling perhaps).

Performance Impact

There were no major performance impacts. A few benchmarks were slightly slower or slightly faster, but no remarkable impacts. Note many tests are a small % slower due to the additional allocation involved since we're now holding a bit of extra data.

   Compiling raft v0.3.1 (file:///home/hoverbear/git/raft-rs)
    Finished release [optimized] target(s) in 7.01s
     Running target/release/deps/benches-b3f4a3ccb54b1af9
Raft::new (0, 0)        time:   [654.68 ns 658.33 ns 663.60 ns]                              
                        change: [+2.4424% +4.1041% +5.5835%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe

Raft::new (3, 1)        time:   [1.0552 us 1.0571 us 1.0592 us]                              
                        change: [+5.3398% +7.4368% +9.4033%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe

Raft::new (5, 2)        time:   [1.2448 us 1.2469 us 1.2492 us]                              
                        change: [+3.0618% +5.4548% +7.7207%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  9 (9.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

Raft::new (7, 3)        time:   [1.5091 us 1.5169 us 1.5285 us]                              
                        change: [+6.9430% +8.0638% +9.3565%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

Raft::campaign (3, 1, CampaignPreElection)                                                                             
                        time:   [1.6698 us 1.6796 us 1.6914 us]
                        change: [+7.1247% +8.9953% +11.082%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

Raft::campaign (3, 1, CampaignElection)                                                                             
                        time:   [1.8421 us 1.8543 us 1.8701 us]
                        change: [-0.7334% +0.9614% +2.9925%] (p = 0.36 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

Raft::campaign (3, 1, CampaignTransfer)                                                                             
                        time:   [1.9347 us 1.9791 us 2.0401 us]
                        change: [+4.6167% +6.8336% +9.7574%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

Raft::campaign (5, 2, CampaignPreElection)                                                                             
                        time:   [2.3698 us 2.3725 us 2.3759 us]
                        change: [-0.5268% +0.8138% +2.1554%] (p = 0.24 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

Raft::campaign (5, 2, CampaignElection)                                                                             
                        time:   [2.5981 us 2.6072 us 2.6228 us]
                        change: [+5.1047% +6.6554% +8.2343%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

Raft::campaign (5, 2, CampaignTransfer)                                                                             
                        time:   [2.7336 us 2.7424 us 2.7546 us]
                        change: [-2.8336% -1.6015% -0.2993%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

Raft::campaign (7, 3, CampaignPreElection)                                                                             
                        time:   [3.0390 us 3.0577 us 3.0808 us]
                        change: [+6.2413% +8.4249% +10.575%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Raft::campaign (7, 3, CampaignElection)                                                                             
                        time:   [3.3627 us 3.3815 us 3.4033 us]
                        change: [+3.5173% +5.6968% +7.8047%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

Raft::campaign (7, 3, CampaignTransfer)                                                                             
                        time:   [3.5182 us 3.5326 us 3.5498 us]
                        change: [+0.1086% +1.5806% +3.0729%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

RawNode::new            time:   [957.90 ns 962.74 ns 968.28 ns]                          
                        change: [+0.4930% +2.1651% +3.6894%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

Progress::default       time:   [2.7608 ns 2.7694 ns 2.7820 ns]                               
                        change: [+3.4533% +4.0793% +4.7229%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

ProgressSet::new (0, 0) time:   [24.331 ns 24.464 ns 24.622 ns]                                    
                        change: [+12.726% +13.755% +14.741%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::new (3, 1) time:   [77.597 ns 77.870 ns 78.208 ns]                                    
                        change: [+4.2794% +5.0575% +5.8202%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

ProgressSet::new (5, 2) time:   [78.236 ns 78.473 ns 78.770 ns]                                    
                        change: [+4.1984% +5.2535% +6.2644%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

ProgressSet::new (7, 3) time:   [78.297 ns 78.467 ns 78.690 ns]                                    
                        change: [+3.2772% +4.1558% +5.3449%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

ProgressSet::insert_voter (0, 0)                                                                            
                        time:   [82.880 ns 83.059 ns 83.289 ns]
                        change: [-2.3601% -1.5005% -0.6649%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

ProgressSet::insert_voter (3, 1)                                                                             
                        time:   [195.68 ns 196.88 ns 198.43 ns]
                        change: [+3.8072% +5.1527% +6.6821%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

ProgressSet::insert_voter (5, 2)                                                                             
                        time:   [238.59 ns 240.82 ns 243.56 ns]
                        change: [+9.6677% +11.744% +13.744%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

ProgressSet::insert_voter (7, 3)                                                                             
                        time:   [250.93 ns 252.21 ns 253.76 ns]
                        change: [+9.5544% +10.656% +11.673%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::insert_learner (0, 0)                                                                             
                        time:   [84.346 ns 84.675 ns 85.032 ns]
                        change: [+4.0566% +5.3939% +6.6914%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

ProgressSet::insert_learner (3, 1)                                                                             
                        time:   [194.62 ns 195.77 ns 197.10 ns]
                        change: [+8.5062% +9.9784% +11.317%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::insert_learner (5, 2)                                                                             
                        time:   [216.15 ns 217.55 ns 219.23 ns]
                        change: [+9.6463% +11.074% +12.443%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

ProgressSet::insert_learner (7, 3)                                                                             
                        time:   [236.09 ns 237.22 ns 238.59 ns]
                        change: [+6.0486% +7.3821% +8.6658%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

ProgressSet::promote (0, 0)                                                                            
                        time:   [30.150 ns 30.321 ns 30.517 ns]
                        change: [-0.8956% +0.9434% +2.8369%] (p = 0.32 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

ProgressSet::promote (3, 1)                                                                             
                        time:   [174.10 ns 174.50 ns 175.02 ns]
                        change: [-1.6114% +0.7134% +2.4685%] (p = 0.57 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

ProgressSet::promote (5, 2)                                                                             
                        time:   [206.81 ns 207.74 ns 208.90 ns]
                        change: [+12.212% +13.224% +14.202%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::promote (7, 3)                                                                             
                        time:   [225.56 ns 227.15 ns 229.03 ns]
                        change: [-0.7211% +1.0632% +2.8948%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

ProgressSet::remove (0, 0)                                                                            
                        time:   [65.105 ns 65.554 ns 66.085 ns]
                        change: [+8.8515% +10.210% +11.544%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::remove (3, 1)                                                                             
                        time:   [210.29 ns 211.60 ns 213.16 ns]
                        change: [+7.3192% +8.9684% +10.540%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::remove (5, 2)                                                                             
                        time:   [312.97 ns 319.75 ns 327.84 ns]
                        change: [+40.410% +43.287% +45.910%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::remove (7, 3)                                                                             
                        time:   [292.14 ns 294.15 ns 296.60 ns]
                        change: [+22.410% +23.951% +25.465%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

ProgressSet::iter (0, 0)                                                                            
                        time:   [32.426 ns 32.682 ns 32.986 ns]
                        change: [+19.914% +21.344% +22.678%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::iter (3, 1)                                                                             
                        time:   [215.07 ns 217.18 ns 219.64 ns]
                        change: [+24.669% +26.020% +27.425%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

ProgressSet::iter (5, 2)                                                                             
                        time:   [232.52 ns 234.99 ns 237.96 ns]
                        change: [+18.472% +20.308% +21.988%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

ProgressSet::iter (7, 3)                                                                             
                        time:   [255.43 ns 262.34 ns 272.19 ns]
                        change: [+14.089% +16.026% +18.440%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

ProgressSet::get (0, 0) time:   [24.563 ns 24.764 ns 25.007 ns]                                    
                        change: [+19.712% +21.106% +22.553%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::get (3, 1) time:   [178.88 ns 180.24 ns 181.89 ns]                                     
                        change: [+17.606% +19.109% +20.784%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

ProgressSet::get (5, 2) time:   [198.47 ns 199.85 ns 201.50 ns]                                     
                        change: [+17.484% +18.857% +20.212%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::get (7, 3) time:   [222.11 ns 223.86 ns 226.02 ns]                                     
                        change: [+8.8628% +10.119% +11.473%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::nodes (0, 0)                                                                            
                        time:   [30.588 ns 30.741 ns 30.935 ns]
                        change: [+12.690% +14.065% +15.416%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::nodes (3, 1)                                                                             
                        time:   [199.91 ns 201.45 ns 203.25 ns]
                        change: [+12.052% +13.444% +14.856%] (p = 0.00 < 0.05)
                        Performance has regressed.

ProgressSet::nodes (5, 2)                                                                             
                        time:   [231.50 ns 233.10 ns 235.04 ns]
                        change: [+9.1243% +10.754% +12.436%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

ProgressSet::nodes (7, 3)                                                                             
                        time:   [250.84 ns 252.33 ns 254.05 ns]
                        change: [+2.3537% +3.7544% +5.0726%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

siddontang · 2018-08-26T03:31:28Z

@Hoverbear

now we have to calculate quorum every time with travelling the whole map instead of using the length function, this may hurt performance.

Do you still remember any other reason to separate them? @BusyJay @hicqu

BusyJay · 2018-08-26T03:53:51Z

I don't have strong opinion on separate them.

Hoverbear · 2018-08-28T06:10:30Z

@siddontang Seems this doesn't slow down Raft's campaign, I baselined this against master:

Raft::campaign (7, 3, pre_election)                                                                             
                        time:   [3.2914 us 3.3070 us 3.3230 us]
                        change: [-2.4331% -1.9552% -1.4261%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Raft::campaign (7, 3, election)                                                                             
                        time:   [3.5760 us 3.5927 us 3.6090 us]
                        change: [-3.7967% -3.2620% -2.7317%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 29 outliers among 100 measurements (29.00%)
  15 (15.00%) low mild
  10 (10.00%) high mild
  4 (4.00%) high severe

Raft::campaign (7, 3, transfer)                                                                             
                        time:   [3.7463 us 3.7631 us 3.7798 us]
                        change: [-3.7255% -3.1380% -2.5888%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

siddontang · 2018-09-02T01:57:33Z

@Hoverbear

I still worry the performance reduction when calculating quorum, maybe it is better to use a var to store the quorum. /cc @BusyJay

Hoverbear · 2018-09-04T17:27:59Z

@siddontang Ok, that needs to be done for joint consensus anyways so I'll add it to this PR. :)

Hoverbear · 2018-09-06T19:48:20Z

@siddontang PTAL, this uses FxHashSet to store the nodes in voters/learners.

Hoverbear · 2018-09-07T22:38:52Z

I introduced another PR, #119 that builds off this since I think it is ready for merge. I can always rebase it. :)

siddontang · 2018-09-11T09:09:14Z

src/raft.rs

@@ -593,13 +584,14 @@ impl<T: Storage> Raft<T> {
    pub fn maybe_commit(&mut self) -> bool {
        let mut mis_arr = [0; 5];
        let mut mis_vec;
-        let mis = if self.prs().voters().len() <= 5 {
-            &mut mis_arr[..self.prs().voters().len()]
+        let voters = self.prs().voters().count();


why not using num_voters here?

😆 It's a good catch.

src/progress.rs

siddontang

Rest LGTM

BusyJay

Why not just use two ProgressSet?

BusyJay · 2018-09-11T13:44:30Z

src/progress.rs

+            ),
+            configuration: Configuration {
+                voters: HashSet::with_capacity_and_hasher(
+                    voters + learners,


Why sum them up?

Oh, it's a good point!

Hoverbear · 2018-09-11T15:05:28Z

This is so we can move around Progress values between map during promotion. It's simpler (and more efficient) to have a single map and keep membership in a set.

Additionally, this means that during a Joint Consensus state we can avoid having to sync changes and manage multiple Progress values for one node. (Since Joint Consensus would need two ProgressSet, instead of just having two membership pairs).

As you can see in #119 the next PR removes is_learner to prevent that error class, ensuring we rely on the membership here.

I do not want to create a mess, or create ways for us to make mistakes. :)

overvenus

lgtm

BusyJay · 2018-09-26T04:28:31Z

src/progress.rs

-        }
-        if self.learners.contains_key(&id) {
+    pub fn insert_voter(&mut self, id: u64, mut pr: Progress) -> Result<(), Error> {
+        if self.learner_ids().contains(&id) {


In general, the progress will not exist at all. Hence, only one query is necessary even the flag is removed.

BusyJay · 2018-09-26T04:36:14Z

src/progress.rs

+                .union(&self.configuration.learners)
+                .all(|v| self.progress.contains_key(v))
+        );
+        assert_eq!(


I'm afraid the assert is not enough. What if there are 1, 2, 3 in voters and 3, 4, 5 in learners, 1, 2, 3, 4, 5, 6 in progress?

You're right. I'll make it check both ways.

BusyJay · 2018-09-26T04:36:58Z

src/raft.rs

@@ -244,7 +236,7 @@ impl<T: Storage> Raft<T> {
            raft_log,
            max_inflight: c.max_inflight_msgs,
            max_msg_size: c.max_size_per_msg,
-            prs: Some(ProgressSet::new(peers.len(), learners.len())),
+            prs: Some(ProgressSet::new()),


Why not use with_capacity?

BusyJay · 2018-09-26T04:37:33Z

src/raft.rs

@@ -268,14 +260,13 @@ impl<T: Storage> Raft<T> {
            tag: c.tag.to_owned(),
        };
        for p in peers {
-            let pr = new_progress(1, r.max_inflight);
+            let pr = Progress::new(r.raft_log.last_index(), r.max_inflight, false);


Why change the index?

Oh you're right. I'm sorry. 😞

BusyJay · 2018-09-26T04:39:42Z

src/raft.rs

@@ -819,7 +810,7 @@ impl<T: Storage> Raft<T> {
        // Only send vote request to voters.
        let prs = self.take_prs();
        prs.voters()
-            .keys()
+            .map(|(k, _)| k)


Seems it's exact what voter_ids does.

src/raft.rs

Hoverbear · 2018-09-26T16:27:32Z

Thanks for the really great and detailed reviews @BusyJay , @overvenus . I've resolved your comments, can you take another look?

BusyJay · 2018-09-27T03:04:02Z

src/progress.rs

+            // Determine the correct error to return.
+            if self.learner_ids().contains(&id) {
+                return Err(Error::Exists(id, "learners"));
+            } else if self.voter_ids().contains(&id) {


The check seems redundant.

I do not think inserting a voter who is already a voter should be silently ignored. It is a clear sign that the user thinks the system is in a different state than it is. Performance wise this is completely non-consequential.

What I mean if it exists in progress and not in learner_ids, then it must be in voter_ids.

I suppose. I didn't really think hard about it since we cannot do the progress check in the final code anyways, so this is just interm code until #101 . I will remove it.

src/progress.rs

BusyJay · 2018-09-27T03:10:03Z

src/raft.rs

-        if (!self.prs().voters().is_empty() || !self.prs().learners().is_empty())
-            && !self.is_learner
-        {
+        if (self.prs().iter().len() != 0) && !self.is_learner {


() is redundant.

BusyJay

Seems we need more assert in tests to make sure the refactor is correct.

BusyJay · 2018-09-27T09:10:57Z

src/raft.rs

+        if let Some(progress) = self.prs().get(id) {
+            // If progress.is_learner == learner, then it's already inserted as what it should be, return early to avoid error.
+            if progress.is_learner == learner {
+                info!("Ignoring redundant insert.");


tag is missing. And better includes the target id.

BusyJay · 2018-09-27T09:11:24Z

src/raft.rs

-                self.is_learner = false;
+            // If progress.is_learner == false, and learner == true, then it's a demotion, return early to avoid an error.
+            if !progress.is_learner && learner {
+                info!("Ignoring voter demotion.");


BusyJay · 2018-09-27T09:14:10Z

src/raft.rs

+            "Adding node (learner: {}) with ID {} to peers.",
+            learner, id
+        );
+        let progress = Progress::new(self.raft_log.last_index(), self.max_inflight, learner);


Seems more readable to me to move this to L1867. Otherwise it may be confusing as L1855 overlaps the variable name in its scope. And I think it should be last_index + 1.

BusyJay · 2018-09-27T09:18:15Z

src/raft.rs

        }
+        self.is_learner = learner;


Why? I think it's only necessary to be updated when id == self.id.

BusyJay · 2018-09-27T09:33:57Z