Skip to content
This repository has been archived by the owner on Nov 12, 2024. It is now read-only.

LogMatch invariant violated #58

Open
colin-scott opened this issue Aug 6, 2015 · 2 comments
Open

LogMatch invariant violated #58

colin-scott opened this issue Aug 6, 2015 · 2 comments
Milestone

Comments

@colin-scott
Copy link
Contributor

Hello,

Through fuzz testing I found a test case that causes Raft's "Log Matching" invariant to be violated. "Log Matching" is defined as:

"if two logs contain an entry with the same index and term, then the logs are identical in all entries up through the given index"

One issue seems to be the following: Upon transitioning from Candidate to Leader, the Candidate sends an ElectedAsLeader message to itself. It is possible that the newly elected Leader will receive ClientMessages before the ElectedAsLeader message is delivered, e.g. if it had the ClientMessages in its mailbox before the transition occurred. Upon receiving the ElectedAsLeader message, the Leader then reinitializes its nextIndex and matchIndex fields (overwriting what they were previously after applying the client commands). It then sends AppendEntries messages that do not include the 1st client command.

The fix should be simple: initialize the nextIndex and matchIndex fields before transitioning to Leader (in ElectionMeta.forLeader()), rather than after ElectedAsLeader is delivered.

In more detail:

  • Suppose that before becoming Leader, the Candidate has two ClientMessages in its mailbox.
  • Candidate becomes elected, sends ElectedAsLeader message to itself.
  • Upon receiving the first ClientMessage, it adds the command to its log and sends out AppendEntries messages containing that command to all peers. Those AppendEntries messages are delayed
  • Upon receiving the second ClientMessage, the leader adds the command to its log, and sends out new AppendEntries messages containing both commands to all peers. These messages are also delayed.
  • The leader now receives the ElectedAsLeader message, and sets nextIndex = LogIndexMap.initialize(members, replicatedLog.lastIndex). The value of replicatedLog.lastIndex is currently 1, not 0.
  • The leader now sends out AppendEntries messages with fromIndex=nextIndex.valueFor(member) == 1, i.e. the AppendEntries messages include the 2nd but not the 1st client command. prevLogIndex for this entry is 1 - 1 == 0.
  • Upon receiving the AppendEntries containing the second command, followers pass the consistency check [since prevLogIndex for the entry is 0] and add the 2nd command to their log.
  • Now we have inconsistent logs: the leader's log contains (1st command, 2nd command) whereas the followers logs contain (2nd command)

I believe that #37 describes another issue here: prevLogIndex should be -1 for an empty log, not 0 [or, alternatively, the log should be 1-indexed instead of 0-indexed]. If followers understood that prevLogIndex should be 0 for a log containing one entry, they would have failed the consistency check for the AppendEntries command, and properly rejected it.

Here are the reproducing steps for this execution:

$ git clone -b raft-58 git@github.com:NetSys/sts2-applications.git
$ git clone git@github.com:NetSys/sts2-experiments.git experiments
$ sbt assembly
$ java -d64 -Xmx15g -cp target/scala-2.11/randomSearch-assembly-0.1.jar Main 2>&1 | tee console.out

Here are my annotated notes on each step of the execution.

For what it's worth, this snapshot of the akka-raft code differs slightly from the master branch. In particular, it includes (hacky) fixes for all of the known leader election bugs, and a few printlns within raft.

@colin-scott
Copy link
Contributor Author

After thinking about this a bit more:

Delayed nextIndex initialization is not the main issue here. That causes a separate issue (IndexOutOfBounds exception at the leader because its nextIndex gets into a non-sensical state).

The main issue here is that the initial value of prevLogIndex is ambiguous: followers can't distinguish between an AppendEntries for an empty log (prevLogIndex == 0), an AppendEntries for the leader's 1st command (prevLogIndex == 0), and an AppendEntries for the leader's 2nd command (prevLogIndex == 1 - 1 == 0). The last two cases need to be distinguishable. Otherwise followers won't be able to reject inconsistent logs.

@colin-scott
Copy link
Contributor Author

For what it's worth, I have a (non-pull-request-worthy) fix for this issue here:

NetSys/demi-applications@1840499
NetSys/demi-applications@705f8de
NetSys/demi-applications@6f15bcf
NetSys/demi-applications@e5eee3d
NetSys/demi-applications@e6774cd
NetSys/demi-applications@f2c801f

[initialization: https://github.com/NetSys/sts2-applications/commit/dfc36176e0f5e0d70dd3f8278e55245d807d184c]

Tested by fuzz testing -- I verified that under non-failing scenarios, this behaves correctly at each step of the execution when replicating two client commands.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants