You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Nov 12, 2024. It is now read-only.
Through fuzz testing I found a test case that causes Raft's "Log Matching" invariant to be violated. "Log Matching" is defined as:
"if two logs contain an entry with the same index and term, then the logs are identical in all entries up through the given index"
One issue seems to be the following: Upon transitioning from Candidate to Leader, the Candidate sends an ElectedAsLeader message to itself. It is possible that the newly elected Leader will receive ClientMessages before the ElectedAsLeader message is delivered, e.g. if it had the ClientMessages in its mailbox before the transition occurred. Upon receiving the ElectedAsLeader message, the Leader then reinitializes its nextIndex and matchIndex fields (overwriting what they were previously after applying the client commands). It then sends AppendEntries messages that do not include the 1st client command.
The fix should be simple: initialize the nextIndex and matchIndex fields before transitioning to Leader (in ElectionMeta.forLeader()), rather than after ElectedAsLeader is delivered.
In more detail:
Suppose that before becoming Leader, the Candidate has two ClientMessages in its mailbox.
Candidate becomes elected, sends ElectedAsLeader message to itself.
Upon receiving the first ClientMessage, it adds the command to its log and sends out AppendEntries messages containing that command to all peers. Those AppendEntries messages are delayed
Upon receiving the second ClientMessage, the leader adds the command to its log, and sends out new AppendEntries messages containing both commands to all peers. These messages are also delayed.
The leader now receives the ElectedAsLeader message, and sets nextIndex = LogIndexMap.initialize(members, replicatedLog.lastIndex). The value of replicatedLog.lastIndex is currently 1, not 0.
The leader now sends out AppendEntries messages with fromIndex=nextIndex.valueFor(member) == 1, i.e. the AppendEntries messages include the 2nd but not the 1st client command. prevLogIndex for this entry is 1 - 1 == 0.
Upon receiving the AppendEntries containing the second command, followers pass the consistency check [since prevLogIndex for the entry is 0] and add the 2nd command to their log.
Now we have inconsistent logs: the leader's log contains (1st command, 2nd command) whereas the followers logs contain (2nd command)
I believe that #37 describes another issue here: prevLogIndex should be -1 for an empty log, not 0 [or, alternatively, the log should be 1-indexed instead of 0-indexed]. If followers understood that prevLogIndex should be 0 for a log containing one entry, they would have failed the consistency check for the AppendEntries command, and properly rejected it.
Here are the reproducing steps for this execution:
Here are my annotated notes on each step of the execution.
For what it's worth, this snapshot of the akka-raft code differs slightly from the master branch. In particular, it includes (hacky) fixes for all of the known leader election bugs, and a few printlns within raft.
The text was updated successfully, but these errors were encountered:
Delayed nextIndex initialization is not the main issue here. That causes a separate issue (IndexOutOfBounds exception at the leader because its nextIndex gets into a non-sensical state).
The main issue here is that the initial value of prevLogIndex is ambiguous: followers can't distinguish between an AppendEntries for an empty log (prevLogIndex == 0), an AppendEntries for the leader's 1st command (prevLogIndex == 0), and an AppendEntries for the leader's 2nd command (prevLogIndex == 1 - 1 == 0). The last two cases need to be distinguishable. Otherwise followers won't be able to reject inconsistent logs.
Tested by fuzz testing -- I verified that under non-failing scenarios, this behaves correctly at each step of the execution when replicating two client commands.
Hello,
Through fuzz testing I found a test case that causes Raft's "Log Matching" invariant to be violated. "Log Matching" is defined as:
One issue seems to be the following: Upon transitioning from Candidate to Leader, the Candidate sends an
ElectedAsLeader
message to itself. It is possible that the newly elected Leader will receiveClientMessage
s before theElectedAsLeader
message is delivered, e.g. if it had the ClientMessages in its mailbox before the transition occurred. Upon receiving theElectedAsLeader
message, the Leader then reinitializes itsnextIndex
andmatchIndex
fields (overwriting what they were previously after applying the client commands). It then sendsAppendEntries
messages that do not include the 1st client command.The fix should be simple: initialize the
nextIndex
andmatchIndex
fields before transitioning to Leader (inElectionMeta.forLeader()
), rather than after ElectedAsLeader is delivered.In more detail:
nextIndex = LogIndexMap.initialize(members, replicatedLog.lastIndex)
. The value ofreplicatedLog.lastIndex
is currently 1, not 0.fromIndex=nextIndex.valueFor(member) == 1
, i.e. the AppendEntries messages include the 2nd but not the 1st client command. prevLogIndex for this entry is1 - 1 == 0
.I believe that #37 describes another issue here: prevLogIndex should be -1 for an empty log, not 0 [or, alternatively, the log should be 1-indexed instead of 0-indexed]. If followers understood that prevLogIndex should be 0 for a log containing one entry, they would have failed the consistency check for the AppendEntries command, and properly rejected it.
Here are the reproducing steps for this execution:
Here are my annotated notes on each step of the execution.
For what it's worth, this snapshot of the akka-raft code differs slightly from the master branch. In particular, it includes (hacky) fixes for all of the known leader election bugs, and a few printlns within raft.
The text was updated successfully, but these errors were encountered: