[Fix] Coupling block sync to DAG state #3386
base: staging
Ok(is_recently_committed) => {
    if !is_recently_committed {
        bail!(
            "Sync - Failed to advance blocks - leader certificate with author {leader_author} from round {leader_round} was not recently committed.",
        );
    }
    debug!(
        "Sync - Leader certificate with author {leader_author} from round {leader_round} was recently committed.",
    );
Maybe a nit, but the boolean can be moved into the match arms: the compiler rejects the match if a branch is missing, and it reads better IMO.
e.g.:
Ok(true) => {
debug!("...");
}
Ok(false) => {
bail!("...");
}
// rest of cases
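A self-contained sketch of the suggestion above, assuming the check yields a `Result<bool, String>`. The function name `check_leader_commit` and the `String` error type are hypothetical stand-ins (the PR itself uses `bail!` and `debug!`); only the shape of the match is the point.

```rust
// Hypothetical stand-in for the sync check; names and error type are illustrative only.
fn check_leader_commit(
    result: Result<bool, String>,
    leader_author: &str,
    leader_round: u64,
) -> Result<String, String> {
    match result {
        // Recently committed: report success (the PR logs this with `debug!`).
        Ok(true) => Ok(format!(
            "Sync - Leader certificate with author {leader_author} from round {leader_round} was recently committed."
        )),
        // Not recently committed: fail (the PR uses `bail!` here).
        Ok(false) => Err(format!(
            "Sync - Failed to advance blocks - leader certificate with author {leader_author} from round {leader_round} was not recently committed."
        )),
        // The check itself failed, for example because a channel was closed.
        Err(e) => Err(format!("Sync - Failed to check the leader certificate: {e}")),
    }
}
```

Because the arms cover `Ok(true)`, `Ok(false)`, and `Err(_)`, deleting any one of them makes the match non-exhaustive and the code stops compiling.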
self.spawn(async move {
    while let Some(((round, certificate_id), callback)) = rx_is_recently_committed.recv().await {
        // Check if the certificate was recently committed.
        let is_committed = self_.dag.read().is_recently_committed(round, certificate_id);
Seems like a lot of hassle just to perform this one check (in its own task with a dedicated channel), especially with both the `BFT` and `Dag` being clonable 🤔... it does make it async, but do we expect the `recent_committed_ids` collection to grow large enough to make it blocking?
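For comparison, a minimal sketch of the direct alternative this comment hints at: querying the DAG through a cloned handle instead of a dedicated task and channel. Everything here is an assumption for illustration: the `Dag` struct, the `u64` round and certificate ID types, and the layout of `recent_committed_ids` are simplified stand-ins, and `std::sync::RwLock` replaces whatever lock the real code uses (the snippet above calls `.read()` without handling a result, so the actual lock type may differ).

```rust
use std::collections::{BTreeMap, HashSet};
use std::sync::{Arc, RwLock};

// Hypothetical, simplified DAG: recently committed certificate IDs grouped by round.
#[derive(Default)]
struct Dag {
    recent_committed_ids: BTreeMap<u64, HashSet<u64>>,
}

impl Dag {
    // Record a certificate as committed in the given round.
    fn commit(&mut self, round: u64, certificate_id: u64) {
        self.recent_committed_ids.entry(round).or_default().insert(certificate_id);
    }

    // One ordered-map lookup followed by one hash-set lookup.
    fn is_recently_committed(&self, round: u64, certificate_id: u64) -> bool {
        self.recent_committed_ids
            .get(&round)
            .map_or(false, |ids| ids.contains(&certificate_id))
    }
}

// Direct query through a shared handle; no dedicated task or channel involved.
fn is_recently_committed(dag: &Arc<RwLock<Dag>>, round: u64, certificate_id: u64) -> bool {
    dag.read()
        .expect("DAG lock poisoned")
        .is_recently_committed(round, certificate_id)
}
```

In this sketch the read lock is held only for the two lookups, so the cost depends on lock contention rather than on the size of the collection. Whether that holds for the real `Dag` is exactly the question raised above.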
Can this PR be changed to draft state? No need to get it into the next release.
Motivation
This PR focuses on coupling block sync to DAG state replication. When a node is syncing via block responses, it will sync its storage and DAG with the certificates contained in the block and attempt to update its ledger. Previously, there were scenarios where a node would commit certificates in its DAG without advancing blocks. Instead, the committal of leader certificates and advancement of blocks during sync should be tightly coupled.
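The coupling can be pictured with a small sketch: the sync side asks the BFT side to commit the leader certificate and advances the block only once the commit is confirmed. This is an illustration under stated assumptions, not the PR's implementation: `CommitRequest`, `spawn_bft`, and `advance_block` are hypothetical names, leaders are reduced to `(round, certificate_id)` pairs, a thread with `std::sync::mpsc` channels stands in for the async tasks, and the even-round rule is a toy placeholder for the real commit logic.

```rust
use std::collections::HashSet;
use std::sync::mpsc;
use std::thread;

// Hypothetical request: the leader's (round, certificate ID) plus a reply channel.
type CommitRequest = ((u64, u64), mpsc::Sender<bool>);

// Stand-in for the BFT side: commits each requested leader and reports the outcome.
fn spawn_bft(rx: mpsc::Receiver<CommitRequest>) -> thread::JoinHandle<HashSet<(u64, u64)>> {
    thread::spawn(move || {
        let mut committed = HashSet::new();
        for (leader, callback) in rx {
            // Toy rule (assumption): only leaders from even rounds can be committed.
            let ok = leader.0 % 2 == 0;
            if ok {
                committed.insert(leader);
            }
            // Ignore send errors: the requester may already have gone away.
            let _ = callback.send(ok);
        }
        committed
    })
}

// Stand-in for the sync side: the height moves only after the commit is confirmed.
fn advance_block(
    tx: &mpsc::Sender<CommitRequest>,
    height: &mut u32,
    leader: (u64, u64),
) -> Result<u32, String> {
    let (callback_tx, callback_rx) = mpsc::channel();
    tx.send((leader, callback_tx)).map_err(|e| e.to_string())?;
    match callback_rx.recv() {
        Ok(true) => {
            *height += 1;
            Ok(*height)
        }
        Ok(false) => Err(format!("leader from round {} was not committed", leader.0)),
        Err(e) => Err(e.to_string()),
    }
}
```

In this sketch a leader can never be committed without a block advance being attempted in the same step, and a failed commit leaves the height untouched.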
To achieve this, a syncing node should forgo committing leader certificates on a rolling basis inside `update_dag`. Instead, the leader certificate should be committed just before the ledger is ready to advance to the next block during sync. To facilitate this, we use a sender channel that communicates the leader certificate to be committed from the Sync module to the BFT.
A previous version of this PR can be found in #3268.
The differences are that the previous PR does not prevent syncing nodes from committing leader certificates within `update_dag`, and that it adds the certificates inside block responses to the DAG all at once when the availability threshold is met. In contrast, this PR keeps the original method of updating the DAG as soon as the certificates in the block response are processed within `sync_storage_with_block`, but only commits the leader certificate when the ledger is ready to advance to the next block. Updating the DAG as soon as the certificates are processed is necessary to ensure there is no discrepancy between the storage and the DAG state beyond the latest committed block.
To summarize, this PR makes the following changes:
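- Syncing nodes no longer commit leader certificates on a rolling basis inside `update_dag`.
- The leader certificate is instead committed in `sync_storage_with_block` just before the ledger advances blocks.
Test Plan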
Relevant BFT test cases include the following:
2. BFT-Rebonding
12. Sync-Invalid-Peers-Attack
13. Sync-Far-Behind
Related PRs
#3268