Cleanup RFCs on branch generation and leader rotation #1967
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Branch Generation | ||
|
||
The goal of this RFC is to define how Solana generates branches. | ||
|
||
|
||
## Basic Design Idea | ||
|
||
Nodes take turns being leader and generating the PoH that encodes state changes. The network can tolerate loss of connection to any leader by synthesizing what the leader ***would have generated*** had it been connected but not ingesting any state changes. The complexity of forks is thereby limited to a "there/not-there" skip list of branches that may arise on leader rotation slot boundaries. A leader can only transmit during its predefined PoH slot. | ||
|
||
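As a rough illustration of the "there/not-there" idea, the sketch below models each leader slot as either carrying the leader's entries or being filled with synthesized ticks. `Slot`, `synthesize_missing`, and the value `TICKS_PER_SLOT = 4` are hypothetical names and values chosen for illustration; they are not taken from the RFC or the Solana codebase.

```python
from dataclasses import dataclass, field
from typing import Dict, List

TICKS_PER_SLOT = 4  # hypothetical value for `T`


@dataclass
class Slot:
    leader: str
    entries: List[str] = field(default_factory=list)  # empty means ticks only
    ticks: int = TICKS_PER_SLOT

    @property
    def is_virtual(self) -> bool:
        # A slot with no entries is what the network synthesizes on its own.
        return not self.entries


def synthesize_missing(schedule: List[str], observed: Dict[int, List[str]]) -> List[Slot]:
    """Build one Slot per scheduled leader; unobserved slots become ticks-only."""
    return [
        Slot(leader=leader, entries=observed.get(index, []))
        for index, leader in enumerate(schedule)
    ]


# L2 was unreachable, so its slot is filled in with ticks only.
slots = synthesize_missing(["L1", "L2", "L3"], {0: ["tx-a"], 2: ["tx-b"]})
presence = [not slot.is_virtual for slot in slots]
```

Here `presence` is the "there/not-there" list: one boolean per slot, with no other fork shapes to track.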
## Message Flow | ||
|
||
1. Transactions are ingested at the current leader. | ||
Review comment: at -> by |
||
2. Leader filters for valid transactions. | ||
Review comment: no need for "for" here |
||
3. Leader executes valid transactions on its state. | ||
Review comment: "on its state" -> ", updating its state." |
||
4. Leader packages transactions into entries based off its current PoH slot. | ||
Review comment: More precisely, it uses the PoH slot to enter the leader role. |
||
5. Leader transmits the entries to validator nodes (in signed blobs) | ||
a. The PoH stream includes ticks: empty entries that indicate liveness of the leader and the passage of time on the network. | ||
b. A leader's stream begins with the tick entries necessary to complete the PoH back to the leader's most recently observed prior leader slot. | ||
6. Validators retransmit entries to peers in their set and to further downstream nodes. | ||
Review comment: "in their set" should either be removed or should be followed up with a description of what defines that set. |
||
7. Validators validate the transactions and execute them on their state. | ||
8. Validators compute the hash of the state. | ||
9. At specific times, i.e. specific PoH tick counts, validators transmit votes to the leader. | ||
a. Votes are signatures of the hash of the computed state at that PoH tick count | ||
b. Votes are also propagated via gossip | ||
10. Leader executes the votes as any other transaction and broadcasts them to the network | ||
11. Validators observe their votes and all the votes from the network. | ||
|
||
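Steps 7 through 9 of the message flow can be sketched roughly as follows. This is a minimal illustration, not the actual vote format: `VOTE_INTERVAL`, `state_hash`, and `maybe_vote` are hypothetical names, HMAC-SHA256 is only a stand-in for a real signature scheme, and binding the tick count into the signed message is an assumption.

```python
import hashlib
import hmac
from typing import Dict, Optional

VOTE_INTERVAL = 8  # hypothetical: validators vote every 8 ticks


def state_hash(state: Dict[str, int]) -> bytes:
    """Hash account balances in sorted order so every validator agrees."""
    digest = hashlib.sha256()
    for account in sorted(state):
        digest.update(f"{account}:{state[account]};".encode())
    return digest.digest()


def maybe_vote(secret: bytes, state: Dict[str, int], tick_count: int) -> Optional[bytes]:
    """Return a vote at voting tick counts, otherwise None.

    HMAC stands in for a signature here; a real vote would be signed
    with the validator's private key.
    """
    if tick_count % VOTE_INTERVAL != 0:
        return None
    # Assumption: the tick count is signed together with the state hash.
    message = tick_count.to_bytes(8, "little") + state_hash(state)
    return hmac.new(secret, message, hashlib.sha256).digest()


state = {"alice": 5, "bob": 7}
vote = maybe_vote(b"validator-key", state, 16)      # 16 is a voting tick count
no_vote = maybe_vote(b"validator-key", state, 17)   # 17 is not
```

Two validators that executed the same transactions produce the same state hash, so their votes attest to the same state at the same tick count.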
## Partitions, Forks | ||
|
||
Forks can arise at PoH tick counts that correspond to a vote. The next leader may not have observed the last vote slot and may start their slot with generated virtual PoH entries. These empty ticks are generated by all nodes in the network at a network-specified rate of `Z` hashes per tick. | ||
|
||
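Because virtual ticks carry no transactions, any node can derive them from the last tick it observed. The sketch below shows the idea under stated assumptions: SHA-256 as the hash function, `HASHES_PER_TICK = 3` as a toy value for `Z`, and hypothetical helper names `next_tick` and `virtual_ticks`.

```python
import hashlib
from typing import List

HASHES_PER_TICK = 3  # hypothetical toy value for `Z`


def next_tick(prev: bytes, hashes_per_tick: int = HASHES_PER_TICK) -> bytes:
    """Advance the hash chain by one tick, i.e. `Z` sequential hashes."""
    current = prev
    for _ in range(hashes_per_tick):
        current = hashlib.sha256(current).digest()
    return current


def virtual_ticks(last_tick: bytes, count: int) -> List[bytes]:
    """Derive `count` empty ticks starting from the last observed tick."""
    ticks = []
    current = last_tick
    for _ in range(count):
        current = next_tick(current)
        ticks.append(current)
    return ticks


last_tick = hashlib.sha256(b"last tick of previous slot").digest()
node_a = virtual_ticks(last_tick, 4)
node_b = virtual_ticks(last_tick, 4)
```

`node_a` and `node_b` are identical, which is what lets every node agree on the "just ticks" version of a slot without hearing from the leader.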
There are only two possible versions of the PoH during a voting slot: PoH with `T` ticks and entries generated by the current leader, or PoH with just ticks. The "just ticks" version of the PoH can be thought of as a virtual ledger, one that all nodes in the network can derive from the last tick in the previous slot. | ||
|
||
Validators can ignore forks at other points (e.g. from the wrong leader), or slash the leader responsible for the fork. | ||
|
||
Validators vote based on a greedy choice to maximize their reward, as described in [branch selection](rfcs/0008-branch_selection.md). | ||
|
||
### Validator's View | ||
|
||
#### Time Progression | ||
The diagram below represents a validator's view of the PoH stream with possible forks over time. L1, L2, etc. are leader slots, and `E`s represent entries from that leader during that leader's slot. The 'x's represent ticks only, and time flows downwards in the diagram. | ||
|
||
|
||
``` | ||
time +----+ validator action | ||
Review comment: Can you copy in |
||
| | L1 | E(L1) | ||
| |----| / \ vote(E(L2)) | ||
| | L2 | E(L2) x | ||
| |----| / \ / \ vote(E(L2)) | ||
| | L3 | E(L3) x E(L3)' x | ||
| |----| / \ / \ / \ / \ slash(L3) | ||
| | L4 | x x E(L4) x x x x x | ||
V |----| | | | | | | | | vote(E(L4)) | ||
V | L5 | xx xx xx E(L5) xx xx xx xx | ||
V +----+ hang on to E(L4) and E(L5) for more... | ||
|
||
``` | ||
|
||
Note that an `E` appearing on 2 branches at the same slot is a slashable condition, so a validator observing `E(L3)` and `E(L3)'` can slash L3 and safely choose `x` for that slot. Once a validator commits to a branch, other branches can be discarded below that tick count. For any slot, validators need only consider a single "has entries" chain or a "ticks only" chain to be proposed by a leader. But multiple virtual entries may overlap as they link back to a previous slot. | ||
|
||
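The slashing rule in the note above can be sketched as a small check over what a validator has observed per slot. The `Observation` tuple layout and the `resolve_slots` helper are hypothetical, and entries are represented by plain strings rather than real entry hashes.

```python
from typing import Dict, List, Optional, Set, Tuple

# (slot, leader, entry_id) as seen by one validator; a hypothetical layout.
Observation = Tuple[int, str, str]


def resolve_slots(
    observations: List[Observation],
) -> Tuple[Dict[int, Optional[str]], Set[str]]:
    """Pick the entries to keep per slot and collect leaders to slash.

    A slot maps to None when the validator falls back to ticks only.
    """
    seen: Dict[int, Set[str]] = {}
    leaders: Dict[int, str] = {}
    for slot, leader, entry_id in observations:
        seen.setdefault(slot, set()).add(entry_id)
        leaders[slot] = leader

    chosen: Dict[int, Optional[str]] = {}
    slashed: Set[str] = set()
    for slot, entry_ids in seen.items():
        if len(entry_ids) > 1:
            # Two different `E`s for one slot: slash and choose `x`.
            slashed.add(leaders[slot])
            chosen[slot] = None
        else:
            chosen[slot] = next(iter(entry_ids))
    return chosen, slashed


chosen, slashed = resolve_slots([
    (2, "L2", "E(L2)"),
    (3, "L3", "E(L3)"),
    (3, "L3", "E(L3)'"),
])
```

With these observations, slot 2 keeps `E(L2)`, slot 3 falls back to ticks only, and L3 is marked for slashing.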
#### Time Division | ||
|
||
It's useful to consider leader rotation over PoH tick count as time division of the job of encoding state for the network. The following table presents the above tree of forks as a time-divided ledger. | ||
|
||
leader slot | L1 | L2 | L3 | L4 | L5 | ||
-------|----|----|----|----|---- | ||
data | E(L1)| E(L2) | E(L3) | E(L4) | E(L5) | ||
ticks to prev | | | | x | xx | ||
|
||
Note that only data from leader L3 will be accepted during leader slot L3. Data from L3 may include "catchup" ticks back to a slot other than L2 if L3 did not observe L2's data. L4 and L5's transmissions include the "ticks to prev" PoH entries. | ||
|
||
This arrangement of the network data streams permits nodes to save exactly this to the ledger for replay, restart, and checkpoints. | ||
|
||
### Leader's View | ||
|
||
When a new leader begins a slot, it must first transmit any PoH (ticks) required to link the new slot with the most recently observed and voted slot. The branch the leader proposes would link the current slot to a previous slot that the leader has voted on with virtual ticks. |
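One way to picture the leader's catch-up work is as a tick count computed from the slot it last voted on. This sketch assumes every skipped slot is filled with exactly `T` virtual ticks; `catchup_ticks` and `TICKS_PER_SLOT = 4` are hypothetical.

```python
TICKS_PER_SLOT = 4  # hypothetical value for `T`


def catchup_ticks(
    current_slot: int,
    last_voted_slot: int,
    ticks_per_slot: int = TICKS_PER_SLOT,
) -> int:
    """Number of virtual ticks needed to link the new slot to the voted slot."""
    if last_voted_slot >= current_slot:
        raise ValueError("last voted slot must precede the current slot")
    skipped_slots = current_slot - last_voted_slot - 1
    return skipped_slots * ticks_per_slot


adjacent = catchup_ticks(current_slot=3, last_voted_slot=2)     # nothing skipped
one_skipped = catchup_ticks(current_slot=4, last_voted_slot=2)  # slot 3 skipped
```

A leader whose last voted slot is the immediately preceding one transmits no catch-up ticks; each skipped slot adds one slot's worth of ticks before its own entries.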
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,104 +1,29 @@ | ||
# Leader Rotation | ||
|
||
The goal of this RFC is to define how leader nodes are rotated in Solana, how rotation may cause forks to arise, and how the network converges | ||
in response. | ||
The goal of this RFC is to define how leader nodes are rotated in Solana. | ||
Review comment: More precisely, it's to define how fullnodes rotate in taking the leader role. Review comment: @garious why not leaders? a leader doesn't need to be a full node per se. |
||
|
||
## Leader Seed Generation | ||
## Leader Schedule Generation | ||
|
||
Leader selection is decided via a random seed. The process is as follows: | ||
The leader schedule is decided via a predefined seed. The process is as follows: | ||
|
||
1. Periodically at a specific `PoH tick count` select the first vote signatures that create a supermajority from the previous voting round. | ||
2. Append them together. | ||
3. Hash the string for `N` counts via a similar process as PoH itself. | ||
4. The resulting hash is the random seed for `M` counts, `M` leader periods, where M > N | ||
1. Periodically at a specific `PoH tick count` use the tick count (simple monotonically increasing counter) as a seed to a stable pseudo-random algorithm. | ||
Review comment: are we concerned that the leader schedule seeds are all pre-compute-able? Review comment: nevermind, answered below... Review comment: pseudo |
||
2. At that height, compute all the currently staked accounts and their assigned leader identities and weights. | ||
Review comment: This sentence doesn't make sense to me. |
||
3. Sort them by stake weight. | ||
4. Using the random seed, select nodes weighted by stake to create a stake-weighted ordering. | ||
5. This ordering becomes valid in `N` `PoH tick counts`. | ||
Review comment: That doesn't seem right. Seems like it ought to be something like N counts after the start of the slot. Review comment: the leader schedule needs to be available for a good while before it comes into effect to reduce forking that might arise because schedules themselves have forked Review comment: ideally, I'd want to see leader schedules available no later than 32 slots before they come into effect... |
||
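The schedule generation steps above might look roughly like the following. This is a sketch under assumptions: Python's `random.Random` stands in for the "stable pseudo-random algorithm" (a real network would need one specified bit-for-bit), stakes are a plain mapping from a hypothetical leader identity to a positive weight, and selection is done without replacement so each identity appears once.

```python
import random
from typing import Dict, List


def leader_schedule(tick_count: int, stakes: Dict[str, int]) -> List[str]:
    """Return a stake-weighted ordering of leader identities."""
    # Step 1: the tick count itself is the seed.
    rng = random.Random(tick_count)
    # Steps 2-3: take staked accounts and sort by stake weight, breaking
    # ties by identity so every node starts from the same order.
    remaining = sorted(
        ((identity, stake) for identity, stake in stakes.items() if stake > 0),
        key=lambda item: (-item[1], item[0]),
    )
    # Step 4: repeatedly draw one node, weighted by stake, without replacement.
    ordering: List[str] = []
    while remaining:
        weights = [stake for _, stake in remaining]
        index = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        ordering.append(remaining.pop(index)[0])
    return ordering


stakes = {"node-a": 50, "node-b": 30, "node-c": 20, "node-d": 0}
schedule = leader_schedule(tick_count=1_000, stakes=stakes)
```

Because the only inputs are the tick count and the stake table, any node with the same view of the stakes derives the same `schedule`. That is also why the seed is predictable: anyone can compute it ahead of time.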
|
||
## Leader Rotation | ||
|
||
1. The leader is chosen via a random seed generated from stake weights and votes (the leader schedule) | ||
2. The leader is rotated every `T` PoH ticks (leader period), according to the leader schedule | ||
3. The schedule is applicable for `M` voting rounds | ||
|
||
Leaders transmit for a count of `T` PoH ticks. When `T` is reached all the validators should switch to the next scheduled leader. To schedule leaders, the supermajority + `M` nodes are shuffled using the above calculated random seed. | ||
|
||
All `T` ticks must be observed from the current leader for that part of PoH to be accepted by the network. If `T` ticks (and any intervening transactions) are not observed, the network optimistically fills in the `T` ticks, and continues with PoH from the next leader. | ||
|
||
## Partitions, Forks | ||
|
||
Forks can arise at PoH tick counts that correspond to leader rotations, because leader nodes may or may not have observed the previous leader's data. These empty ticks are generated by all nodes in the network at a network-specified rate of `Z` hashes per tick. | ||
|
||
There are only two possible versions of the PoH during a voting period: PoH with `T` ticks and entries generated by the current leader, or PoH with just ticks. The "just ticks" version of the PoH can be thought of as a virtual ledger, one that all nodes in the network can derive from the last tick in the previous period. | ||
|
||
Validators can ignore forks at other points (e.g. from the wrong leader), or slash the leader responsible for the fork. | ||
|
||
Validators vote on the longest chain that contains their previous vote, or a longer chain if the lockout on their previous vote has expired. | ||
|
||
|
||
#### Validator's View | ||
|
||
##### Time Progression | ||
The diagram below represents a validator's view of the PoH stream with possible forks over time. L1, L2, etc. are leader periods, and `E`s represent entries from that leader during that leader's period. The 'x's represent ticks only, and time flows downwards in the diagram. | ||
|
||
|
||
``` | ||
time +----+ validator action | ||
| | L1 | E(L1) | ||
| |----| / \ vote(E(L2)) | ||
| | L2 | E(L2) x | ||
| |----| / \ / \ vote(E(L2)) | ||
| | L3 | E(L3) x E(L3)' x | ||
| |----| / \ / \ / \ / \ slash(L3) | ||
| | L4 | x x E(L4) x x x x x | ||
V |----| | | | | | | | | vote(E(L4)) | ||
V | L5 | xx xx xx E(L5) xx xx xx xx | ||
V +----+ hang on to E(L4) and E(L5) for more... | ||
The seed that is selected is predictable but unbiasable. There is no grinding attack to influence its outcome. The set of **staked accounts** and their leader identities is computed over a large period, which is our approach to censorship resistance of the staking set. If at least 1 leader in the schedule is not censoring staking transactions then over a long period of time that leader can ensure that the set of active nodes is not censored. | ||
Review comment: "our approach to censorship resistance" -> "is used to resist censorship". Also, how? Alternatively say that's out of the scope of this document. Also, typos in the last sentence. Two missing commas and "is" should be "are". That last sentence should either be dropped (out of scope) or expanded on a bunch. |
||
|
||
``` | ||
|
||
Note that an `E` appearing on 2 branches at the same period is a slashable condition, so a validator observing `E(L3)` and `E(L3)'` can slash L3 and safely choose `x` for that period. Once a validator observes a supermajority vote on any branch, other branches can be discarded below that tick count. For any period, validators need only consider a single "has entries" chain or a "ticks only" chain. | ||
|
||
##### Time Division | ||
|
||
It's useful to consider leader rotation over PoH tick count as time division of the job of encoding state for the network. The following table presents the above tree of forks as a time-divided ledger. | ||
|
||
leader period | L1 | L2 | L3 | L4 | L5 | ||
-------|----|----|----|----|---- | ||
data | E(L1)| E(L2) | E(L3) | E(L4) | E(L5) | ||
ticks to prev | | | | x | xx | ||
|
||
Note that only data from leader L3 will be accepted during leader period L3. Data from L3 may include "catchup" ticks back to a period other than L2 if L3 did not observe L2's data. L4 and L5's transmissions include the "ticks to prev" PoH entries. | ||
|
||
This arrangement of the network data streams permits nodes to save exactly this to the ledger for replay, restart, and checkpoints. | ||
|
||
#### Leader's View | ||
|
||
When a new leader begins a period, it must first transmit any PoH (ticks) required to link the new period with the most recently observed and voted period. | ||
|
||
|
||
## Examples | ||
## Leader Rotation | ||
|
||
### Small Partition | ||
1. Network partition M occurs for 10% of the nodes | ||
2. The larger partition K, with 90% of the stake weight continues to operate as normal | ||
3. M cycles through the ranks until one of them is leader, generating ticks for periods where the leader is in K. | ||
4. M validators observe 10% of the vote pool, finality is not reached. | ||
5. M and K re-connect. | ||
6. M validators cancel their votes on M, which has not reached finality, and re-cast on K (after their vote lockout on M). | ||
* The leader is rotated every `T` PoH ticks (leader period), according to the leader schedule. This amount of time as represented by the PoH ticks is called a slot. | ||
|
||
### Leader Timeout | ||
1. Next rank leader node B observes a timeout from current leader A, fills in A's period with virtual ticks and starts sending out entries. | ||
2. Nodes observing both streams keep track of the forks, waiting for: | ||
a. their vote on leader A to expire in order to be able to vote on B | ||
b. a supermajority on A's period | ||
3. If a occurs, leader B's period is filled with ticks; if b occurs, A's period is filled with ticks | ||
4. Partition is resolved just like in the [Small Partition](#small-partition) | ||
Leaders transmit for a count of `T` PoH ticks. When `T` is reached all the validators should switch to the next scheduled leader. Leaders that transmit out of order can be ignored. | ||
|
||
All `T` ticks must be observed from the current leader for that part of PoH to be accepted by the network. If `T` ticks (and any intervening transactions) are not observed, the network optimistically fills in the `T` ticks, and continues with PoH from the next leader. See [branch generation](rfcs/0002-branch_generation.md). | ||
|
||
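Rotation by tick count reduces to integer division once a schedule exists. The sketch below is illustrative only: `TICKS_PER_SLOT = 4` is a toy value for `T`, the helper names are hypothetical, and wrapping around to the start of the schedule is an assumption (the RFC replaces the schedule instead of cycling it forever).

```python
from typing import List

TICKS_PER_SLOT = 4  # hypothetical toy value for `T`


def slot_for_tick(tick_count: int, ticks_per_slot: int = TICKS_PER_SLOT) -> int:
    """Map a PoH tick count to its leader slot index."""
    return tick_count // ticks_per_slot


def leader_for_tick(
    schedule: List[str],
    tick_count: int,
    ticks_per_slot: int = TICKS_PER_SLOT,
) -> str:
    """Look up the scheduled leader for a tick count (wrap-around assumed)."""
    slot = slot_for_tick(tick_count, ticks_per_slot)
    return schedule[slot % len(schedule)]


def accept_transmission(schedule: List[str], sender: str, tick_count: int) -> bool:
    """Ignore leaders that transmit outside their own slot."""
    return sender == leader_for_tick(schedule, tick_count)


schedule = ["A", "B", "C"]
in_turn = accept_transmission(schedule, "B", tick_count=4)      # B's slot starts at tick 4
out_of_turn = accept_transmission(schedule, "C", tick_count=4)  # C transmits too early
```

Validators applying this check switch leaders at the same tick count without exchanging any messages about the handoff.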
## Network Variables | ||
|
||
`M` - number of nodes outside the supermajority to whom leaders broadcast their PoH for validation | ||
|
||
`N` - number of voting rounds for which a leader schedule is considered before a new leader schedule is used | ||
|
||
`T` - number of PoH ticks per leader period (also voting period) | ||
`N` - Number of voting rounds for which a leader schedule is considered before a new leader schedule is used. This number should be large and potentially cover 2 weeks. | ||
|
||
`Z` - number of hashes per PoH tick | ||
`T` - Number of PoH ticks per leader slot. |
Review comment: Solana doesn't generate branches. And if "branches" are different than "forks", you'll want to explain that here.