Cleanup RFCs on branch generation and leader rotation #1967

Merged
merged 2 commits into from
Nov 30, 2018
76 changes: 76 additions & 0 deletions rfcs/0002-branch-generation.md
@@ -0,0 +1,76 @@
# Branch Generation

The goal of this RFC is to define how Solana generates branches.
Contributor

Solana doesn't generate branches. And if "branches" are different than "forks", you'll want to explain that here.



## Basic Design Idea

Nodes take turns being leader and generating the PoH that encodes state changes. The network can tolerate loss of connection to any leader by synthesizing what the leader ***would have generated*** had it been connected but not ingesting any state changes. The complexity of forks is thereby limited to a "there/not-there" skip list of branches that may arise on leader rotation period boundaries. A leader can only transmit during their predefined PoH slot.
Contributor

"complexity of forks .. list of branches" huh?

on a boundary or after?

A leader can transmit at any time, but its transactions will only be accepted if they fall within the leader's slot.


## Message Flow

1. Transactions are ingested at the current leader.
Contributor

at -> by

2. Leader filters for valid transactions.
Contributor

no need for "for" here

3. Leader executes valid transactions on its state.
Contributor

"on its state" -> ", updating its state."

4. Leader packages transactions into entries based off its current PoH slot.
Contributor

More precisely, it uses the PoH slot to enter the leader role.

5. Leader transmits the entries to validator nodes (in signed blobs)
a. The PoH stream includes ticks; empty entries that indicate liveness of the leader and the passage of time on the network.
b. A leader's stream begins with the tick entries necessary to complete the PoH back to the leaders most recently observed prior leader period.
Contributor

period => slot everywhere?

Contributor

leader's

6. Validators retransmit entries to peers in their set and to further downstream nodes.
Contributor

"in their set" should either be removed or should be followed up with a description of what defines that set.

7. Validators validate the transactions and execute them on their state.
8. Validators compute the hash of the state.
9. At specific times, i.e. specific PoH tick counts, validators transmit votes to the leader.
a. Votes are signatures of the hash of the computed state at that PoH tick count
b. Votes are also propagated via gossip
10. Leader executes the votes as any other transaction and broadcasts them to the network
11. Validators observe their votes and all the votes from the network.

## Partitions, Forks

Forks can arise at PoH tick counts that correspond to a vote. The next leader may not have observed the last vote period and may start their slot with generated virtual PoH entries. These empty ticks are generated by all nodes in the network at a network-specified rate of hashes per tick, `Z`.

There are only two possible versions of the PoH during a voting period: PoH with `T` ticks and entries generated by the current leader, or PoH with just ticks. The "just ticks" version of the PoH can be thought of as a virtual ledger, one that all nodes in the network can derive from the last tick in the previous period.
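
A minimal sketch of how any node could derive the "just ticks" version of a period on its own. It assumes PoH is a plain SHA-256 hash chain (this hunk does not name the hash function), and `next_tick` and `virtual_period` are hypothetical helper names. `Z` and `T` are the network variables defined in the leader rotation RFC.

```python
import hashlib


def next_tick(prev_hash: bytes, hashes_per_tick: int) -> bytes:
    """Advance the hash chain by Z hashes and return the resulting tick hash."""
    h = prev_hash
    for _ in range(hashes_per_tick):
        h = hashlib.sha256(h).digest()
    return h


def virtual_period(last_tick: bytes, ticks_per_period: int, hashes_per_tick: int) -> list[bytes]:
    """Derive the T empty ticks of a period from the previous period's last tick.

    No entries are mixed in, so every node that starts from the same
    last tick computes the same sequence.
    """
    ticks = []
    h = last_tick
    for _ in range(ticks_per_period):
        h = next_tick(h, hashes_per_tick)
        ticks.append(h)
    return ticks
```

Because the output depends only on the previous tick, `T`, and `Z`, two nodes that never exchanged the period's data still agree on its virtual ledger.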

Validators can ignore forks at other points (e.g. from the wrong leader), or slash the leader responsible for the fork.

Validators vote based on a greedy choise to maximze their reward described in [branch selection](rfcs/0008-branch_selection.md).
Contributor

choice

Contributor

maximize


### Validator's View

#### Time Progression
The diagram below represents a validator's view of the PoH stream with possible forks over time. L1, L2, etc. are leader periods, and `E`s represent entries from that leader during that leader's period. The 'x's represent ticks only, and time flows downwards in the diagram.


```
time  +----+                                          validator action
|     | L1 |                 E(L1)
|     |----|                 /  \                     vote(E(L2))
|     | L2 |            E(L2)    x
|     |----|            /  \    /  \                  vote(E(L2))
|     | L3 |       E(L3)    x  E(L3)'  x
|     |----|       / \     / \    / \    / \          slash(L3)
|     | L4 |      x   x E(L4) x  x   x  x   x
V     |----|      |   |   |   |  |   |  |   |         vote(E(L4))
V     | L5 |      xx  xx  xx E(L5) xx  xx xx  xx
V     +----+                                          hang on to E(L4) and E(L5) for more...
```

Contributor

Can you copy in book/art/leader-scheduler.bob so that it's easy for me to see any changes you make to it? Those parentheses don't render well, so I change E(L1) to E1. Seems to be just as meaningful and renders nicely by svgbob.

Note that an `E` appearing on 2 branches at the same period is a slashable condition, so a validator observing `E(L3)` and `E(L3)'` can slash L3 and safely choose `x` for that period. Once a validator commits to a branch, other branches can be discarded below that tick count. For any period, validators need only consider a single "has entries" chain or a "ticks only" chain to be proposed by a leader. But multiple virtual entries may overlap as they link back to a previous period.
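
The slashable condition described above can be sketched as a small bookkeeping structure. This is an illustration only: `ForkObserver` is a hypothetical name, and identifying an entry stream by a single hash per period is an assumption, not something the RFC specifies.

```python
from collections import defaultdict


class ForkObserver:
    """Tracks the distinct entry hashes seen for each leader period."""

    def __init__(self) -> None:
        # period number -> set of distinct entry hashes observed for it
        self._seen = defaultdict(set)

    def observe(self, period: int, entry_hash: bytes) -> bool:
        """Record an entry and return True if the period's leader is now slashable.

        A leader is slashable once two different "has entries" versions,
        such as E(L3) and E(L3)', have been observed for the same period.
        """
        self._seen[period].add(entry_hash)
        return len(self._seen[period]) > 1
```

Seeing the same entry twice is harmless; only a second, different entry for the same period trips the check, after which the validator can fall back to the ticks-only chain for that period.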

#### Time Division

It's useful to consider leader rotation over PoH tick count as time division of the job of encoding state for the network. The following table presents the above tree of forks as a time-divided ledger.

leader period | L1 | L2 | L3 | L4 | L5
-------|----|----|----|----|----
data | E(L1)| E(L2) | E(L3) | E(L4) | E(L5)
ticks to prev | | | | x | xx

Contributor

Same here, copy the table from book/src/leader-rotation.md. I'll watch for changes until the implementation is done and then delete the RFC, so there's no longer the duplication.

Note that only data from leader L3 will be accepted during leader period L3. Data from L3 may include "catchup" ticks back to a period other than L2 if L3 did not observe L2's data. L4 and L5's transmissions include the "ticks to prev" PoH entries.

This arrangement of the network data streams permits nodes to save exactly this to the ledger for replay, restart, and checkpoints.

### Leader's View

When a new leader begins a period, it must first transmit any PoH (ticks) required to link the new period with the most recently observed and voted period. The branch the leader proposes would link the current period to a previous period that the leader has voted on with virtual ticks.
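
As a rough illustration of the linking step, the number of virtual ticks a new leader must transmit first can be computed from the period numbers. The function name `catchup_ticks` is hypothetical, and the sketch assumes every skipped period is filled with exactly `T` ticks.

```python
def catchup_ticks(last_voted_period: int, new_period: int, ticks_per_period: int) -> int:
    """Ticks needed to link new_period back to the last voted period.

    Each period strictly between the two is filled with T virtual ticks.
    """
    if new_period <= last_voted_period:
        raise ValueError("new period must come after the last voted period")
    skipped_periods = new_period - last_voted_period - 1
    return skipped_periods * ticks_per_period
```

A leader that voted on the immediately preceding period transmits no catch-up ticks; one that missed a single period transmits `T` of them before its own entries.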
94 changes: 0 additions & 94 deletions rfcs/0002-consensus.md

This file was deleted.

105 changes: 15 additions & 90 deletions rfcs/0004-leader-rotation.md
@@ -1,104 +1,29 @@
# Leader Rotation

The goal of this RFC is to define how leader nodes are rotated in Solana, how rotation may cause forks to arise, and how the converges
in response.
The goal of this RFC is to define how leader nodes are rotated in Solana.
Contributor

More precisely, it's to define how fullnodes rotate in taking the leader role.

Member Author

@garious why not leaders? a leader doesn't need to be a full node per se.


## Leader Seed Generation
## Leader Schedule Generation

Leader selection is decided via a random seed. The process is as follows:
Leader schedule is decided via a predefined seed. The process is as follows:

1. Periodically at a specific `PoH tick count` select the first vote signatures that create a supermajority from the previous voting round.
2. Append them together.
3. Hash the string for `N` counts via a similar process as PoH itself.
4. The resulting hash is the random seed for `M` counts, `M` leader periods, where M > N
1. Periodically at a specific `PoH tick count` use the tick count (simple monotonically increasing counter) as a seed to a stable psudo-random algorithm.
Contributor

are we concerned that the leader schedule seeds are all pre-compute-able?

Contributor

nevermind, answered below...

Contributor

pseudo

2. At that height, compute all the currently staked accounts and their assigned leader identities and weights.
Contributor

This sentence doesn't make sense to me.

3. Sort them by stake weight.
4. Using the random seed select nodes weighted by stake to create a stake weighted ordering.
5. This ordering becomes valid in `N` `PoH tick counts`.
Contributor

That doesn't seem right. Seems like it ought to be something like N counts after the start of the slot.

Contributor

the leader schedule needs to be available for a good while before it comes into effect to reduce forking that might arise because schedules themselves have forked

Contributor

ideally, I'd want to see leader schedules available no later than 32 slots before they come into effect...


## Leader Rotation

1. The leader is chosen via a random seed generated from stake weights and votes (the leader schedule)
2. The leader is rotated every `T` PoH ticks (leader period), according to the leader schedule
3. The schedule is applicable for `M` voting rounds

Leaders transmit for a count of `T` PoH ticks. When `T` is reached all the validators should switch to the next scheduled leader. To schedule leaders, the supermajority + `M` nodes are shuffled using the above calculated random seed.

All `T` ticks must be observed from the current leader for that part of PoH to be accepted by the network. If `T` ticks (and any intervening transactions) are not observed, the network optimistically fills in the `T` ticks, and continues with PoH from the next leader.

## Partitions, Forks

Forks can arise at PoH tick counts that correspond to leader rotations, because leader nodes may or may not have observed the previous leader's data. These empty ticks are generated by all nodes in the network at a network-specified rate for hashes/per/tick `Z`.

There are only two possible versions of the PoH during a voting period: PoH with `T` ticks and entries generated by the current leader, or PoH with just ticks. The "just ticks" version of the PoH can be thought of as a virtual ledger, one that all nodes in the network can derive from the last tick in the previous period.

Validators can ignore forks at other points (e.g. from the wrong leader), or slash the leader responsible for the fork.

Validators vote on the longest chain that contains their previous vote, or a longer chain if the lockout on their previous vote has expired.


#### Validator's View

##### Time Progression
The diagram below represents a validator's view of the PoH stream with possible forks over time. L1, L2, etc. are leader periods, and `E`s represent entries from that leader during that leader's period. The 'x's represent ticks only, and time flows downwards in the diagram.


```
time  +----+                                          validator action
|     | L1 |                 E(L1)
|     |----|                 /  \                     vote(E(L2))
|     | L2 |            E(L2)    x
|     |----|            /  \    /  \                  vote(E(L2))
|     | L3 |       E(L3)    x  E(L3)'  x
|     |----|       / \     / \    / \    / \          slash(L3)
|     | L4 |      x   x E(L4) x  x   x  x   x
V     |----|      |   |   |   |  |   |  |   |         vote(E(L4))
V     | L5 |      xx  xx  xx E(L5) xx  xx xx  xx
V     +----+                                          hang on to E(L4) and E(L5) for more...
```

The seed that is selected is predictable but unbiasable. There is no grinding attack to influence its outcome. The set of **staked accounts** and their leader identities is computed over a large period, which is our approach to censorship resistance of the staking set. If at least 1 leader in the schedule is not censoring staking transactions then over a long period of time that leader can ensure that the set of active nodes is not censored.

Contributor

"our approach to censorship resistance" -> "is used to resist censorship". Also, how? Alternatively say that's out of the scope of this document.

Also, typos in the last sentence. Two missing commas and "is" should be "are".

That last sentence should either be dropped (out of scope) or expanded on a bunch.

Note that an `E` appearing on 2 branches at the same period is a slashable condition, so a validator observing `E(L3)` and `E(L3)'` can slash L3 and safely choose `x` for that period. Once a validator observes a supermajority vote on any branch, other branches can be discarded below that tick count. For any period, validators need only consider a single "has entries" chain or a "ticks only" chain.

##### Time Division

It's useful to consider leader rotation over PoH tick count as time division of the job of encoding state for the network. The following table presents the above tree of forks as a time-divided ledger.

leader period | L1 | L2 | L3 | L4 | L5
-------|----|----|----|----|----
data | E(L1)| E(L2) | E(L3) | E(L4) | E(L5)
ticks to prev | | | | x | xx

Note that only data from leader L3 will be accepted during leader period L3. Data from L3 may include "catchup" ticks back to a period other than L2 if L3 did not observe L2's data. L4 and L5's transmissions include the "ticks to prev" PoH entries.

This arrangement of the network data streams permits nodes to save exactly this to the ledger for replay, restart, and checkpoints.

#### Leader's View

When a new leader begins a period, it must first transmit any PoH (ticks) required to link the new period with the most recently observed and voted period.


## Examples
## Leader Rotation

### Small Partition
1. Network partition M occurs for 10% of the nodes
2. The larger partition K, with 90% of the stake weight continues to operate as normal
3. M cycles through the ranks until one of them is leader, generating ticks for periods where the leader is in K.
4. M validators observe 10% of the vote pool, finality is not reached.
5. M and K re-connect.
6. M validators cancel their votes on M, which has not reached finality, and re-cast on K (after their vote lockout on M).
* The leader is rotated every `T` PoH ticks (leader period), according to the leader schedule

### Leader Timeout
1. Next rank leader node V observes a timeout from current leader A, fills in A's period with virtual ticks and starts sending out entries.
2. Nodes observing both streams keep track of the forks, waiting for:
a. their vote on leader A to expire in order to be able to vote on B
b. a supermajority on A's period
3. If a occurs, leader B's period is filled with ticks, if b occurs, A's period is filled with ticks
4. Partition is resolved just like in the [Small Partition](#small-parition)
Leaders transmit for a count of `T` PoH ticks. When `T` is reached all the validators should switch to the next scheduled leader. Leaders that transmit out of order can be ignored or slashed.

All `T` ticks must be observed from the current leader for that part of PoH to be accepted by the network. If `T` ticks (and any intervening transactions) are not observed, the network optimistically fills in the `T` ticks, and continues with PoH from the next leader. See [branch generation](rfcs/0002-branch_generation.md).
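
A minimal sketch of the rotation rule, mapping a PoH tick count to the scheduled leader and rejecting out-of-order transmissions. The names `scheduled_leader` and `accept_from` are hypothetical, and wrapping around the schedule with a modulo is an assumption, since the RFC does not say what happens when the schedule runs out.

```python
def scheduled_leader(schedule: list[str], tick_count: int, ticks_per_period: int) -> str:
    """Return the leader scheduled for the period containing tick_count."""
    period = tick_count // ticks_per_period
    # Assumption: the schedule repeats once every entry has had a turn.
    return schedule[period % len(schedule)]


def accept_from(schedule: list[str], sender: str, tick_count: int, ticks_per_period: int) -> bool:
    """Accept entries only from the leader scheduled for this tick count."""
    return sender == scheduled_leader(schedule, tick_count, ticks_per_period)
```

With `T = 4` and a three-leader schedule, ticks 0 to 3 belong to the first leader, ticks 4 to 7 to the second, and so on; a validator would ignore (or report for slashing) entries for which `accept_from` returns `False`.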

## Network Variables

`M` - number of nodes outside the supermajority to whom leaders broadcast their PoH for validation

`N` - number of voting rounds for which a leader schedule is considered before a new leader schedule is used

`T` - number of PoH ticks per leader period (also voting period)
`N` - Number of voting rounds for which a leader schedule is considered before a new leader schedule is used. This number should be large and potentially cover 2 weeks.

`Z` - number of hashes per PoH tick
`T` - number of PoH ticks per leader period.
File renamed without changes.