Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chain State Design #682

Closed
1 task
teor2345 opened this issue Jul 18, 2020 · 6 comments
Closed
1 task

Chain State Design #682

teor2345 opened this issue Jul 18, 2020 · 6 comments
Assignees
Labels
A-consensus Area: Consensus rule updates C-design Category: Software design work S-needs-design Status: Needs a design decision

Comments

@teor2345
Copy link
Contributor

teor2345 commented Jul 18, 2020

Update: See #763 for the latest design.

This description is outdated:

Our chain state data structures need to support verification, and chain reorganisation.

Here's one possible design, which uses immutable, (partially) shared data structures:

Zebra holds the chain states for the last ~100 blocks in memory. (The zcashd reorg limit is 100 blocks behind the current best tip.) That way, chain reorgs are quick and reliable. Take the chain state of the fork root block, update the chain state using the alternate child block, and change the best tip.

We delay writing the chain state to disk until we know it's permanent. That way, reorgs won't touch the disk. (And we never write orphaned or redundant chain states to disk.)

Here's a summary of the overall design:

  • Memory contains a tree of recent verified blocks, and each block's state.
  • Disk has a linear chain of older blocks by height, and the state for the latest linear block.
  • Disk also has a pool of recent blocks, indexed by hash (not height, because recent blocks can have duplicate heights)

Relaunches should be quick, because we only need to replay the last 100 blocks on top of the persistent state. Plus whatever new blocks we get from the network.

There are a few different kinds of disk updates:

  • add a block to the set of blocks on disk (by hash)
  • append a block hash to the chain on disk (by height)
  • add or remove a hash from the set of recent blocks (add by hash, remove by height)
  • update the current chain state atomically
    • most state changes will modify or delete entries, but some of the state is append-only

Open Questions:

  • which module is responsible for the chain state?
    • the state data structures and basic operations can go in zebra-state
    • zebra-consensus makes the decision about the best tip, and when to write state to disk
@teor2345 teor2345 added Poll::Ready C-design Category: Software design work A-consensus Area: Consensus rule updates S-needs-design Status: Needs a design decision labels Jul 18, 2020
@teor2345 teor2345 self-assigned this Jul 18, 2020
@yaahc
Copy link
Contributor

yaahc commented Jul 18, 2020

Take the chain state of the fork root block, update the chain state using the alternate child block, and change the best tip.

Memory contains a tree of recent verified blocks, and each block's state.

What do we mean by "state" in these lines?

@yaahc
Copy link
Contributor

yaahc commented Jul 18, 2020

I'm not sure I'm understanding the design for the in memory chain state so let me try describing my understanding of how I figured it would work and we can figure out how close it is to your design.

Assuming the state service is storing the last 100 heights worth of blocks in memory and not commiting them to the disk here's how I'd expect it to be represented and modified.

  • state service stores a vec of "chains"
  • each chain is an im ordered map where the key = the height and the value is the block (or maybe the blocks hash)
  • each block add will find the chain where the parent block lives, how that search is done doesn't matter because the number and length of the chains is likely small
    • if the ancestor is the tip of its chain the block is just appended to that chain
    • if the ancestor is not the tip of its chain then the chain up to and including that ancestor is cloned and inserted to the vec of chains as a new chain and the new block is appended to that new chain.
  • after each update we sort the vec of chains by height
  • after the sort we take the first block of the longest chain, if it is now over 100 blocks long, and commit that to the state layer
  • we then pop that block from the head of all other chains and discard any chains that don't start with that block.

@teor2345
Copy link
Contributor Author

What do we mean by "state" in these lines?

The info we need to verify the next block and all its transactions.

And any info we need to do any other tasks, such as wallet support. (In later releases.)

@yaahc
Copy link
Contributor

yaahc commented Jul 19, 2020

okay but you say the "the chain state for each block" , so does that mean just "the entire chain that leads up to a given block"

@teor2345
Copy link
Contributor Author

teor2345 commented Jul 19, 2020

You described the kind of design I was thinking of. But there are a few tweaks that make the algorithm and data structures a bit simpler.

  • each chain is an im ordered map where the key = the height and the value is the block (or maybe the blocks hash)

I don't think we need to explicitly track heights or lists of blocks.

We can key blocks by hash, because the parent hash field points to a unique block. This map is what I called the "block pool".

(If we subtract one from the height to find the parent height, there can be multiple blocks at that height. But we just want the unique parent block. And we don't need to check the child block's height at this stage, because it has already been checked as part of verification.)

The map's state value contains whatever information we need to verify a child block. For example, the unspent transaction outputs and note commitment tree.

We might also keep a copy of the data we use to calculate the time and difficulty medians, but we could also iterate back through the parent blocks for that info. (Unlike the transaction state, the depth of the medians is limited.)

  • each block add will find the chain where the parent block lives, how that search is done doesn't matter because the number and length of the chains is likely small
    • if the ancestor is the tip of its chain the block is just appended to that chain
    • if the ancestor is not the tip of its chain then the chain up to and including that ancestor is cloned and inserted to the vec of chains as a new chain and the new block is appended to that new chain.

A block add is the same operation, regardless of the chain structure:

  1. Find the parent block's state
  2. Clone that state
  3. Update the cloned state based on the content of the new block
  4. Add the new block to the block pool

Since forks can happen in the middle of a chain, we need to keep state for every block that could potentially fork. (The zcashd reorg limit is 100 blocks, blocks can't fork after that limit.)

  • after each update we sort the vec of chains by height
  • after the sort we take the first block of the longest chain, if it is now over 100 blocks long, and commit that to the state layer
  • we then pop that block from the head of all other chains and discard any chains that don't start with that block.

The main chain consensus rule is "the chain with the greatest proof of work, breaking ties with the block this node received first". It's usually the longest chain, but that's not guaranteed.

After every update, we modify the set of chain tips, which is a standard mutable set data structure. We also keep a best tip. We can track tips using Arc references to their chain states.

Here's how updates work:

  1. If the new block is a child of an existing tip, replace that old tip with this new tip
  2. Otherwise, add a new tip to the state
  3. If any tip has more proof of work than the current tip, update the best tip.

Requiring a strictly greater proof of work breaks ties using the first block that the node verified. Since the tie-breaker is non-deterministic, we can use "verified time" rather than "downloaded time", and still converge with other zcashd and zebrad instances. (The ordering will almost always be the same, except if the earlier block is very slow to verify.)

The zcashd pruning rule is "288 blocks behind the best tip". But I think zebrad can prune much earlier, because we can just go fetch the blocks from disk as needed. (And we don't need to fetch the state from disk.)

Here's how pruning old block states works:

  1. Find all in-memory block states that are at least 100 blocks behind the new tip. Let's call them "mature states".
  2. Find the mature states in the main chain, and add their block hashes to the on-disk height-based chain index.
  3. Find the highest mature state in the main chain, and write it to disk as the on-disk state.
  4. Remove all the mature states from the in-memory chain tips and block pool. This drops old side-chain and main chain states.

After a chain reorg, there could be zero, one, or many mature blocks. Some may be child blocks of other mature blocks. So the algorithm has to be generic.

teor2345 added a commit to teor2345/zebra that referenced this issue Jul 27, 2020
We can use this network upgrade to implement different consensus rules
and chain context handling for genesis blocks.

Part of the chain state design in ZcashFoundation#682.
dconnolly pushed a commit that referenced this issue Jul 27, 2020
We can use this network upgrade to implement different consensus rules
and chain context handling for genesis blocks.

Part of the chain state design in #682.
@teor2345
Copy link
Contributor Author

teor2345 commented Aug 7, 2020

We're doing this through the RFC process now.

@teor2345 teor2345 closed this as completed Aug 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-consensus Area: Consensus rule updates C-design Category: Software design work S-needs-design Status: Needs a design decision
Projects
None yet
Development

No branches or pull requests

2 participants