Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node deletion sometimes infeasible in multiproofs #6

Open
perama-v opened this issue Jun 13, 2023 · 6 comments
Open

Node deletion sometimes infeasible in multiproofs #6

perama-v opened this issue Jun 13, 2023 · 6 comments

Comments

@perama-v
Copy link
Owner

Problem

When executing a block using proof-based state data:

  • IF there is a key present (inclusion proof) that is deleted (converted to exclusion proof)
  • AND the key belongs to a parent branch with exactly 2 children
  • AND the sibling node is an extension
  • AND the grandparent node is an extension
  • THEN there is insufficient information to update the trie to get the new root.

Explanation

There appears to be insufficient information to construct post-block state root.

There are situations where a key is being removed from the trie during block execution. When a leaf node is
one of two children of a branch node, that branch node is now not required.

It can be seen that branch A would only have one child if the leaf were removed.

flowchart TD
    A[some traversal ...] --> B[extension 'abc'] --> F[branch A]
    F --item 1--> C[extension '2345'] --> D[branch B]
    F --item d--> E[leaf being deleted with path '...abcdef']
Loading

The updated trie needs:

  • Branch A deleted
  • Extension nodes combined
flowchart TD
    A[some traversal ...] --> B[extension 'abc12345'] --> D[branch B]
Loading

However, the trie is represented as a merkle multiproof, either for

  • all the accounts accessed in a block or
  • all the storage slots accessed for an account in a block

Nodes not accessed during block execution, such as branch A and extension '2345' are not present.
Here is what is present:

flowchart TD
    A[some traversal ...] --> B[extension 'abc'] --> F[branch A]
    F -.item 1.-> C[hash of extension '2345']
    F --item d--> E[leaf being deleted with path '...abcdef']
Loading

So it is not possible to know the contents of extension '2345' (perhaps it is '2', or '23456'). Hence
the two extensions cannot be combined. This prevents hashes from being known all the way back to the root.

Impact

One could execute a block without trusting a third party, but verification of the post-block state root would not be possible locally.

As blocks are deterministic this represents a loss of a nice sanity check for correct local block execution.

However the results of the EVM execution (full block trace) can still be obtained completely.

@carver
Copy link

carver commented Jun 13, 2023

Yup, this is an tricky corner case of updating the state root (I ran into it while building Beam Sync). The solution we had is that when building the proof for a block, to add the trie nodes accessed during the process of building the final state root. (For example, Branch B here)

@perama-v
Copy link
Owner Author

Ok thanks that's helpful. I'll see if there is a good way to map that solution to this problem

One idea is that for all keys that have inclusion proof that becomes an exclusion proof, include
the exclusion proof from the subsequent block.

Current process

When assembling the proof for block n:

  • Get pre-block state (debug_traceBlock for block n - 1 with prestateTracer)
  • For every key accessed, call eth_getProof against block n - 1
  • Combine results for block n - 1 in to a multiproof
  • Execute block n, editing the multiproof as states are altered
    • Problem: When a key is removed and a sibling node needs to be modified

Possible process

When assembling the proof for block n:

  • Get pre-block state (debug_traceBlock for block n - 1 with prestateTracer)
  • For every key accessed, call eth_getProof against
    • block n - 1
    • 💡 block n, and if a key inclusion proof has become an exclusion proof, retain the proof.
  • Combine results for block n - 1 in to a multiproof
  • Execute block n, editing the multiproof as states are altered
    • 💡When a key is removed and a sibling node needs to be modified, the sibling can be retrieved
      from the exclusion proof provided in proofs for block n.
      • If the exclusion proof has a node that matches the required node, cascade hashes up to the root.
      • If the exclusion proof does not match, some edit is pending later in the block. Wait until all the transactions have been completed, then look for the match.
flowchart TD
    A[some traversal in block n...] --> B[extension 'abc12345'] --> D[branch B]
    B -.-> E[deleted key exclusion proof for path '...abcdef']
Loading

This effectively makes available the preimage for the sibling node extension '2345'.

flowchart TD
    A[traversal in multiproof from block n - 1 being updated...] --> B[extension 'abc'] --> F[branch A]
    F --item 1--> C[*Now known* extension '2345'] --> D[branch B]
    F --item d--> E[leaf being deleted with path '...abcdef']
Loading

Potentially some different tree structure scenarios to think about and check to see if this
has any problems.

@perama-v
Copy link
Owner Author

perama-v commented Jun 17, 2023

Patterns explored and cases that require knowledge of a node where only the keccak(node) is present
are as follows:

  • The leaf being removed has only one sibling node
  • The sibling node is an extension either an extension or a leaf

In a trio (grandparent-parent-sibling):

  • EBE & BBE: Additional data required.
    • EBE -> E
    • BBE -> BE
  • EBB & BBB: No additional data required.
    • EBE -> EB
    • BBB -> BEB

Diagrams: 22c9a1e

@perama-v
Copy link
Owner Author

Erratum: When the sibling is a leaf this also requires knowledge of that node.

This is because the leaf path (part of the node) will be edited to include the path nibble that was part of the deleted parent.

@perama-v
Copy link
Owner Author

The process of fetching nodes from the subsequent block appears infeasible in some cases, as follows:

Before:

  • Grandparent (branch)
    • Parent (branch, for deletion)
      • Leaf (removed_key being removed from trie)
      • Sibling leaf (at some index/nibble in parent)

After:

  • Grandparent (branch)
    - Sibling leaf, with nibble appended to path

The sibling leaf node is required for editing in order to add to its path. The method to call eth_getProof(removed_key)
in the subsequent block gets an exclusion proof for the removed leaf. However, the terminal node in that
exclusion proof will be the grandparent. Thus, the sibling leaf remains unknown.

Conclusion

  1. The internal node could be obtained from the node via non-rpc methods (direct DB access)
  2. The goal of obtaining the post-block hash could be abandoned. The purpose is for an assurance that the block trace was executed correctly. Alternatives:
    • Create test cases where the post-block state values are recorded in a simple map (rather than a multiproof) and then call eth_getProof for those keys in the subsequent block. Verify they match.
    • Trace a sample of blocks and verify they are equal to the debug_traceBlock EIP-3155 compatible output of an archive node.

@perama-v
Copy link
Owner Author

Conclusion

  1. The internal node could be obtained from the node via non-rpc methods (direct DB access)

  2. The goal of obtaining the post-block hash could be abandoned. The purpose is for an assurance that the block trace was executed correctly. Alternatives:

    • Create test cases where the post-block state values are recorded in a simple map (rather than a multiproof) and then call eth_getProof for those keys in the subsequent block. Verify they match.
    • Trace a sample of blocks and verify they are equal to the debug_traceBlock EIP-3155 compatible output of an archive node.

Have not given this extensive thought, but I think there is another option:
3. Detect removed keys by calling for proofs on the subsequent block, and including those in the distributed data.

Algorithm:

  1. Obtain additional trie nodes for removed keys.
  2. Call eth_getProof again for each state key, this time for the block of interest (block n).
  3. For any key that has an inclusion proof in n - 1, and an exclusion proof in block n, retain the proof for that key.
  4. Store these additional exclusion proofs in the RequiredBlockState data structure.
  5. When a user re-executes the block, if a key is removed in a way that requires additional knowledge
    of trie nodes to complete the trie, they can be sourced from the additional exclusion proofs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants