Node deletion sometimes infeasible in multiproofs #6

perama-v · 2023-06-13T02:35:00Z

Problem

When executing a block using proof-based state data:

IF there is a key present (inclusion proof) that is deleted (converted to exclusion proof)
AND the key belongs to a parent branch with exactly 2 children
AND the sibling node is an extension
AND the grandparent node is an extension
THEN there is insufficient information to update the trie to get the new root.

Explanation

There appears to be insufficient information to construct post-block state root.

There are situations where a key is being removed from the trie during block execution. When a leaf node is
one of two children of a branch node, that branch node is now not required.

It can be seen that branch A would only have one child if the leaf were removed.

flowchart TD
    A[some traversal ...] --> B[extension 'abc'] --> F[branch A]
    F --item 1--> C[extension '2345'] --> D[branch B]
    F --item d--> E[leaf being deleted with path '...abcdef']

The updated trie needs:

Branch A deleted
Extension nodes combined

flowchart TD
    A[some traversal ...] --> B[extension 'abc12345'] --> D[branch B]

However, the trie is represented as a merkle multiproof, either for

all the accounts accessed in a block or
all the storage slots accessed for an account in a block

Nodes not accessed during block execution, such as branch A and extension '2345' are not present.
Here is what is present:

flowchart TD
    A[some traversal ...] --> B[extension 'abc'] --> F[branch A]
    F -.item 1.-> C[hash of extension '2345']
    F --item d--> E[leaf being deleted with path '...abcdef']

So it is not possible to know the contents of extension '2345' (perhaps it is '2', or '23456'). Hence
the two extensions cannot be combined. This prevents hashes from being known all the way back to the root.

Impact

One could execute a block without trusting a third party, but verification of the post-block state root would not be possible locally.

As blocks are deterministic this represents a loss of a nice sanity check for correct local block execution.

However the results of the EVM execution (full block trace) can still be obtained completely.

carver · 2023-06-13T19:04:06Z

Yup, this is an tricky corner case of updating the state root (I ran into it while building Beam Sync). The solution we had is that when building the proof for a block, to add the trie nodes accessed during the process of building the final state root. (For example, Branch B here)

perama-v · 2023-06-16T08:12:20Z

Ok thanks that's helpful. I'll see if there is a good way to map that solution to this problem

One idea is that for all keys that have inclusion proof that becomes an exclusion proof, include
the exclusion proof from the subsequent block.

Current process

When assembling the proof for block n:

Get pre-block state (debug_traceBlock for block n - 1 with prestateTracer)
For every key accessed, call eth_getProof against block n - 1
Combine results for block n - 1 in to a multiproof
Execute block n, editing the multiproof as states are altered
- Problem: When a key is removed and a sibling node needs to be modified

Possible process

When assembling the proof for block n:

Get pre-block state (debug_traceBlock for block n - 1 with prestateTracer)
For every key accessed, call eth_getProof against
- block n - 1
- 💡 block n, and if a key inclusion proof has become an exclusion proof, retain the proof.
Combine results for block n - 1 in to a multiproof
Execute block n, editing the multiproof as states are altered
- 💡When a key is removed and a sibling node needs to be modified, the sibling can be retrieved
  from the exclusion proof provided in proofs for block n.
  - If the exclusion proof has a node that matches the required node, cascade hashes up to the root.
  - If the exclusion proof does not match, some edit is pending later in the block. Wait until all the transactions have been completed, then look for the match.

flowchart TD
    A[some traversal in block n...] --> B[extension 'abc12345'] --> D[branch B]
    B -.-> E[deleted key exclusion proof for path '...abcdef']

This effectively makes available the preimage for the sibling node extension '2345'.

flowchart TD
    A[traversal in multiproof from block n - 1 being updated...] --> B[extension 'abc'] --> F[branch A]
    F --item 1--> C[*Now known* extension '2345'] --> D[branch B]
    F --item d--> E[leaf being deleted with path '...abcdef']

Potentially some different tree structure scenarios to think about and check to see if this
has any problems.

perama-v · 2023-06-17T04:41:21Z

Patterns explored and cases that require knowledge of a node where only the keccak(node) is present
are as follows:

The leaf being removed has only one sibling node
The sibling node is ~~an extension~~ either an extension or a leaf

In a trio (grandparent-parent-sibling):

EBE & BBE: Additional data required.
- EBE -> E
- BBE -> BE
EBB & BBB: No additional data required.
- EBE -> EB
- BBB -> BEB

Diagrams: 22c9a1e

perama-v · 2023-06-19T05:16:38Z

Erratum: When the sibling is a leaf this also requires knowledge of that node.

This is because the leaf path (part of the node) will be edited to include the path nibble that was part of the deleted parent.

perama-v · 2023-06-20T05:31:07Z

The process of fetching nodes from the subsequent block appears infeasible in some cases, as follows:

Before:

Grandparent (branch)
- Parent (branch, for deletion)
  - Leaf (removed_key being removed from trie)
  - Sibling leaf (at some index/nibble in parent)

After:

Grandparent (branch)
- Sibling leaf, with nibble appended to path

The sibling leaf node is required for editing in order to add to its path. The method to call eth_getProof(removed_key)
in the subsequent block gets an exclusion proof for the removed leaf. However, the terminal node in that
exclusion proof will be the grandparent. Thus, the sibling leaf remains unknown.

Conclusion

The internal node could be obtained from the node via non-rpc methods (direct DB access)
The goal of obtaining the post-block hash could be abandoned. The purpose is for an assurance that the block trace was executed correctly. Alternatives:
- Create test cases where the post-block state values are recorded in a simple map (rather than a multiproof) and then call eth_getProof for those keys in the subsequent block. Verify they match.
- Trace a sample of blocks and verify they are equal to the debug_traceBlock EIP-3155 compatible output of an archive node.

perama-v · 2023-08-29T09:07:35Z

Conclusion

The internal node could be obtained from the node via non-rpc methods (direct DB access)

The goal of obtaining the post-block hash could be abandoned. The purpose is for an assurance that the block trace was executed correctly. Alternatives:

Create test cases where the post-block state values are recorded in a simple map (rather than a multiproof) and then call eth_getProof for those keys in the subsequent block. Verify they match.

Trace a sample of blocks and verify they are equal to the debug_traceBlock EIP-3155 compatible output of an archive node.

Have not given this extensive thought, but I think there is another option:
3. Detect removed keys by calling for proofs on the subsequent block, and including those in the distributed data.

Algorithm:

Obtain additional trie nodes for removed keys.
Call eth_getProof again for each state key, this time for the block of interest (block n).
For any key that has an inclusion proof in n - 1, and an exclusion proof in block n, retain the proof for that key.
Store these additional exclusion proofs in the RequiredBlockState data structure.
When a user re-executes the block, if a key is removed in a way that requires additional knowledge
of trie nodes to complete the trie, they can be sourced from the additional exclusion proofs.

perama-v mentioned this issue Jun 13, 2023

Validate post-block state root #4

Open

perama-v mentioned this issue Jun 22, 2023

Post tx root #5

Merged

perama-v mentioned this issue Aug 10, 2023

Add eth_getRequiredBlockState method ethereum/execution-apis#455

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Node deletion sometimes infeasible in multiproofs #6

Node deletion sometimes infeasible in multiproofs #6

perama-v commented Jun 13, 2023

carver commented Jun 13, 2023

perama-v commented Jun 16, 2023

perama-v commented Jun 17, 2023 •

edited

Loading

perama-v commented Jun 19, 2023

perama-v commented Jun 20, 2023

perama-v commented Aug 29, 2023

Conclusion

Node deletion sometimes infeasible in multiproofs #6

Node deletion sometimes infeasible in multiproofs #6

Comments

perama-v commented Jun 13, 2023

Problem

Explanation

Impact

carver commented Jun 13, 2023

perama-v commented Jun 16, 2023

Current process

Possible process

perama-v commented Jun 17, 2023 • edited Loading

perama-v commented Jun 19, 2023

perama-v commented Jun 20, 2023

Conclusion

perama-v commented Aug 29, 2023

Conclusion

perama-v commented Jun 17, 2023 •

edited

Loading