Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Inner hashing of value in state trie (chainspec versioning). #8931

Closed
wants to merge 142 commits into from

Conversation

cheme
Copy link
Contributor

@cheme cheme commented May 28, 2021

This PR brings alternate hashing of value in substrate.

It uses internally and depends on paritytech/trie#134.

Basically values in trie node got hashed a first time when they are bigger than a given threshold, such as:

  • No threshold or value size less than threshold:
    hash_leaf = hash(node_header ++ key_partial ++ encoded_value)
  • alt hashing value size >= threshold
    hash_leaf = hash(node_header ++ key_partial ++ hash(encoded_value))

and same thing for branches with value.

To avoid migrating data, the alt hashing uses their own node header.

Note that size threshold is allowed to change and is only use when calculating trie root (triedbmut usage) in
order to possibly change type of hashing of modified and new nodes.

Node Headers for alt_hashing are using unused range of values of previous headers (and possibly expands to an additional byte),
making this backward compatible.

The threshold is define as a value in trie state at :trie_hashing_conf as a compact u32 value.
So if a trie state does contain this value it will not be backward compatible (no reason to be the case).

Resolution of the threshold is done against StateVersion defined in chain spec.

Remaining TODOs

  • codec of trie proof need rework

polkadot branch: https://github.com/cheme/polkadot-1/tree/state-update4
cumulus branch: https://github.com/cheme/cumulus/tree/state-update4

trie_nodes: Vec<Vec<u8>>,
// TODO decode no more bytes to V0 for compatibility and remove pub(crate)
pub(crate) trie_nodes: Vec<Vec<u8>>,
pub(crate) state_version: StateVersion,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes proof format by adding state version.
Unresolve question, what is better:

  • decode to V0 if state version not encoded, so previous code is guaranted unbroken.
  • decode to V1 default if state version is not encoded, so it takes less size, but need coordinated update.
    (for compatibility I can use a two byte header in the range of compact value encoding that does not exist by using U8_OUT_OF_RANGE_VALUE for state version)

@arkpar
Copy link
Member

arkpar commented Sep 2, 2021

An update on some design decisions after an internal discussion:

  1. Switching to a new trie format by the runtime is best done by versioning the state_root function. At some point runtime starts calling state_root_v2 instead of state_root and all the new trie nodes are generated in the new format. This is much simpler than adding transition blocks in the chain spec or analyzing digest items in the client.
  2. Lazy migration of existing trie nodes will still be required to avoid blowing up PoV size for parachains if we want to keep existing safety guarantees during the migration (i.e. not require trusting the collators on the new state root).
  3. Therefore migration will be controlled by the runtime. The simplest way is to allow runtime to touch a sequence of keys in the block, so that they are re-inserted into the trie with the new node format.

@cheme
Copy link
Contributor Author

cheme commented Sep 2, 2021

About point 1. I have some concern (more work than expected see #8931 (comment) last paragraph).

@cheme
Copy link
Contributor Author

cheme commented Sep 8, 2021

I pushed a bit in the direction of using new host function in master...cheme:state-update4-host2 , still need to test, but looks promising , so part of this PR with state_versioning may be removed soon.
(the switch is made using new root host function and getting the version from RuntimeVersion (when core api version > 3), so core api version need update too).

@cheme cheme added A3-in_progress Pull request is in progress. No review needed at this stage. and removed A0-please_review Pull request needs code review. labels Sep 9, 2021
@cheme cheme changed the title Inner hashing of value in state trie. Inner hashing of value in state trie (chainspec versioning). Sep 9, 2021
@cheme
Copy link
Contributor Author

cheme commented Sep 10, 2021

Closing in favor of #9732

@cheme cheme closed this Sep 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A3-in_progress Pull request is in progress. No review needed at this stage. C1-low PR touches the given topic and has a low impact on builders. D2-breaksapi
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants