Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vision: Host functions and database support for non-Merklised-persistent data structures #245

Open
gavofyork opened this issue Oct 28, 2022 · 4 comments
Assignees
Labels
I5-enhancement An additional feature request. T1-FRAME This PR/Issue is related to core FRAME, the framework.

Comments

@gavofyork
Copy link
Member

gavofyork commented Oct 28, 2022

Related: #278
Related: #359

NOTE: This is intended to evolve into a full RFC, but is at present more of a bare collection of notes.

New overlay items:

  • Vec<u8>, keyed by name: Name.
  • BTreeMap<Vec<u8>, Vec<u8>>, keyed by name: Name.

These obey transactional principles. They may, at the direction of the runtime through a host function, be archived in a regular, non-Merklised database for later off-chain querying. An RPC may be provided in order to query the contents (if stored).

There are two sets of new host functions for creating, manipulating and querying these items; one set for the Vec<_>s and one for the BTreeMap<_>s.

The set for the Vec<_>s are prefixed blob_ and are:

  • blob_new(name: Name, mode: Mode)
  • blob_set(name: Name, value: &[u8])
  • blob_clone(name: Name, target_name: Name)
  • blob_rename(name: Name, target_name: Name)
  • blob_edit(name: Name, data: &[u8], offset: u32) -> u32
  • blob_append(name: Name, suffix: &[u8])
  • blob_exists(name: Name) -> bool
  • blob_get(name: Name) -> Option<Vec<u8>>
  • blob_len(name: Name) -> Option<u32>
  • blob_hash32(name: Name, algorithm: Hash32Algorithm) -> Option<[u8; 32]>
  • blob_delete(name: Name)

The set for the BTreeMap<_>s are prefixed map_ and are:

  • map_new(name: Name, mode: Mode): Creates a new map name with no items. It is cleared if it already exists.
  • map_clone(name: Name, target_name: Name): Creates a clone of the map name with name target_name.
  • map_rename(name: Name, target_name: Name): Alters the name of map name to target_name.
  • map_insert(name: Name, key: &[u8], value: &[u8]): Inserts a single (key, value) pair into map name, creating the map if it did not previously exist and overwriting the item if it did.
  • map_remove_item(name: Name, key: &[u8]): Removes the pair with the given key from the map name, if the map exists and contains the item. Does nothing otherwise.
  • map_exists(name: Name, key: &[u8]) -> bool: Returns true iff the map name is present in execution state.
  • map_contains(name: Name, key: &[u8]) -> Option<bool>: Returns Some iff the map name exists, None otherwise. The inner value of Some is true iff the map name contains key.
  • map_item_get(name: Name, key: &[u8]) -> Option<Vec<u8>>: Returns Some iff the map name exists and contains key. If so, the inner value is that associated with key.
  • map_item_len(name: Name, key: &[u8]) -> Option<u32>: Returns Some iff the map name exists and contains key. If so, the inner value is the length of the value associated with key.
  • map_item_hash32(name: Name, key: &[u8], algorithm: Hash32Algorithm) -> Option<[u8; 32]>: Returns Some iff the map name exists and contains key. If so, the inner value is the value associated with key when hashed with algorithm.
  • map_count(name: Name) -> Option<u32>: Returns Some iff the map name exists, None otherwise. If Some, then the inner value is the number of items in the map name.
  • map_root32(name: Name, structure: Root32Structure) -> Option<[u8; 32]>: Calculates and returns the root of the data structure structure containing the items held in the map name. Returns None if map name does not exist.
  • map_dump(name: Name) -> Option<Vec<(Vec<u8>, Vec<u8>)>>: Returns Some of a Vec of all items in the map name, sorted; or None if the map name does not exist.
  • map_dump_hashed(name: Name, algorithm: Hash32Algorithm) -> Option<Vec<([u8; 32], [u8; 32])>>: Returns Some of a Vec of all pairs of keys and values in the map name hashed with algorithm and in order of the (unhashed) key; or None if the map name does not exist.
  • map_next_key(name: Name, key: &[u8], count: u32) -> Option<Vec<Vec<u8>>>: Returns up to the next count keys in map name immediately following key. If fewer items exist after key than count, then only the remaining items are returned. If the map name does not exist then None is returned.
  • map_delete(name: Name): Delete the map name, clearing all data.

There exists Hash32Algorithm and Root32Structure which should ultimately be defined in separate RFCs.

Mode is defined as:

enum Mode {
    #[codec(index = 0)]
    Drop,
    #[codec(index = 1)]
    Archive,
}
  • Drop: The data may be discarded at the end of block execution without possibility of later querying, on-chain or off-chain. If creation of an item occurs without an explicit mode being given, then the default mode ThrowAway is assumed.
  • Archive: The data should be retained in an database associated with the block which is executing. It will not be accessible on-chain in later blocks (unless e.g. oraclised in some way).

It is intended that future RFCs introduce additional items to this as needed.

Runtime interfaces may exist allowing the runtime to generate proofs for light-clients.

These items can be used to ween ourselves from (mis-)using the Main State Trie, especially for large items which should never be in the PoV.

This functionality can be used to remove system events and code (:CODE) from the main state trie, avoiding possible PoV issues and the problems with storing events as a Vec<Event> (effectively concatenating encoded items). It allows for custom indexing and proof systems, improving efficiency and efficacy.

The host function map_dump_hashed allows for arbitrary digest ("Merkle hash") calculation within the runtime, allowing e.g. indexing by event topic and for proofs to use a base-2 Merkle structure. Multiple roots could be calculated to optimise for multiple indexing systems.

The host function blob_hash32 allows for comparison and retrieval of large blobs, e.g. of runtime code. The location of the "actual" code could even be moved outside of the main state into one of these items.

Storage values and maps may be marked as non-persistent and these host functions may be used, resulting in better performance as well as guaranteed non-persistence. This is useful for block-specific data (like block height), contextual information (e.g. transaction index) as well safe "thread-local" storage.

Future Work

The Mode may be extended to include the possibility of persisting the regular (non-Merklised) key/value database between blocks. Additional configuration may allow a map's keys to be of fixed length.

@cheme
Copy link
Contributor

cheme commented Oct 28, 2022

Small question, I wonder if for the blob api we could address a given position, something like:
blob_set_at(name: [u8; 32], data: &[u8], offset: u32) to avoid needing to pass the whole data in memory.
Could be true for access too (so we can stream content in memory efficient way).
Also could make mmr proof or similar merklized compact structure over indexed content (the blob) directly usable with the api.

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/new-host-functions-for-non-merklised-optionally-persisted-data-types-substrate-12577/922/1

@burdges
Copy link

burdges commented Nov 8, 2022

As an aside, blake2 has a tree-hashing mode. It seemingly tuns a blob: Vec<u8> into a Merklised persistent data structure, with 32*log(blob.len()/32) reads and writes, but inefficient insertions, deletions, and moves.

Amusingly, I'd expect this gives smaller Merkle proofs than using our trie directly, given our radix 16 trie bloats PoVs by 4x in dense trees, but even more size efficient options exist for typical Merkle proofs.

@cheme
Copy link
Contributor

cheme commented Nov 8, 2022

I remember seeing different hash function using tree structure, but did not do my homework yet, remember also blake3 using it by default. Also think those kind of hash function are good candidates for blob_hash32 (but maybe just an mmr structure is easier to garantie some good binary change performance).
Edit: blake3 looks like a good candidate to me, https://github.com/BLAKE3-team/BLAKE3/blob/master/src/guts.rs can make it rather simple to keep the tree of hash in memory (not sure of the size for keeping it on disk) and allowing quick computation for random incremental change. Would need testing though :)

@kianenigma kianenigma changed the title Host functions and database support for non-Merklised-persistent data structures Vision: Host functions and database support for non-Merklised-persistent data structures Mar 10, 2023
@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added I5-enhancement An additional feature request. T1-FRAME This PR/Issue is related to core FRAME, the framework. and removed J0-enhancement labels Aug 25, 2023
github-merge-queue bot pushed a commit that referenced this issue Nov 5, 2024
This PR updates litep2p to the latest release.

- `KademliaEvent::PutRecordSucess` is renamed to fix word typo
- `KademliaEvent::GetProvidersSuccess` and
`KademliaEvent::IncomingProvider` are needed for bootnodes on DHT work
and will be utilized later


### Added

- kad: Providers part 8: unit, e2e, and `libp2p` conformance tests
([#258](paritytech/litep2p#258))
- kad: Providers part 7: better types and public API, public addresses &
known providers ([#246](paritytech/litep2p#246))
- kad: Providers part 6: stop providing
([#245](paritytech/litep2p#245))
- kad: Providers part 5: `GET_PROVIDERS` query
([#236](paritytech/litep2p#236))
- kad: Providers part 4: refresh local providers
([#235](paritytech/litep2p#235))
- kad: Providers part 3: publish provider records (start providing)
([#234](paritytech/litep2p#234))

### Changed

- transport_service: Improve connection stability by downgrading
connections on substream inactivity
([#260](paritytech/litep2p#260))
- transport: Abort canceled dial attempts for TCP, WebSocket and Quic
([#255](paritytech/litep2p#255))
- kad/executor: Add timeout for writting frames
([#277](paritytech/litep2p#277))
- kad: Avoid cloning the `KademliaMessage` and use reference for
`RoutingTable::closest`
([#233](paritytech/litep2p#233))
- peer_state: Robust state machine transitions
([#251](paritytech/litep2p#251))
- address_store: Improve address tracking and add eviction algorithm
([#250](paritytech/litep2p#250))
- kad: Remove unused serde cfg
([#262](paritytech/litep2p#262))
- req-resp: Refactor to move functionality to dedicated methods
([#244](paritytech/litep2p#244))
- transport_service: Improve logs and move code from tokio::select macro
([#254](paritytech/litep2p#254))

### Fixed

- tcp/websocket/quic: Fix cancel memory leak
([#272](paritytech/litep2p#272))
- transport: Fix pending dials memory leak
([#271](paritytech/litep2p#271))
- ping: Fix memory leak of unremoved `pending_opens`
([#274](paritytech/litep2p#274))
- identify: Fix memory leak of unused `pending_opens`
([#273](paritytech/litep2p#273))
- kad: Fix not retrieving local records
([#221](paritytech/litep2p#221))

See release changelog for more details:
https://github.com/paritytech/litep2p/releases/tag/v0.8.0

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Dmitry Markin <dmitry@markin.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I5-enhancement An additional feature request. T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: Draft
Status: Backlog
Development

No branches or pull requests

7 participants