Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transient storage runtime host function #359

Open
gavofyork opened this issue Mar 25, 2020 · 32 comments
Open

Transient storage runtime host function #359

gavofyork opened this issue Mar 25, 2020 · 32 comments
Labels
D1-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. T1-FRAME This PR/Issue is related to core FRAME, the framework.

Comments

@gavofyork
Copy link
Member

Right now we abuse storage for intra-block data such as block number, parent hash and block author as well as various housekeeping information and flags like whether we set the uncles Authorship::DidSetUncles.

When initially writing, this incurs an extra trie lookup, which is slow. Instead there should be another host API, which works exactly like set_storage/get_storage but has no trie backing, so it never tries to lookup the value in the trie, nor does it write the value at the end of the block.

@gavofyork gavofyork added Z2-medium I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. labels Mar 25, 2020
@NikVolf NikVolf self-assigned this Mar 25, 2020
@NikVolf NikVolf closed this as completed Apr 3, 2020
@NikVolf NikVolf reopened this Apr 3, 2020
@NikVolf
Copy link
Contributor

NikVolf commented Apr 3, 2020

So I've seen potential users of this api:

  • in authorship (Author: Option<AccountId>, DidSetUncles: bool)
  • in babe (Initialized: Option<schnorrkel::RawVRFOutput>)
  • contracts (GasSpent: u64)
  • finality-tracker (Update: BlockNumber)
  • timestamp (DidUpdate: bool)

we also need to change decl_storge! and add "transient" option for code generation, since other modules do readings of those values

@bkchr
Copy link
Member

bkchr commented Apr 3, 2020

Everything in system that is removed in on_finalize.

@NikVolf
Copy link
Contributor

NikVolf commented Apr 7, 2020

@bkchr in finalize and initialize (events)

@bkchr
Copy link
Member

bkchr commented Apr 7, 2020

No, events are deleted in the next block and need to stay in storage to be inspectable.

@NikVolf
Copy link
Contributor

NikVolf commented Apr 7, 2020

But isn't it a hack?
If the lifetime of transient storage data is extended by client or whatever is using event data, this might not be required?

@bkchr
Copy link
Member

bkchr commented Apr 7, 2020

I would call it sort of hack. It would need to be stored in the database anyway to make it accessible for the UI. So, any change here probably just requires a ton of changes to make it compatible.

@NikVolf
Copy link
Contributor

NikVolf commented Apr 8, 2020

I wonder if we can avoid calls into the host at all and just keep everything in the runtime memory

@bkchr
Copy link
Member

bkchr commented Apr 8, 2020

No. Memory is resetted between calls.

@NikVolf
Copy link
Contributor

NikVolf commented Apr 8, 2020

But for block execution it does not matter?

@bkchr
Copy link
Member

bkchr commented Apr 8, 2020

But for Block production.

@kianenigma
Copy link
Contributor

So I've seen potential users of this api:

  • in authorship (Author: Option<AccountId>, DidSetUncles: bool)
  • in babe (Initialized: Option<schnorrkel::RawVRFOutput>)
  • contracts (GasSpent: u64)
  • finality-tracker (Update: BlockNumber)
  • timestamp (DidUpdate: bool)

we also need to change decl_storge! and add "transient" option for code generation, since other modules do readings of those values

  • transaction_payment::NextWeightMultiplier.

(cc @shawntabrizi this is very similar to an idea that you also had)

@shawntabrizi
Copy link
Member

My idea was to have some hook in executive/system that reads values from storage (like block number, block author, etc..) and places them in memory and allows all pallets quick access to them without incurring any storage costs.

There should additionally be hooks so that any pallet may introduce some data to be added to this hook, so if someone writes a "system critical pallet" where values from it are used in every block, they can instead take advantage of this in-memory storage.

@xlc
Copy link
Contributor

xlc commented Oct 4, 2022

This helps with #278 and also have many other use cases. Can we have this prioritized.

@cheme
Copy link
Contributor

cheme commented Oct 4, 2022

In itself transient storage do not look like a lot of work to me (I just did draft it here paritytech/substrate@master...cheme:transient_storage), but it did add some host function and can make a few thing a bit more complex (I remember a discussion about making proof of execution for individual extrinsic where we could call storage_root between extrinsic to allow it: this would not be possible anymore). Also it open some question about limiting memory usage.

@bkchr
Copy link
Member

bkchr commented Oct 4, 2022

this would not be possible anymore

Why?

Also it open some question about limiting memory usage.

In what way?

@cheme
Copy link
Contributor

cheme commented Oct 4, 2022

It would be possible but you would need to either attach the transient storage state at start of extrinsic or a root of it, but anyway I don't think it is a use case we want to support (proof of execution of a single extrinsic of a block).

I just fear that the host function would be use to store big blobs, but that was a bit silly (a runtime should not allow doing so).

@burdges
Copy link

burdges commented Oct 4, 2022

We'll want other storage options with other proving semantics ala KZG, so I'm worried if merely doing a hash map is a problem. It's also simpler if you know some storage type simply goes away, but..

If you want a transparent option, then you could mark state writes as transient in the block, and then declare the block invalid if not removed when the block concludes. In this way, the block producer could add these transient markers automatically.

@pepyakin
Copy link
Contributor

pepyakin commented Oct 4, 2022

This helps with #278 and also have many other use cases. Can we have this prioritized.

Since then, we also had some discussions which may be useful to contextualize this a bit.

  1. The discussion about custom storage tries. I am not able to find the issue to link, but the general idea is to give the Substrate Runtime ability to manipulate data structures other than the vanilla key-value storage available right now. Such a structure would be similar to the existing child-tries (and in fact, CTs were designed to accommodate different types of tries already), but the allowed data patterns or rules will be different. Besides different tree types, one of the use cases was transient data structures. Another would be append-only data structures. Each type of the trie would be manipulated by a specifically designed API.
  2. There is another angle that storage tries could be attacked: Block building within the same wasm memory? substrate#10557. Right now, during block building, we assume that each call/extrinsic has a pristine memory space untouched by prior runtime calls. If we lift that constraint, it will be possible to have a similar construction to transient storage described here, but not equivalent. Still worth keeping this option in mind.

@bkchr
Copy link
Member

bkchr commented Oct 4, 2022

It would be possible but you would need to either attach the transient storage state at start of extrinsic or a root of it, but anyway I don't think it is a use case we want to support (proof of execution of a single extrinsic of a block).

I just fear that the host function would be use to store big blobs, but that was a bit silly (a runtime should not allow doing so).

As these elements are not part of the trie, do we really need to add them to the storage root? I don't think so.

@cheme
Copy link
Contributor

cheme commented Oct 5, 2022

It would be possible but you would need to either attach the transient storage state at start of extrinsic or a root of it, but anyway I don't think it is a use case we want to support (proof of execution of a single extrinsic of a block).
I just fear that the host function would be use to store big blobs, but that was a bit silly (a runtime should not allow doing so).

As these elements are not part of the trie, do we really need to add them to the storage root? I don't think so.

As part of this issue of course not, as part of the use case where a modified system runtime would call and store storage_root between each extrinsic (to create proof for single extrinsic), it may indeed not be needed and in case it is (let's say to harden stuff or for legal reason) the transient storage could still be stored to storage before intermediary storage_root calls (as long as the transient storage provide an iterator).

@bkchr
Copy link
Member

bkchr commented Oct 5, 2022

You should only store information in transient storage that are not important for the state, IMO. Aka you would also not need it for any intermediate proof. Data that isn't being able to be accessed later, doesn't need to be part of the storage root.

@arkpar
Copy link
Member

arkpar commented Oct 5, 2022

Related: paritytech/substrate#9170

A relatively simple solution that I suggested there is to add a revert function. The actual trie lookup is happening when the transient values are deleted from storage. If there was a function that deleted them only from the memory overlay, that would be enough to prevent the trie lookup.

@bkchr
Copy link
Member

bkchr commented Nov 8, 2022

Something semi related would be to have some kind of storage item that you can write in a block, but reading it in a block would always return the old value. Currently such a behavior could be achieved by having some intermediate storage item that you "move" in on_finalize to the correct storage item. This could be used in Aura for example for the authorities. Currently when there is a new session we directly overwrite the authorities. This leads to things like FindAuthor returning the author based on the new set (which is clearly wrong).

@cheme
Copy link
Contributor

cheme commented Nov 8, 2022

Something semi related would be to have some kind of storage item that you can write in a block, but reading it in a block would always return the old value

I did implement it in the past (as a way to avoid lock in an experimental thread branch), it is pretty straightforward to do (just don't query the change overlay in state-machine).

@bkchr
Copy link
Member

bkchr commented Nov 8, 2022

Yeah, it shouldn't be too hard to implement. Just will require some new host function.

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/generalized-storage-proofs/1315/5

@shawntabrizi
Copy link
Member

shawntabrizi commented Jun 16, 2023

Just FYI this is one of the features Cosmos SDK already provides to their users:

https://docs.cosmos.network/v0.46/core/store.html#transient-store

@burdges
Copy link

burdges commented Jun 16, 2023

Anyone know if statics already just work in substrate?

@bkchr
Copy link
Member

bkchr commented Jun 20, 2023

Anyone know if statics already just work in substrate?

No.

@JoshOrndorff
Copy link
Contributor

Doesn't the storage overlay optimization prevent things that are stored and also removed before the end of the block from going to the db anyway?

@burdges
Copy link

burdges commented Aug 15, 2023

A PoV block needs the copath elements in the PoV data to prove there was nothing there already.

It obviously makes no sense to use storage for ephemeral stuff, instead execute_block should be passing some &mut T between the components it calls.

@bkchr
Copy link
Member

bkchr commented Aug 16, 2023

Doesn't the storage overlay optimization prevent things that are stored and also removed before the end of the block from going to the db anyway?

The point is that you should not be required to clear this data at the end and it is thrown away automatically for you.

@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added T1-FRAME This PR/Issue is related to core FRAME, the framework. D1-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. and removed T1-runtime labels Aug 25, 2023
jonathanudd pushed a commit to jonathanudd/polkadot-sdk that referenced this issue Apr 10, 2024
* relay-ethereum-client

* use relay-ethereum-client from ethereum-poa-relay

* cargo fmt --all

* #![warn(missing_docs)]

* EthereumRpcClient -> EthereumClient

* make EthereumHeadersSyncPipeline private

* return concrete type from crate::new

* cleanup dependencies

* *self -> self

* remove trait Client

* sort deps
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
D1-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: Backlog
Development

No branches or pull requests