Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Append overlay refactor proposal #13940

Open
wants to merge 40 commits into
base: master
Choose a base branch
from
Open

Append overlay refactor proposal #13940

wants to merge 40 commits into from

Conversation

cheme
Copy link
Contributor

@cheme cheme commented Apr 18, 2023

This branch propose to avoid clones in append by storing offset and size in previous overlay depth.
That way on rollback we can just truncate and change size of existing value.
To avoid copy it also means that :

  • append on new overlay layer if there is an existing value: create a new Append entry with previous offsets, and take memory of previous overlay value.
  • rollback on append: restore value by applying offsets and put it back in previous overlay value
  • commit on append: appended value overwrite previous value (is an empty vec as the memory was taken). offsets of commited layer are dropped, if there is offset in previous overlay layer they are maintained.
  • set value (or remove) when append offsets are present: current appended value is moved back to previous overlay value with offset applied and current empty entry is overwrite (no offsets kept).

The modify mechanism is not needed anymore.
This branch lacks testing and break some existing genericity (bit of duplicated code), but good to have to check direction.

Generally I am not sure if it is worth or we just should favor differents directions (transients blob storage for instance), as the current append mechanism is a bit tricky (having a variable length in first position means we sometime need to insert in front of a vector).

@cheme cheme added the A3-in_progress Pull request is in progress. No review needed at this stage. label Apr 25, 2023
@cheme
Copy link
Contributor Author

cheme commented Apr 25, 2023

Please don't review yet (will change design a bit to be closest to paritytech/polkadot-sdk#30)

cheme added 3 commits April 25, 2023 15:38
- append always set an append item
- move data only between two consecutive append
- store size and offset at current depth
@ggwpez
Copy link
Member

ggwpez commented Apr 25, 2023

Generally I am not sure if it is worth or we just should favor differents directions (transients blob storage for instance), as the current append mechanism is a bit tricky (having a variable length in first position means we sometime need to insert in front of a vector).

So this applies to all appendable storage structus?
Not sure if it is worth it, but could we selectively add this with to a StorageValue with an append_only attribute?
AFAIK we only need this fast truncate on append-only vectors like System::Events, but the advantage is that they are never read.

Otherwise we should probably add a test to check that this solution has a low memory footprint on deeply nested reverts (as expected).

@cheme
Copy link
Contributor Author

cheme commented Apr 26, 2023

So this applies to all appendable storage structus?

yes all storage item containing a value. But as I did comment a bit in the code, using a same key value with sp_io::set and later with sp_io::append is really bad and no sensible runtime should allow it. (if you got a value that is encoded Vec of length 3 so starting with a compact, call to append with a 4 byte encoded vec will change the current byte length to 4 and append the content, then reading will get nothing sensible.

Not sure if it is worth it, but could we selectively add this with to a StorageValue with an append_only attribute?

So yes in practice it could be in a different key space, but that is basically what transient storage does (need to copy final value in some finalize to a standard state value though).
Alternatively (or extending the transient storage) a child trie/state could be exclusive to append value.

AFAIK we only need this fast truncate on append-only vectors like System::Events, but the advantage is that they are never read.

yes, but we cannot ensure it. In my next change, the storage will make this usecase better (with paritytech/polkadot-sdk#30, if only writes followed by a single read (root calculation), then we will not have anymore the costy resize each time the number of element Compact change in size (or a single one)).

@cheme cheme requested a review from koute as a code owner April 26, 2023 14:05
@cheme cheme added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Apr 26, 2023
@cheme
Copy link
Contributor Author

cheme commented Apr 26, 2023

I think I did align with paritytech/polkadot-sdk#30 , should be reviewable (requires more test, especially since most test access on every transaction change and thus render the data).

@cheme cheme requested a review from a team May 16, 2023 15:41
@cheme
Copy link
Contributor Author

cheme commented May 16, 2023

If this works out, then we maybe dont need #14120

I really think storing big value is a mistake so at some point paged list really make sense.
(ability to produce way smaller proofs at small cost)

@ggwpez
Copy link
Member

ggwpez commented May 16, 2023

If this works out, then we maybe dont need #14120

I really think storing big value is a mistake so at some point paged list really make sense. (ability to produce way smaller proofs at small cost)

Ah right. So it depends on the use-case. For System::Events, we can just use your MR. But for para-chain cases we maybe rather want to use the paged list.

@cheme
Copy link
Contributor Author

cheme commented May 16, 2023

For events we should use transient storage (only write root of an event trie in state and still index data in an external db), but this is probably not here soon, so yes short term this is more pragmatic.

@cheme cheme added B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited labels May 17, 2023
@cheme
Copy link
Contributor Author

cheme commented May 17, 2023

I went through a bit of fuzzing and fix a few issue, think this would be good to review now.

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: test-linux-stable-int
Logs: https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/2850187

@kianenigma kianenigma self-requested a review June 19, 2023 12:27
@cheme cheme requested a review from a team July 13, 2023 07:15
@cheme cheme requested a review from andresilva as a code owner July 31, 2023 09:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited
Projects
Development

Successfully merging this pull request may close these issues.

3 participants