Improve performance of `storage::append` #30

bkchr · 2022-11-02T13:12:10Z

storage::append currently works by appending the new data directly to the old data. Instead it would be better if keep appending the raw data to some Vec<u8> and keep the number of items in a separate field:

struct AppendedValue {
    num: Compact<u32>,
    data: Vec<u8>,
    materialized: Option<Vec<u8>>,
}

materialized would be used when the runtime requests the data to cache the materialized "view". When we append new data, we would reset materialized.

This new data structure would also be more optimized for storage transactions as we only need to truncate the data vector on discarding a transaction.

The text was updated successfully, but these errors were encountered:

burdges · 2022-11-03T17:42:17Z

I've not checked any details, but I wonder if the "tree-hashing" in blake2 gives efficient append for Vec<u8>. We might not use blake2's tree-hashing mode now, but..

In fact, you could typically "back out" the "finalization" call to make appends efficient for any hash function, and maybe blake2 also requires doing this. I suppose blake2's "tree-hashing" would make partial reads efficient, but not really help partial writes.

MrishoLukamba · 2022-11-03T17:49:57Z

i would like to tackle it and get some guidance.

bkchr · 2022-11-03T21:03:28Z

i would like to tackle it and get some guidance.

What guidance? If you want to work on this, you should start looking into the code and coming up with a way to implement this.

MrishoLukamba · 2022-11-03T21:55:42Z

i would like to tackle it and get some guidance.

What guidance? If you want to work on this, you should start looking into the code and coming up with a way to implement this.

Okey

bkchr · 2022-11-03T22:13:55Z

I have layed out the basic idea in the initial issue description. I would recommend as I said to look into the code first. I didn't spend that much more time thinking about the actual implementation. However, this is also part of the job when wanting to tackle this. Any kind of questions you have, you can post here.

burdges · 2022-11-03T22:45:41Z

There is one one compress call in finalization, so an append proof could be an internal blake2b state, but this does not really jive with the hash function interface, so technically such a use case requires re-analysis. ugh..

bkchr · 2022-11-03T22:47:42Z

There is one one compress call in finalization, so an append proof could be an internal blake2b state, but this does not really jive with the hash function interface, so technically such a use case requires re-analysis. ugh..

I can not really follow what you are talking about. However, this issue isn't about hashing.

MrishoLukamba · 2022-11-03T23:20:57Z

There is one one compress call in finalization, so an append proof could be an internal blake2b state, but this does not really jive with the hash function interface, so technically such a use case requires re-analysis. ugh..

I can not really follow what you are talking about. However, this issue isn't about hashing.

Me too. I will raise the PR and iterate

burdges · 2022-11-03T23:49:43Z

I'm saying if all you want is a more efficient proof of appending to a hash function, to make PoV blobs smaller, then we could actually do that pretty nicely. It requires some further analysis of the has function though, so not really solvable right here. If you do care a lot about then, then ask me to find someone to work it out though.

We can move this to the research channel.

MrishoLukamba · 2022-11-11T22:14:45Z

I have layed out the basic idea in the initial issue description. I would recommend as I said to look into the code first. I didn't spend that much more time thinking about the actual implementation. However, this is also part of the job when wanting to tackle this. Any kind of questions you have, you can post here.

So after reading and familiarizing myself with OverlayedChanges , this issue affects almost every piece of code inside sp-state-machine. Because it only work with Vec<u8> and now we must change it to work with AppendedValue. Because its not only affecting appending functionalities but also retrieving.

So 1. Can I tackle this issue step by step and not wait untill I finish it.
2. And on what I just stated above , do you think am on the right track because I have also to change retrieving side

bkchr · 2022-11-12T01:21:13Z

Can I tackle this issue step by step and not wait untill I finish it.

I don't get this question.

And on what I just stated above , do you think am on the right track because I have also to change retrieving side

Not sure what you mean by the "retrieving side". In the end we will still output a Vec<u8>.

tomaka · 2022-11-14T11:12:03Z

it would be better

Maybe a random remark, but how are you sure that this is the case?

Right now, if I understand correctly, you append the new item to the storage value and update the number of elements at the start. This means that, most of the time, the operation will be very inexpensive (memcpying the new element), except when:

The Vec needs a relocation.
The number of elements goes from 63 to 64, or from 16383 to 16384, or from 1073741823 to more.

...in which case the entire existing data plus the new element are memcpied.

On the other hand, with what you suggest:

The problem with Vec needing a relocation still exists (the data field).
The entire existing data plus the new element are memcpied if they are accessed after having been appended.

So, to me, the only situation where that new solution would be better is if either you never ever read the data (in which case why is there a storage item in the first place?), or if you add more than 16384 elements in a row before reading and then never ever append to it again.
And if, for example, computing the hash of the storage item requires building materialized then you're worse every single time.

bkchr · 2022-11-14T20:32:48Z

The problem with Vec needing a relocation still exists (the data field).

For every append you would not need to relocate the existing data, because the vec will allocate more data (assuming you don't use reserve_exact) to push the newly appended data.

The most important improvement here would be around storage transactions. Currently storage transactions require that you clone the entire data, while here we could prevent cloning the entire data. We would store only the old number of elements + the old length. Then on discarding a transaction it would be a simple truncate call.

MrishoLukamba · 2022-11-14T22:49:42Z

The problem with Vec needing a relocation still exists (the data field).

For every append you would not need to relocate the existing data, because the vec will allocate more data (assuming you don't use reserve_exact) to push the newly appended data.

The most important improvement here would be around storage transactions. Currently storage transactions require that you clone the entire data, while here we could prevent cloning the entire data. We would store only the old number of elements + the old length. Then on discarding a transaction it would be a simple truncate call.

	pub fn modify(
		&mut self,
		key: StorageKey,
		init: impl Fn() -> StorageValue,
		at_extrinsic: Option<u32>,
	) -> &mut Option<StorageValue> {
		let overlayed = self.changes.entry(key.clone()).or_default();
		let first_write_in_tx = insert_dirty(&mut self.dirty_keys, key);
		let clone_into_new_tx = if let Some(tx) = overlayed.transactions.last() {
			if first_write_in_tx {
				Some(tx.value.clone())
			} else {
				None
			}
		} else {
			Some(Some(init()))
		};

		if let Some(cloned) = clone_into_new_tx {
			overlayed.set(cloned, first_write_in_tx, at_extrinsic);
		}
		overlayed.value_mut()
	}

MrishoLukamba · 2022-11-14T23:02:03Z

The problem with Vec needing a relocation still exists (the data field).

For every append you would not need to relocate the existing data, because the vec will allocate more data (assuming you don't use reserve_exact) to push the newly appended data.
The most important improvement here would be around storage transactions. Currently storage transactions require that you clone the entire data, while here we could prevent cloning the entire data. We would store only the old number of elements + the old length. Then on discarding a transaction it would be a simple truncate call.
	pub fn modify(
		&mut self,
		key: StorageKey,
		init: impl Fn() -> StorageValue,
		at_extrinsic: Option<u32>,
	) -> &mut Option<StorageValue> {
		let overlayed = self.changes.entry(key.clone()).or_default();
		let first_write_in_tx = insert_dirty(&mut self.dirty_keys, key);
		let clone_into_new_tx = if let Some(tx) = overlayed.transactions.last() {
			if first_write_in_tx {
				Some(tx.value.clone())
			} else {
				None
			}
		} else {
			Some(Some(init()))
		};

		if let Some(cloned) = clone_into_new_tx {
			overlayed.set(cloned, first_write_in_tx, at_extrinsic);
		}
		overlayed.value_mut()
	}

this snippet function is actually appending data , and yes it is cloning the old data. What I was thinking is because the overlay storage item is a vector of InnerValue which is struct { value: Vec<u8>, extrinsic; u32 } . If I modify
the overlay storage item to be AppendedValue I think I will be getting somewhere on solving this and also I will have to modify how child storage work also.

Please any feedback on my thinking

bkchr · 2022-11-15T09:44:16Z

Sorry, but I don't get what you have written above.

cheme · 2022-11-16T08:02:42Z

I am a bit late in this conversation, but in one of my branch (https://github.com/cheme/substrate/tree/threads) I got a similar system:

overlay store changes: https://github.com/cheme/substrate/blob/b0f194bf319472dd4ffe1211028c02bd5cdf3a77/primitives/externalities/src/lib.rs#L382 these can be successive calls to append.
on append, if overlay is empty we just add a Change::Append entry in overlay, if overlay is already a value we append
on read of a value containing append we resolve from all previous append https://github.com/cheme/substrate/blob/b0f194bf319472dd4ffe1211028c02bd5cdf3a77/primitives/state-machine/src/overlayed_changes/mod.rs#L1315 and write a standard value instead.
A new host function function and externality function append_storage was added (needed). Same for child trie but no other change needed for it.
So the system do not reach memory efficiency if there is a lot of read, but as long as only write are done it is ok.
(the point of this branch was to allow parallel writes not really optimize foot print).

cheme · 2022-11-16T08:04:30Z

Another thing that was needed (but is often convenient) was to switch readonly externalities to &mut eg https://github.com/cheme/substrate/blob/b0f194bf319472dd4ffe1211028c02bd5cdf3a77/primitives/externalities/src/lib.rs#L76

* adds static `db_config.max_total_wal_size`, pinned to `40 GB` to match current documentation @ https://github.com/PolymeshAssociation/polymesh-tools/blob/7d8146deb459ffca3de6cd5da270928a529f233c/docs/operator/README.md#node-resource-requirements : `It is recommended that you reserve an additional 40GB of disk space for the WAL.` * adds cli flag `--db-max-total-wal-size` to allow specifying `db_config.max_total_wal_size` * adds cli flag `--db-max-total-wal-size` to allow specifying `db_config.max_total_wal_size` * adds cli flag `--db-max-total-wal-size` to allow specifying `db_config.max_total_wal_size` Co-authored-by: Quinn Diggity <git@quinn.dev> Max wal size (paritytech#30) * Use `Option<u64>` for max_total_wal_size. Signed-off-by: Robert G. Jakabosky <rjakabosky+neopallium@neoawareness.com>

Bumps [ctrlc](https://github.com/Detegr/rust-ctrlc) from 3.1.3 to 3.1.4. - [Release notes](https://github.com/Detegr/rust-ctrlc/releases) - [Commits](https://github.com/Detegr/rust-ctrlc/commits/3.1.4) Signed-off-by: dependabot-preview[bot] <support@dependabot.com> Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

This branch propose to avoid clones in append by storing offset and size in previous overlay depth. That way on rollback we can just truncate and change size of existing value. To avoid copy it also means that : - append on new overlay layer if there is an existing value: create a new Append entry with previous offsets, and take memory of previous overlay value. - rollback on append: restore value by applying offsets and put it back in previous overlay value - commit on append: appended value overwrite previous value (is an empty vec as the memory was taken). offsets of commited layer are dropped, if there is offset in previous overlay layer they are maintained. - set value (or remove) when append offsets are present: current appended value is moved back to previous overlay value with offset applied and current empty entry is overwrite (no offsets kept). The modify mechanism is not needed anymore. This branch lacks testing and break some existing genericity (bit of duplicated code), but good to have to check direction. Generally I am not sure if it is worth or we just should favor differents directions (transients blob storage for instance), as the current append mechanism is a bit tricky (having a variable length in first position means we sometime need to insert in front of a vector). Fix #30. --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: EgorPopelyaev <egor@parity.io> Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com> Co-authored-by: Bastian Köcher <git@kchr.de> Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Co-authored-by: joe petrowski <25483142+joepetrowski@users.noreply.github.com> Co-authored-by: Liam Aharon <liam.aharon@hotmail.com> Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com> Co-authored-by: Branislav Kontur <bkontur@gmail.com> Co-authored-by: Bastian Köcher <info@kchr.de> Co-authored-by: Sebastian Kunert <skunert49@gmail.com>

This branch propose to avoid clones in append by storing offset and size in previous overlay depth. That way on rollback we can just truncate and change size of existing value. To avoid copy it also means that : - append on new overlay layer if there is an existing value: create a new Append entry with previous offsets, and take memory of previous overlay value. - rollback on append: restore value by applying offsets and put it back in previous overlay value - commit on append: appended value overwrite previous value (is an empty vec as the memory was taken). offsets of commited layer are dropped, if there is offset in previous overlay layer they are maintained. - set value (or remove) when append offsets are present: current appended value is moved back to previous overlay value with offset applied and current empty entry is overwrite (no offsets kept). The modify mechanism is not needed anymore. This branch lacks testing and break some existing genericity (bit of duplicated code), but good to have to check direction. Generally I am not sure if it is worth or we just should favor differents directions (transients blob storage for instance), as the current append mechanism is a bit tricky (having a variable length in first position means we sometime need to insert in front of a vector). Fix paritytech#30. --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: EgorPopelyaev <egor@parity.io> Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com> Co-authored-by: Bastian Köcher <git@kchr.de> Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Co-authored-by: joe petrowski <25483142+joepetrowski@users.noreply.github.com> Co-authored-by: Liam Aharon <liam.aharon@hotmail.com> Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com> Co-authored-by: Branislav Kontur <bkontur@gmail.com> Co-authored-by: Bastian Köcher <info@kchr.de> Co-authored-by: Sebastian Kunert <skunert49@gmail.com>

…aritytech#30) * Add prometheus config to run cmd and start substrate network * Sync seednode list from Bitcoin Core * Update openssl dep * Make Substrate full sync work * Restore the default --blocks-pruning * Fix clippy

* adds static `db_config.max_total_wal_size`, pinned to `40 GB` to match current documentation @ https://github.com/PolymeshAssociation/polymesh-tools/blob/7d8146deb459ffca3de6cd5da270928a529f233c/docs/operator/README.md#node-resource-requirements : `It is recommended that you reserve an additional 40GB of disk space for the WAL.` * adds cli flag `--db-max-total-wal-size` to allow specifying `db_config.max_total_wal_size` * adds cli flag `--db-max-total-wal-size` to allow specifying `db_config.max_total_wal_size` * adds cli flag `--db-max-total-wal-size` to allow specifying `db_config.max_total_wal_size` Co-authored-by: Quinn Diggity <git@quinn.dev> Max wal size (paritytech#30) * Use `Option<u64>` for max_total_wal_size. Signed-off-by: Robert G. Jakabosky <rjakabosky+neopallium@neoawareness.com>

This branch propose to avoid clones in append by storing offset and size in previous overlay depth. That way on rollback we can just truncate and change size of existing value. To avoid copy it also means that : - append on new overlay layer if there is an existing value: create a new Append entry with previous offsets, and take memory of previous overlay value. - rollback on append: restore value by applying offsets and put it back in previous overlay value - commit on append: appended value overwrite previous value (is an empty vec as the memory was taken). offsets of commited layer are dropped, if there is offset in previous overlay layer they are maintained. - set value (or remove) when append offsets are present: current appended value is moved back to previous overlay value with offset applied and current empty entry is overwrite (no offsets kept). The modify mechanism is not needed anymore. This branch lacks testing and break some existing genericity (bit of duplicated code), but good to have to check direction. Generally I am not sure if it is worth or we just should favor differents directions (transients blob storage for instance), as the current append mechanism is a bit tricky (having a variable length in first position means we sometime need to insert in front of a vector). Fix paritytech#30. --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: EgorPopelyaev <egor@parity.io> Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com> Co-authored-by: Bastian Köcher <git@kchr.de> Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Co-authored-by: joe petrowski <25483142+joepetrowski@users.noreply.github.com> Co-authored-by: Liam Aharon <liam.aharon@hotmail.com> Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com> Co-authored-by: Branislav Kontur <bkontur@gmail.com> Co-authored-by: Bastian Köcher <info@kchr.de> Co-authored-by: Sebastian Kunert <skunert49@gmail.com>

bkchr added the I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. label Nov 2, 2022

MrishoLukamba mentioned this issue Nov 7, 2022

First iteration of Storage append paritytech/substrate#12637

Closed

cheme mentioned this issue Feb 2, 2023

Child trie and state machine refactors paritytech/substrate#13006

Open

cheme mentioned this issue Apr 25, 2023

Append overlay refactor proposal paritytech/substrate#13940

Open

the-right-joyce transferred this issue from paritytech/substrate Aug 24, 2023

cheme mentioned this issue Aug 28, 2023

Append overlay optimization. #1223

Merged

bkchr closed this as completed in #1223 Jun 11, 2024

github-project-automation bot moved this from backlog to done in SDK Node Jun 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance of `storage::append` #30

Improve performance of `storage::append` #30

bkchr commented Nov 2, 2022

burdges commented Nov 3, 2022 •

edited

Loading

MrishoLukamba commented Nov 3, 2022

bkchr commented Nov 3, 2022

MrishoLukamba commented Nov 3, 2022

bkchr commented Nov 3, 2022

burdges commented Nov 3, 2022

bkchr commented Nov 3, 2022

MrishoLukamba commented Nov 3, 2022

burdges commented Nov 3, 2022 •

edited

Loading

MrishoLukamba commented Nov 11, 2022

bkchr commented Nov 12, 2022

tomaka commented Nov 14, 2022 •

edited

Loading

bkchr commented Nov 14, 2022

MrishoLukamba commented Nov 14, 2022

MrishoLukamba commented Nov 14, 2022

bkchr commented Nov 15, 2022

cheme commented Nov 16, 2022

cheme commented Nov 16, 2022

Improve performance of storage::append #30

Improve performance of storage::append #30

Comments

bkchr commented Nov 2, 2022

burdges commented Nov 3, 2022 • edited Loading

MrishoLukamba commented Nov 3, 2022

bkchr commented Nov 3, 2022

MrishoLukamba commented Nov 3, 2022

bkchr commented Nov 3, 2022

burdges commented Nov 3, 2022

bkchr commented Nov 3, 2022

MrishoLukamba commented Nov 3, 2022

burdges commented Nov 3, 2022 • edited Loading

MrishoLukamba commented Nov 11, 2022

bkchr commented Nov 12, 2022

tomaka commented Nov 14, 2022 • edited Loading

bkchr commented Nov 14, 2022

MrishoLukamba commented Nov 14, 2022

MrishoLukamba commented Nov 14, 2022

bkchr commented Nov 15, 2022

cheme commented Nov 16, 2022

cheme commented Nov 16, 2022

Improve performance of `storage::append` #30

Improve performance of `storage::append` #30

burdges commented Nov 3, 2022 •

edited

Loading

burdges commented Nov 3, 2022 •

edited

Loading

tomaka commented Nov 14, 2022 •

edited

Loading