Storage chains: indexing, renewals and reference counting #8265
Conversation
client/light/src/blockchain.rs
Outdated
@@ -129,10 +129,10 @@ impl<S, Block> BlockchainBackend<Block> for Blockchain<S> where Block: BlockT, S
		Err(ClientError::NotAvailableOnLightClient)
	}

-	fn extrinsic(
+	fn transaction(
Renamed this to `transaction`, since it now returns an indexed portion rather than all of the extrinsic data.
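For readers outside the diff, a minimal sketch of what such a lookup could look like; the trait name and the simplified error type are illustrative assumptions, not the PR's exact API:

```rust
/// Illustrative sketch only. Given the hash of indexed content, return just
/// the indexed portion of the transaction data (if any was indexed), rather
/// than the full encoded extrinsic.
pub trait IndexedTransactions<Hash> {
	fn transaction(&self, hash: &Hash) -> Result<Option<Vec<u8>>, String>;
}
```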
The new externalities feel a lot like offchain indexing here (they trigger some optional behaviour from the runtime with no feedback).
So I wonder whether this could be achieved through an offchain worker consuming offchain indexing data that contains those indexing operations (it would need to do the re-indexing in a second asynchronous step, which can be somewhat concurrent with pruning but seems manageable).
On second thought, attaching the block hash to the indexing operation in the offchain db may not be trivial; I'm just mentioning the possibility.
It may also be a bit tricky to decode 'extrinsic_headers' then (two different encodings to manage).
client/db/src/lib.rs
Outdated
				DbHash::from_slice(hash.as_ref()),
				Some(extrinsic[offset..].to_vec())
			);
			extrinsic_headers.push((DbHash::from_slice(&hash.as_ref()), extrinsic[..offset].to_vec()));
`extrinsic_headers` mixes extrinsics with the offset applied and full extrinsics from `renewed`; is that ok?
Generally I am not sure how `renewed` is supposed to use its `size` field.
I am also wondering how/if `extrinsic[..offset]` can be decoded.
The runtime can certainly find the right offset to split a transaction between two fields, and when decoding one should then use the split variant; it probably works this way (but then `renewed` also uses this variant and should write the split version).
> extrinsic_headers is mixing extrinsic using applied offset with full extrinsic from renewed, is it ok?

Yes, all extrinsics that don't index anything, including renewals, are just written as the header.

> Also wondering how/if 'extrinsic[..offset]' can be decoded?

It is never decoded on its own. Only the full extrinsic (header + indexed portion) is decoded. The indexed region is not supposed to be SCALE-encoded; it's just some bytes that are interpreted by the user.

> Generally I am not sure how 'renewed' is supposed to use its 'size' field.

The `size` field is a bit of a work in progress. The idea is that the renewal transaction should include a fee proportional to the existing data size. Then, on commit, we would check the expected size of the existing indexed data and not renew anything if it does not match. However, it is not yet clear whether actually checking the size in the runtime would be better.
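To make the layout described above concrete, here is a minimal, self-contained sketch (all names are illustrative assumptions, not the PR's code) of the split-on-write / join-on-read scheme: an indexing extrinsic stores `extrinsic[..offset]` as its header plus the indexed tail under its content hash, everything else (including renewals) stores the full encoding as the header, and only the rejoined bytes are ever handed to the decoder.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory model of the storage-chain body layout.
#[derive(Default)]
struct BlockBody {
	/// Per-extrinsic header bytes: the full extrinsic when nothing was
	/// indexed (including renewals), or only `extrinsic[..offset]` otherwise.
	headers: Vec<(u64, Vec<u8>)>,
	/// Indexed blobs, addressable by content hash.
	indexed: HashMap<u64, Vec<u8>>,
}

impl BlockBody {
	/// Store an extrinsic, splitting off the indexed region at `offset` if any.
	fn insert(&mut self, hash: u64, extrinsic: &[u8], offset: Option<usize>) {
		match offset {
			Some(offset) => {
				self.headers.push((hash, extrinsic[..offset].to_vec()));
				self.indexed.insert(hash, extrinsic[offset..].to_vec());
			}
			None => self.headers.push((hash, extrinsic.to_vec())),
		}
	}

	/// Reassemble the full encoded extrinsic; only this joined byte string
	/// would ever be passed to the SCALE decoder.
	fn full_extrinsic(&self, index: usize) -> Option<Vec<u8>> {
		let (hash, header) = self.headers.get(index)?;
		let mut data = header.clone();
		if let Some(indexed) = self.indexed.get(hash) {
			data.extend_from_slice(indexed);
		}
		Some(data)
	}
}
```

A renewal would then only bump the reference count of the blob stored under `hash`, never re-storing the indexed bytes.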
(For whatever reason I was thinking `renewed` did contain the indexed content and needed to be stripped, but obviously it doesn't.)
I do not mark the conversation as resolved because the reply looks useful to me, but it is resolved.
Co-authored-by: cheme <emericchevalier.pro@gmail.com>
LGTM.
I was a bit hesitant about the new externalities; I think there may be something to be done with offchain workers to avoid them, but I am also pretty sure it involves substantial additional work.
But this can always be switched or changed later (as long as they are not called in a production chain; IIRC these are not exposed in the wasm as long as they are not used by the wasm).
bot merge force
Trying merge.
Merge failed:
primitives/state-machine/src/ext.rs
Outdated
@@ -568,6 +568,43 @@ where
	}
}

+	fn storage_index_transaction(&mut self, index: u32, offset: u32) -> Result<(), ()>{
Why do we return a `Result` here, when this cannot fail?
Removed
client/db/src/lib.rs
Outdated
		if let Some(mut t) = self.indexed_transaction(&hash)? {
			data.append(&mut t);
		}
		Block::Extrinsic::decode(&mut data.as_slice())
If we used a custom `Input` here, we could avoid calling `append` above and allocating a third time. This custom input would just combine `data` and `t`.
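A rough sketch of that idea, assuming the `parity-scale-codec` `Input` trait; this only illustrates chaining two buffers and is not the `JoinInput` the PR ended up with:

```rust
use codec::{Error, Input};

/// Reads from `first`, then from `second`, without allocating a joined buffer.
struct Joined<'a> {
	first: &'a [u8],
	second: &'a [u8],
}

impl<'a> Input for Joined<'a> {
	fn remaining_len(&mut self) -> Result<Option<usize>, Error> {
		Ok(Some(self.first.len() + self.second.len()))
	}

	fn read(&mut self, into: &mut [u8]) -> Result<(), Error> {
		// Serve as much as possible from the first buffer, then fall through
		// to the second one.
		let from_first = into.len().min(self.first.len());
		into[..from_first].copy_from_slice(&self.first[..from_first]);
		self.first = &self.first[from_first..];

		let from_second = into.len() - from_first;
		if from_second > self.second.len() {
			return Err("Not enough data to fill the buffer".into());
		}
		into[from_first..].copy_from_slice(&self.second[..from_second]);
		self.second = &self.second[from_second..];
		Ok(())
	}
}
```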
client/db/src/lib.rs
Outdated
-			|h| self.extrinsic(&h).and_then(|maybe_ex| maybe_ex.ok_or_else(
-				|| sp_blockchain::Error::Backend(
-					format!("Missing transaction: {}", h))))
+			match Vec::<(Block::Hash, Vec<u8>)>::decode(&mut &body[..]) {
The body structure is used very implicitly here. Maybe we could at least introduce a typedef for it.
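Something along these lines would do (names are purely illustrative, added here for clarity):

```rust
/// One entry per extrinsic in a storage-chain body: the hash of its indexed
/// content and the stored "header" bytes (the full extrinsic encoding when
/// nothing was indexed).
type ExtrinsicHeader<Hash> = (Hash, Vec<u8>);

/// The body layout that is decoded above.
type StoredBody<Hash> = Vec<ExtrinsicHeader<Hash>>;
```

The decode above would then read `StoredBody::<Block::Hash>::decode(&mut &body[..])`, documenting the layout at the use site.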
client/db/src/lib.rs
Outdated
			}
		}
		// Also discard all blocks from displaced branches
		for h in displaced.leaves() {
This means that before we always kept all blocks, even those from forks, but now we will prune all fork blocks? Even when block pruning is not enabled?
Good point
client/db/src/lib.rs
Outdated
		for h in displaced.leaves() {
			let mut number = finalized;
			let mut hash = h.clone();
			// Follow displaced chains up to finality point
Don't we start at the finalized point and move downwards?
Not really. We start with a displaced leaf and follow its parents until we reach a block that is canonical
Yeah, I understood it this way. Shouldn't the comment say `Follow displaced chains up to canonical point`?
Since they are displaced due to finality, as soon as we meet a canonical parent it is guaranteed to be a part of a finalized chain. I'll update the comment to make it more obvious.
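A schematic sketch of the walk being described (hypothetical trait and helper names, not the PR's code): start from each displaced leaf and follow parent links, collecting blocks to discard until a canonical ancestor is reached, which by the argument above is already part of the finalized chain.

```rust
/// Hypothetical minimal view of the chain needed for the walk.
trait ChainView {
	type Hash: Clone;
	/// Is this block part of the canonical (finalized) chain?
	fn is_canonical(&self, hash: &Self::Hash) -> bool;
	/// Parent hash, or `None` at the genesis block.
	fn parent(&self, hash: &Self::Hash) -> Option<Self::Hash>;
}

/// Collect all blocks on displaced branches by walking each displaced leaf
/// back towards the canonical chain.
fn displaced_blocks<C: ChainView>(chain: &C, displaced_leaves: &[C::Hash]) -> Vec<C::Hash> {
	let mut to_discard = Vec::new();
	for leaf in displaced_leaves {
		let mut hash = leaf.clone();
		// The first canonical ancestor we meet is guaranteed to be part of
		// the finalized chain, so we can stop there.
		while !chain.is_canonical(&hash) {
			to_discard.push(hash.clone());
			match chain.parent(&hash) {
				Some(parent) => hash = parent,
				None => break,
			}
		}
	}
	to_discard
}
```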
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
@bkchr All the issues are addressed now, I believe. The companion build failure seems unrelated.
client/db/src/utils.rs
Outdated
				self.0.read(&mut into[..read])?;
			}
			None => {
				return self.0.read(into)
This means we will never read `self.1`. For our use case here it is okay, because we should always get `Some()`, but this should really not be used by anyone else. Maybe we should use `unreachable` here, or make `JoinInput` directly use `Vec<u8>` and `Option<Vec<u8>>`.
Changed it to operate on slices instead
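With the slice-based version, decoding then simply chains the header and the indexed blob without copying; roughly (reusing the hypothetical `Joined` type sketched earlier, not the PR's actual `JoinInput`):

```rust
use codec::Decode;

/// Decode a SCALE-encoded value from two stored pieces without concatenating
/// them first. `Joined` is the illustrative chained input from the sketch above.
fn decode_joined<T: Decode>(header: &[u8], indexed: &[u8]) -> Result<T, codec::Error> {
	let mut input = Joined { first: header, second: indexed };
	T::decode(&mut input)
}
```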
bot merge
Trying merge.
@@ -103,12 +103,35 @@ pub struct OverlayedChanges {
	children: Map<StorageKey, (OverlayedChangeSet, ChildInfo)>,
	/// Offchain related changes.
	offchain: OffchainOverlayedChanges,
+	/// Transaction index changes,
+	transaction_index_ops: Vec<IndexOperation>,
Just realized while merging master in some PRs that `transaction_index_ops` could also support storage transactions (same implementation as `OffchainOverlayedChanges` for offchain indexing).
Without thinking too much about it, I would think it should (it would allow conditional indexing).
cc @arkpar
I did not bother with that for now. As I understand it, transactional semantics are used here to support rollback on failing extrinsics. Indexing is transparent to the runtime anyway: if an extrinsic fails, the fact that it still indexes something does not affect consensus.
Furthermore:
- Storage transactions should not add to the index until they are past the failure point, i.e. until all fees are applied.
- Unlike regular transactions, failing storage transactions should not be included in the block in the first place. Storing them for free would defeat the purpose.
I don't think it's a priority.
I agree consensus is not touched, and the client can even choose not to index anything at all.
It was the same with offchain indexing; it can be implemented at any time.
(I was talking about storage transactions; they are used for a different purpose, mainly in contracts AFAIK.)
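For context, one plausible shape (an assumption for illustration only; the PR's actual definition may differ) of the `IndexOperation` entries that `transaction_index_ops` collects, matching the index and renew operations discussed in this thread:

```rust
/// Assumed-for-illustration shape of a recorded indexing operation.
pub enum IndexOperation {
	/// Index the data of the extrinsic at position `extrinsic` in the block,
	/// starting `offset` bytes into its encoding; the indexed region becomes
	/// addressable by its content hash.
	Insert { extrinsic: u32, offset: u32 },
	/// Renew previously indexed data identified by `hash` and expected to be
	/// `size` bytes, preventing it from being pruned.
	Renew { hash: Vec<u8>, size: u32 },
}
```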
…#8265)

* Transaction indexing
* Tests and fixes
* Fixed a comment
* Style
* Build
* Style
* Apply suggestions from code review

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Code review suggestions
* Add missing impl
* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* impl JoinInput
* Don't store empty slices
* JoinInput operates on slices

Co-authored-by: cheme <emericchevalier.pro@gmail.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
See #7962
This adds externalities for indexing and renewing transactions from within the runtime.
The runtime may now mark a specific region within the encoded extrinsic data to be indexed and addressable by hash with the `storage_index_transaction` externality. Another function, `storage_renew_transaction_index`, renews a previously indexed transaction and prevents it from being pruned. Reference counting for indexed data is implemented natively for parity-db, and for rocksdb by storing a counter in a separate DB value.
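As a rough sketch of the runtime-facing surface described here (the indexing signature matches the diff earlier in this conversation; the renewal parameters are an assumption based on the `size` discussion above, not the PR's exact API):

```rust
/// Illustrative trait mirroring the new externalities; not the exact
/// `Externalities` methods from the PR.
pub trait TransactionIndexExt {
	/// Index the data of the extrinsic at position `index` in the block,
	/// starting `offset` bytes into its encoding, making the region
	/// addressable by its hash.
	fn storage_index_transaction(&mut self, index: u32, offset: u32);

	/// Renew previously indexed data (assumed parameters): identified by
	/// `hash` and expected to be `size` bytes, so that it is not pruned.
	fn storage_renew_transaction_index(&mut self, index: u32, hash: &[u8], size: u32);
}
```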
A follow-up PR will add a runtime module and a storage proof inherent.
polkadot companion: paritytech/polkadot#2570