chainHead: Report unique hashes for pruned blocks #3667

lexnv · 2024-03-12T16:48:49Z

This PR ensures that the reported pruned blocks are unique.

While at it, ensure that the best block event is properly generated when the last best block is a fork that will be pruned in the future.

To achieve this, the chainHead keeps an LRU set of reported pruned blocks to ensure the following are not reported twice:

	 finalized -> block 1 -> block 2 -> block 3
	
	                      -> block 2 -> block 4 -> block 5
	
	           -> block 1 -> block 2_f -> block 6 -> block 7 -> block 8

When block 7 is finalized the branch [block 2; block 3] is reported as pruned.
When block 8 is finalized the branch [block 2; block 4; block 5] should be reported as pruned, however block 2 was already reported as pruned at the previous step.

This is a side-effect of the pruned blocks being reported at level N - 1. For example, if all pruned forks were reported at the first encounter (when block 6 is finalized we already know that block 3 and block 5 are stale), we would not need the LRU cache.
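A minimal sketch of the deduplication idea, under stated assumptions: block hashes are modelled as `u64`, and `ReportedPruned` and `filter_unreported` are hypothetical names, not the types used in the PR. For brevity the set evicts in insertion order, which is a simplification of a real LRU cache.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical bounded set that remembers recently reported pruned hashes.
/// Eviction is by insertion order (a simplification of LRU).
pub struct ReportedPruned {
    capacity: usize,
    order: VecDeque<u64>, // oldest entry at the front
    seen: HashSet<u64>,
}

impl ReportedPruned {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), seen: HashSet::new() }
    }

    /// Records `hash` and returns `true` only if it was not recorded before.
    pub fn insert(&mut self, hash: u64) -> bool {
        if !self.seen.insert(hash) {
            return false;
        }
        self.order.push_back(hash);
        // Evict the oldest entry once the bound is exceeded.
        if self.order.len() > self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.seen.remove(&oldest);
            }
        }
        true
    }

    /// Keeps only the hashes that were not reported yet, preserving order.
    pub fn filter_unreported(&mut self, branch: &[u64]) -> Vec<u64> {
        branch.iter().copied().filter(|h| self.insert(*h)).collect()
    }
}
```

With the diagram above, filtering `[2, 3]` and then `[2, 4, 5]` through the same set yields `[2, 3]` followed by `[4, 5]`, so block 2 is reported only once.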

cc @paritytech/subxt-team

Closes #3658

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

jsdw · 2024-03-13T15:24:54Z

I had assumed that ensuring pruned hashes were unique was as simple as using a HashSet or something when building the pruned list. That would ensure the blocks in a single pruned list were unique.

Am I right in my understanding that the extra bits in this PR are to also ensure that we don't see the same block hash mentioned again across different pruned lists, too?

lexnv · 2024-03-13T17:44:18Z

Yep, I initially thought we could get away with a simple HashSet.

However, we might prune a fork that contains another fork:

fork -> A
   |
     -> B -> C

Chain:
        X ... Y ... Z

In this case, when we finalize Y, Substrate reports A as pruned. We walk the path [fork .. A] and report those blocks.
When we finalize Z, Substrate reports C as pruned. We walk the path [fork .. B C] again, so we need to ignore the common blocks that we've just reported with [fork .. A].
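The walk described above could be sketched as follows. This is an illustration only: blocks are identified by string names, the parent map and `collect_new_pruned` are hypothetical, and it assumes that once a block has been reported all of its pruned ancestors were reported with it.

```rust
use std::collections::{HashMap, HashSet};

/// Walks from a pruned leaf towards the fork point and collects the blocks that
/// are neither on the finalized chain nor already reported (returned parent-first).
pub fn collect_new_pruned(
    leaf: &str,
    parents: &HashMap<&str, &str>,
    finalized_chain: &HashSet<&str>,
    reported: &mut HashSet<String>,
) -> Vec<String> {
    let mut path = Vec::new();
    let mut current = leaf;
    // Stop at the finalized chain or at a block reported by an earlier walk.
    while !finalized_chain.contains(current) && !reported.contains(current) {
        path.push(current.to_string());
        match parents.get(current) {
            Some(parent) => current = *parent,
            None => break,
        }
    }
    // Remember what was collected so that later walks skip the common prefix.
    reported.extend(path.iter().cloned());
    path.reverse();
    path
}
```

For a tree where `fork` is a child of X and has children A and B, with C a child of B, walking from A collects `[fork, A]`, and the later walk from C collects only `[B, C]`.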

skunert

Logic looks correct, only some nits in the comments 👍

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs

jsdw · 2024-03-15T14:28:03Z

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs

+				// Best block already generated.
+				if block_cache == last_finalized {
 					events.push(finalized_event);
+					return Ok(events);
+				}


To check my understanding, this is all about addressing this bit of the spec:

The current best block, in other words the last block reported through a bestBlockChanged event, is guaranteed to either be the last item in finalizedBlockHashes, or to not be present in either finalizedBlockHashes or prunedBlockHashes.

Here it looks like when a new last finalized block is seen, we:

Return happily if the best block was finalized.

Else, return a new BestBlockChanged event if the current best block has been pruned.

Else, return a new BestBlockChanged event if the current best block will be pruned (ie is not a descendant of the new finalized chain)

To satisfy the spec, could the whole thing be simplified to pseudocode like this?:

	if let Some(current_best_block) = self.best_block_cache {
		// The current best block is in the pruned list, so a new BestBlockChanged is needed.
		let is_in_pruned_list = pruned_block_hashes.iter().any(|hash| *hash == current_best_block);
		// The current best block is not the last one we finalized.
		let is_not_last_finalized = current_best_block != last_finalized;
		if is_in_pruned_list || is_not_last_finalized {
			let best_block_hash = self.client.info().best_hash;
			events.push(BestBlockChanged { best_block_hash })
		}
	}
	events.push(finalized_event);
	Ok(events)

In any case, I wonder whether we could avoid some of the early returns and such, and separate it so that we first write the logic to decide whether we need to push a new BestBlockChanged event and then do the push, with a single return at the end to hand back the events?
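As a rough, self-contained illustration of the single-return shape suggested above: the `Event` enum, the `u64` hashes, the `node_best_hash` parameter (standing in for `self.client.info().best_hash`), and the update of the cached best block are all assumptions made for this sketch, not the code that was merged.

```rust
/// Hypothetical, simplified event type; the real chainHead events carry more data.
#[derive(Debug, PartialEq)]
pub enum Event {
    BestBlockChanged { best_block_hash: u64 },
    Finalized { finalized: Vec<u64>, pruned: Vec<u64> },
}

/// Decides first whether a new best block event is needed, then pushes the
/// events in order and returns them once at the end.
pub fn finalize_events(
    current_best_block: &mut Option<u64>,
    node_best_hash: u64,
    finalized: Vec<u64>,
    pruned: Vec<u64>,
) -> Vec<Event> {
    let mut events = Vec::new();
    let last_finalized = finalized.last().copied();

    if let Some(best) = *current_best_block {
        // The previously reported best block is about to be reported as pruned.
        let is_in_pruned_list = pruned.contains(&best);
        // The previously reported best block is not the last finalized block.
        let is_not_last_finalized = Some(best) != last_finalized;
        if is_in_pruned_list || is_not_last_finalized {
            events.push(Event::BestBlockChanged { best_block_hash: node_best_hash });
            // Assumption: the cache tracks the last best block that was reported.
            *current_best_block = Some(node_best_hash);
        }
    }

    events.push(Event::Finalized { finalized, pruned });
    events
}
```

Keeping the decision separate from the push means the finalized event is always the last one emitted, and there is only one exit path to reason about.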

lexnv · 2024-03-19

Yep, that makes sense!

I think I was more worried about cases where we've already generated a BestBlockChanged event for a descendant of the finalized block.

However, I think it is safe to regenerate a BestBlockChanged with an older block number, as long as it is the last reported finalized block, per this spec statement:

The current best block, in other words the last block reported through a bestBlockChanged event, is guaranteed to either be the last item in finalizedBlockHashes

The suggestion simplifies the code quite a bit, thanks! 🙏

paritytech-cicd-pr · 2024-04-09T13:12:21Z

The CI pipeline was cancelled due to the failure of one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/5843121

skunert · 2024-04-10T16:49:58Z

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs

-		self.submit_events(&startup_point, stream.boxed(), pruned_forks, sink, sub_data.rx_stop)
-			.await
+		// These are the pruned blocks that we should not report again.
+		for pruned in pruned_forks {


Just to be sure I understand this correctly. When we generate the initial events we consider all blocks that are part of a fork that started below the finalized one as pruned, even if they are technically not pruned yet. So these will never be reported?

lexnv · 2024-04-15

Yep. I can't remember exactly, but this was introduced quite a while ago to handle the case where the Finalized event would contain a pruned block that was not reported by a new block event.

This abuses the purpose of self.pruned_blocks, which now holds:

(previously stored in to_ignore) forks that we did not report previously by the Initialized event

(the intended purpose of this PR) pruned blocks already reported once

jsdw · 2024-04-15T14:56:56Z

substrate/client/rpc-spec-v2/src/chain_head/tests.rs

@@ -3644,3 +3644,364 @@ async fn chain_head_limit_reached() {
 	// Initialized must always be reported first.
 	let _event: FollowEvent<String> = get_next_event(&mut sub).await;
 }
+
+#[tokio::test]
+async fn follow_unique_pruned_blocks() {


Nit: This test is quite long, and looking at it, I wonder whether we can extract a couple of helper fns or something so that it's easier to see what's actually being tested vs all of the state we're creating.

Maybe a couple of helpers like this would be handy, but I'm not sure whether there are differences that would make it not very useful:

	async fn import_block(client, parent_hash, parent_num) -> block {
		// Build on the given parent rather than a hard-coded genesis block.
		let block = BlockBuilderBuilder::new(&*client)
			.on_parent_block(parent_hash)
			.with_parent_block_number(parent_num)
			.build()
			.unwrap()
			.build()
			.unwrap()
			.block;
		client.import(BlockOrigin::Own, block.clone()).await.unwrap();
		block
	}

jsdw

This looks good to me; good job Alex!

This PR stabilizes the chainHead API to version 1. Needs: - #3667 cc @paritytech/subxt-team --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

lexnv added 7 commits March 12, 2024 15:08

chainHead: Report unique pruned hashes

d995348

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead: Test unique pruned blocks

00fd806

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead: Generate best block event even for unpruned forks

494a1ea

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead/tests: Add one more block to trigger pruning

10b143a

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead/tests: An extra block to validate pruned blocks are reported

c840c8b

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead: Use LRU for caching pruned blocks

3f3d876

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead/tests: Remove debug logs and add comment about test

936434c

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

lexnv added labels Mar 12, 2024: A1-insubstantial (Pull request requires no code review, e.g., a sub-repository hash update), I2-bug (The node fails to follow expected behavior), D1-medium (Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase)

lexnv self-assigned this Mar 12, 2024

lexnv added the R0-silent label (Changes should not be mentioned in any release notes) Mar 12, 2024

lexnv mentioned this pull request Mar 12, 2024

chainHead: Clarify reported order of pruned blocks paritytech/json-rpc-interface-spec#143

Closed

lexnv requested review from skunert and davxy March 13, 2024 11:29

skunert reviewed Mar 15, 2024

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs Outdated

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs Outdated

Update substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs

a33ed67

Co-authored-by: Sebastian Kunert <skunert49@gmail.com>

lexnv commented Mar 15, 2024

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs Outdated

Update substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs

2c05bc5

jsdw reviewed Mar 15, 2024

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs Outdated

jsdw reviewed Mar 15, 2024

substrate/client/rpc-spec-v2/src/chain_head/chain_head_follow.rs Outdated

lexnv added 2 commits March 15, 2024 15:41

chainHead: Rename best_block_cache to current_best_block

384d22c

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead: Use num pinned blocks for maximum LRU cache size

0b7ae73

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

jsdw reviewed Mar 15, 2024

lexnv added 4 commits March 19, 2024 17:20

chainHead: Simplify new block generation logic

e644f0a

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

Merge remote-tracking branch 'origin/master' into lexnv/chainhead-uni…

43c13db

…que-pruned-blocks Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

chainHead/tests: Add max sub config for tests

99bf9c9

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

Merge remote-tracking branch 'origin/master' into lexnv/chainhead-uni…

1eeb5e7

…que-pruned-blocks Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

Fix clippy

c52d42d

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

lexnv requested review from skunert and jsdw April 9, 2024 14:43

skunert reviewed Apr 10, 2024

chainHead/follow: Simplify generate_init_events by storing the pruned

0acd861

hashes Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

jsdw reviewed Apr 15, 2024

jsdw approved these changes Apr 15, 2024

lexnv added 2 commits April 16, 2024 13:10

Merge remote-tracking branch 'origin/master' into lexnv/chainhead-uni…

c952a0b

…que-pruned-blocks

chainHead/tests: Util code to reduce testing scenarios

e4ccc03

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>

lexnv mentioned this pull request Apr 17, 2024

chainHead: Stabilize chainHead to version 1 #4168

Merged

skunert approved these changes Apr 17, 2024

lexnv added this pull request to the merge queue Apr 17, 2024

Merged via the queue into master with commit bfbf7f5 Apr 17, 2024
133 of 137 checks passed

lexnv deleted the lexnv/chainhead-unique-pruned-blocks branch April 17, 2024 15:58

josepot mentioned this pull request Jul 14, 2024

Document rpc node requirements polkadot-api/polkadot-api#565

Closed
