
feat(bin, engine, prune): spawn pruning task from the engine #3566

Merged: 56 commits into main from alexey/background-pruning-task, Jul 12, 2023

Conversation

shekhirin
Collaborator

@shekhirin shekhirin commented Jul 3, 2023

Ref #3439

Flow is the following:

  1. BeaconConsensusEngine tries to spawn the pruner via EnginePruneController.
  2. EnginePruneController calls Pruner::is_pruning_needed with the tip block number to check if pruning is needed.
  3. If pruning is needed, EnginePruneController spawns the pruning task in a separate thread and sets the pruner_state to Running. Then returns the control to BeaconConsensusEngine.
  4. If BeaconConsensusEngine receives new FCUs while pruning is in progress, it skips them, just as it does while the pipeline is running.
  5. When pruning finishes, EnginePruneController sets its pruner_state back to Idle, allowing BeaconConsensusEngine to process any incoming FCUs again.
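The interval check in step 2 can be sketched as follows. This is a minimal stand-in, not reth's actual implementation; the field and method names mirror the PR's log output, and `block_interval` corresponds to the `[prune]` config value shown below:

```rust
/// Minimal stand-in for the pruner's interval check (step 2 above).
struct Pruner {
    /// From the `[prune]` section of reth.toml.
    block_interval: u64,
    last_pruned_block_number: Option<u64>,
}

impl Pruner {
    /// Pruning is needed once at least `block_interval` blocks have passed
    /// since the last pruned block (or if nothing was pruned yet).
    fn is_pruning_needed(&self, tip_block_number: u64) -> bool {
        match self.last_pruned_block_number {
            Some(last) => tip_block_number.saturating_sub(last) >= self.block_interval,
            None => true,
        }
    }
}

fn main() {
    // Mirrors the log excerpt: block_interval = 2, last pruned 3830556, tip 3830558.
    let pruner = Pruner { block_interval: 2, last_pruned_block_number: Some(3830556) };
    assert!(pruner.is_pruning_needed(3830558)); // minimum pruning interval reached
    assert!(!pruner.is_pruning_needed(3830557)); // one block short, skip
    println!("ok");
}
```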

Excerpt from the logs showing that we correctly prioritize engine messages over pruning:

ubuntu@c3-large-x86-new-york-1:~$ cat ~/.local/share/reth/sepolia/reth.toml  | tail -n 2
[prune]
block_interval = 2
2023-07-05T15:43:24.310409Z  INFO on_new_payload{block_hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214 block_number=3830558 is_pipeline_idle=true}:try_insert_new_payload:try_insert_validated_block{block=(3830558, 0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214)}: blockchain_tree: return=Ok(Valid)
2023-07-05T15:43:24.310445Z TRACE on_new_payload{block_hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214 block_number=3830558 is_pipeline_idle=true}:try_insert_new_payload: consensus::engine: return=Ok(PayloadStatus { status: Valid, latest_valid_hash: Some(0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214) })
2023-07-05T15:43:24.310455Z TRACE on_new_payload{block_hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214 block_number=3830558 is_pipeline_idle=true}: consensus::engine: Returning payload status status=Ok(PayloadStatus { status: Valid, latest_valid_hash: Some(0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214) })
2023-07-05T15:43:24.310499Z  INFO reth::node::events: Block added to canonical chain number=3830558 hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214
2023-07-05T15:43:24.316138Z TRACE consensus::engine: Received new forkchoice state update state=ForkchoiceState { head_block_hash: 0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214, safe_block_hash: 0x9e143153ffbf6ce543c4582ab06bf6173f7039130be6e45704219213347cc0aa, finalized_block_hash: 0x197b7fcc576be63a13be82fca38f13f6ed49228522033680f65a3e93799882c8 }
2023-07-05T15:43:24.316175Z  INFO make_canonical{block_hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214}: blockchain_tree: Committing new canonical chain: [(3830558, 0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214)]
2023-07-05T15:43:24.375528Z DEBUG consensus::engine: canonicalized new head hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214 number=3830558
2023-07-05T15:43:24.377645Z TRACE consensus::engine: Returning forkchoice status status=PayloadStatus { status: Valid, latest_valid_hash: Some(0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214) } state=ForkchoiceState { head_block_hash: 0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214, safe_block_hash: 0x9e143153ffbf6ce543c4582ab06bf6173f7039130be6e45704219213347cc0aa, finalized_block_hash: 0x197b7fcc576be63a13be82fca38f13f6ed49228522033680f65a3e93799882c8 }
2023-07-05T15:43:24.377685Z DEBUG pruner: Minimum pruning interval reached last_pruned_block_number=Some(3830556) tip_block_number=3830558
2023-07-05T15:43:24.377703Z  INFO reth::node::events: Forkchoice updated head_block_hash=0x5af11551e8d46795b09842754d01ee234999d37bcc3f9123b61bd8e17ee4d214 safe_block_hash=0x9e143153ffbf6ce543c4582ab06bf6173f7039130be6e45704219213347cc0aa finalized_block_hash=0x197b7fcc576be63a13be82fca38f13f6ed49228522033680f65a3e93799882c8 status=Valid
2023-07-05T15:43:24.377753Z TRACE consensus::engine: Pruner started tip_block_number=3830558
2023-07-05T15:43:24.377892Z TRACE consensus::engine: Pruner finished result=Ok(())

@shekhirin shekhirin added C-enhancement New feature or request A-consensus Related to the consensus engine A-pruning Related to pruning or full node labels Jul 3, 2023
@shekhirin shekhirin force-pushed the alexey/background-pruning-task branch from 99c41ec to 4f10a2c Compare July 3, 2023 21:05
@codecov

codecov bot commented Jul 4, 2023

Codecov Report

Merging #3566 (09d25b7) into main (65b07b9) will increase coverage by 0.06%.
The diff coverage is 81.77%.

Impacted file tree graph

Impacted Files Coverage Δ
bin/reth/src/node/mod.rs 10.10% <0.00%> (-0.11%) ⬇️
crates/blockchain-tree/src/config.rs 57.14% <0.00%> (ø)
crates/consensus/beacon/src/engine/error.rs 33.33% <ø> (ø)
crates/consensus/beacon/src/engine/metrics.rs 100.00% <ø> (ø)
crates/consensus/beacon/src/engine/sync.rs 80.23% <0.00%> (ø)
crates/interfaces/src/blockchain_tree/mod.rs 42.10% <ø> (+10.52%) ⬆️
crates/primitives/src/stage/checkpoints.rs 83.71% <0.00%> (ø)
crates/prune/src/error.rs 0.00% <0.00%> (ø)
crates/storage/provider/src/providers/mod.rs 11.55% <62.50%> (+2.48%) ⬆️
crates/consensus/beacon/src/engine/mod.rs 75.80% <81.81%> (+0.30%) ⬆️
... and 5 more

... and 22 files with indirect coverage changes

Flag Coverage Δ
integration-tests 15.83% <0.00%> (-0.02%) ⬇️
unit-tests 64.30% <81.77%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
reth binary 26.28% <0.00%> (-0.04%) ⬇️
blockchain tree 82.46% <94.44%> (+0.88%) ⬆️
pipeline 87.12% <ø> (+0.01%) ⬆️
storage (db) 73.66% <62.50%> (-0.02%) ⬇️
trie 94.66% <ø> (ø)
txpool 48.78% <ø> (-0.82%) ⬇️
networking 77.89% <ø> (+0.06%) ⬆️
rpc 58.40% <ø> (+0.17%) ⬆️
consensus 64.29% <83.91%> (+0.89%) ⬆️
revm 34.86% <ø> (ø)
payload builder 6.83% <ø> (ø)
primitives 88.35% <0.00%> (+0.09%) ⬆️

@shekhirin shekhirin force-pushed the alexey/background-pruning-task branch 4 times, most recently from fd20012 to 67b745e Compare July 4, 2023 11:13
@shekhirin shekhirin force-pushed the alexey/background-pruning-task branch from 67b745e to 6ca076b Compare July 4, 2023 11:14
@shekhirin
Collaborator Author

@mattsse

> because I'm a bit confused by the check_tip, if it only checks the tip, then it doesn't need the stream of new blocks?

This is correct now, yes. We've switched from an always-running background task to one spawnable from the engine, so we now know the tip block number and can pass it to the pruning task. Fixed.

> why do we need to do this? why can't the pruner just live on its own? because blocking?

Yes, pruning is blocking, and its workload is similar to the pipeline's, as it does a lot of DB writes.

> I personally very much dislike the pattern where we need to spawn the object and then receive it back eventually. we had to do this for the pipeline because we made all of its stages async.

Pruner::run needs to take a mutable self, so we need to give the blocking task an exclusive mutable reference to the Pruner, and the Idle(Option<Pruner>) approach looks like a good solution to me. It also mirrors how the sync controller spawns the pipeline, which keeps the codebase easier to understand.

> But if we spawn prune jobs based on block number, does it even need to be async?

It doesn't seem to now, so I made Pruner::run sync.
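The spawn-and-receive-back pattern discussed above can be sketched with a plain thread and channel. This is illustrative only (the real code runs on reth's task executor with different types): the pruner is moved into the blocking task and handed back over a channel when the run finishes, at which point the state flips back to Idle.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in pruner; the real one does a lot of blocking DB writes in `run`.
struct Pruner {
    last_pruned_block_number: Option<u64>,
}

impl Pruner {
    // `run` takes `&mut self`, which is why the task needs exclusive
    // ownership of the pruner while it runs.
    fn run(&mut self, tip_block_number: u64) -> Result<(), ()> {
        self.last_pruned_block_number = Some(tip_block_number);
        Ok(())
    }
}

// While idle the controller holds the pruner; while running, it holds only
// the channel on which the task will hand the pruner back.
enum PrunerState {
    Idle(Option<Pruner>),
    Running(mpsc::Receiver<(Pruner, Result<(), ()>)>),
}

fn spawn(state: &mut PrunerState, tip: u64) {
    if let PrunerState::Idle(slot) = state {
        if let Some(mut pruner) = slot.take() {
            let (tx, rx) = mpsc::channel();
            thread::spawn(move || {
                let result = pruner.run(tip); // exclusive &mut access here
                let _ = tx.send((pruner, result)); // give the pruner back
            });
            *state = PrunerState::Running(rx);
        }
    }
}

/// Wait for completion; on receipt the state flips back to Idle.
fn poll_finished(state: &mut PrunerState) -> Option<Result<(), ()>> {
    if let PrunerState::Running(rx) = state {
        if let Ok((pruner, result)) = rx.recv() {
            *state = PrunerState::Idle(Some(pruner));
            return Some(result);
        }
    }
    None
}

fn main() {
    let mut state = PrunerState::Idle(Some(Pruner { last_pruned_block_number: None }));
    spawn(&mut state, 3830558);
    assert_eq!(poll_finished(&mut state), Some(Ok(())));
    assert!(matches!(state, PrunerState::Idle(Some(_))));
    println!("ok");
}
```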

@shekhirin shekhirin requested a review from mattsse July 5, 2023 18:54
Collaborator

@mattsse mattsse left a comment


just some questions re poll logic

Resolved review threads:
- crates/consensus/beacon/src/engine/mod.rs (3 outdated comments)
- crates/prune/src/error.rs
Collaborator

@mattsse mattsse left a comment


lgtm

Collaborator

@joshieDo joshieDo left a comment


lgtm

in the future (wrt snapshots), we might want to extend and add another event like OnHold, when we need to prune in chunks using checkpoints
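That OnHold idea could look roughly like this. The variant names are hypothetical, not an existing reth API; the sketch only shows the shape of the event alongside the started/finished notifications:

```rust
/// Hypothetical pruner events, sketching the `OnHold` idea above:
/// chunked pruning would pause at a checkpoint and resume later.
#[derive(Debug, PartialEq)]
enum PrunerEvent {
    Started { tip_block_number: u64 },
    Finished { tip_block_number: u64 },
    /// Hypothetical: pruning yielded mid-run; resume from this checkpoint.
    OnHold { checkpoint_block_number: u64 },
}

fn main() {
    let event = PrunerEvent::OnHold { checkpoint_block_number: 3830000 };
    // The engine would keep the pruner held and re-spawn it from the checkpoint.
    assert_ne!(event, PrunerEvent::Finished { tip_block_number: 3830000 });
    println!("ok");
}
```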

@shekhirin shekhirin enabled auto-merge July 11, 2023 10:01
@shekhirin shekhirin disabled auto-merge July 11, 2023 17:28
Collaborator

@mattsse mattsse left a comment


lgtm

Comment on lines +1062 to +1064:

```rust
fn update_tree_on_finished_pruner(&mut self) -> Result<(), Error> {
    self.blockchain.restore_canonical_hashes()
}
```

I like having this extra function so we can extend it if we want to call finalize_block eventually.

@shekhirin shekhirin added this pull request to the merge queue Jul 12, 2023
@shekhirin
Collaborator Author

finally merging 🫡

Merged via the queue into main with commit dbafe23 Jul 12, 2023
@shekhirin shekhirin deleted the alexey/background-pruning-task branch July 12, 2023 12:38
merklefruit pushed a commit to merklefruit/op-reth-old that referenced this pull request Jul 26, 2023