Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent/parallel block verification #1598

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

nazar-pc
Copy link
Contributor

@nazar-pc nazar-pc commented Sep 16, 2023

Block verification is meant to ideally be stateless and parallelizable, but:

  • this isn't the case for Aura and Babe consensus
  • this wasn't the case for Substate itself either before this PR

This PR is meant to be reviewed going through individual commit.

While reading code I noticed unused optional delay between block imports that originates in paritytech/substrate#3159 where the idea was to compile things for browser and this delay was meant to allow browsers to do some work in between block improts instead of locking up the page. This is long irrelevant and can be removed. Yielding remaing as it seems to be there to allow for justification imports to happen in between block imports.

Then I did some refactoring to separate sequential from potentially concurrent operations. Things like checking blocks and verifying should not depend on parent block being imported first. Since this is not really the case yet for all consensus engines Verifier::supports_stateless_verification is introduced (defaults to false) to avoid introducing new breaking logic everywhere (some background in #1302, but I then noticed more usage of parent_hash that indicates things will break badly if blocks suddenly start being verified out of order).

And lastly with all bits and pieces in place concurrent/parallel (since verification can be CPU-intensive) is implemented for verifiers that return true from Verifier::supports_stateless_verification, otherwise previous logic remains. Block import is still happening sequentially after that just like before.

P.S. Ideally blocks from independent forks would be imported concurrently as well, but that will likely require significant refactoring for not very big benefit in most cases.

@nazar-pc nazar-pc force-pushed the concurrent-verification branch 3 times, most recently from b4e8e31 to fc5048a Compare September 16, 2023 23:19
@nazar-pc
Copy link
Contributor Author

It is really hard to work with when clippy is running on (old) stable and formatting is done with nightly instead 😕

No rust-toolchain.toml file in the repo either.

Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you really want this, we should go into the direction that we want to go in the future any way. See here for some information: #5 (comment)

TLDR on what I want to see from this pr, Verifier should be moved out of the import queue at all and sync should call it directly. sync should only pass verified blocks to the import queue.

@nazar-pc
Copy link
Contributor Author

If you really want this, we should go into the direction that we want to go in the future any way. See here for some information: #5 (comment)

I was not aware of that thread, thanks, I read it and subscribed for updates.

However it seems like what you're suggesting is a more significant change than almost trivial refactoring I have done here, which I suspect would be harder to do as an external contributor and not on Parity team.

I like the direction described, but I don't see how this PR makes moving into that direction more difficult in the meantime. Is that a hard reqirement?

TLDR on what I want to see from this pr, Verifier should be moved out of the import queue at all and sync should call it directly. sync should only pass verified blocks to the import queue.

But it would still do something similar to this PR, especially for Aura and Babe that expect even verification of blocks to happen in strictly ascending order, not just import. This was actually the more difficult part of these changes that I only realized later, our consensus in Subspace doesn't have such assumption, so it is a bit easier to work with in that sense.

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/3959915

@bkchr
Copy link
Member

bkchr commented Oct 15, 2023

However it seems like what you're suggesting is a more significant change than almost trivial refactoring I have done here, which I suspect would be harder to do as an external contributor and not on Parity team.

I don't think that this can not be done by someone external. We would first need to ensure to make all Verify implementation being stateless. So, they only check the seal. Then we could move out the verifier from the import queue and move it to the sync code. All in all this isn't such a complicated thing to do. The first part with Verify can be done in a separate pr.

@nazar-pc
Copy link
Contributor Author

I'm now wondering if the fact that things will change in the future should be a blocker for moving forward with this. It doesn't change behavior for existing users and allows for additional capabilities for those who opt-in.

I understand that you'd like to see a bigger refactoring, but I'm not sure if it should prevent making incremental improvements for existing design.

We've been using this patch in Subspace for a while and it works great, massively improving sync time for our protocol.

@nazar-pc nazar-pc requested a review from athei as a code owner January 15, 2024 10:53
@nazar-pc nazar-pc requested a review from a team January 15, 2024 10:53
@paritytech-review-bot paritytech-review-bot bot requested a review from a team January 15, 2024 10:54
@nazar-pc nazar-pc mentioned this pull request Jun 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jun 23, 2024
I carried these things in a fork for a long time, I think wouldn't hurt
to have it upstream.

Originally submitted as part of
#1598 that went nowhere.

---------

Co-authored-by: Bastian Köcher <git@kchr.de>
@nazar-pc
Copy link
Contributor Author

With latest changes from master, the diff here is quite small

TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
I carried these things in a fork for a long time, I think wouldn't hurt
to have it upstream.

Originally submitted as part of
paritytech#1598 that went nowhere.

---------

Co-authored-by: Bastian Köcher <git@kchr.de>
@nazar-pc
Copy link
Contributor Author

nazar-pc commented Sep 17, 2024

We would first need to ensure to make all Verify implementation being stateless. So, they only check the seal.

@bkchr on API level it is stateless after #4844 and consensus at Subspace has it implemented in fully stateless way too (and verified in parallel as implemented by this PR), however looking at consensus implementations in polkadot-sdk repo I see a lot of implicit dependencies on things like parent block existing to fetch authorities and I don't think I'm the best person to do something about this.

I'm wondering what would be the next step for me if I don't touch those now that I have resources to dedicate to this issue. I re-read #5 again to refresh my memory.

Depending on how things are implemented, whether to verify concurrently or not might be a downstream decision assuming API allows it. Let me know what you think.

UPD: Let me know if this is not the best place to have this conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants