Bring back signature batch verification #351

Open
bkchr opened this issue Jul 9, 2020 · 15 comments
Labels
I5-enhancement: An additional feature request. T1-FRAME: This PR/Issue is related to core FRAME, the framework.

Comments

@bkchr
Member

bkchr commented Jul 9, 2020

paritytech/substrate#6616 removed the usage of batch verification, because it had introduced new host functions that do the batch verification.

We need to bring back the runtime implementation of the batch verification.

This requires a new trait:

/// A signature that supports batch verification.
trait BatchVerify: Verify {
    /// Register a signature for batch verification.
    ///
    /// This requires that batch verification is enabled by doing XYZ.
    ///
    /// Returns `true` when the signature was successfully registered for batch verification, or,
    /// if batch verification is not enabled, when the signature could be verified successfully immediately.
    ///
    /// # Warning
    /// 
    /// This requires that the batch verification is finished by calling XYZ to check the result of all submitted signature verifications.
    fn batch_verify<L: Lazy<[u8]>>(&self, msg: L, signer: &<Self::Signer as IdentifyAccount>::AccountId) -> bool;
}

This trait needs to be implemented for all the required types, and we need to switch UncheckedExtrinsic to use it.
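
A minimal standalone sketch of how this could fit together, assuming batching is toggled and finished by host-side calls (the XYZ placeholders above); all the names here (start_batch_verify, finish_batch_verify, etc.) are hypothetical stand-ins, not the actual sp_runtime API:

use std::cell::RefCell;

thread_local! {
    // `None` means batch verification is disabled.
    static BATCH: RefCell<Option<Vec<bool>>> = RefCell::new(None);
}

/// Stand-in for a concrete signature check (e.g. sr25519).
fn verify_now(_sig: &[u8], _msg: &[u8], _signer: &[u8]) -> bool {
    true // a real implementation would call into the crypto library here
}

/// Mirrors the proposed `BatchVerify::batch_verify`.
fn batch_verify(sig: &[u8], msg: &[u8], signer: &[u8]) -> bool {
    BATCH.with(|b| match &mut *b.borrow_mut() {
        // Batching enabled: queue the check for the final result.
        Some(queue) => {
            queue.push(verify_now(sig, msg, signer));
            true
        }
        // Batching disabled: verify immediately.
        None => verify_now(sig, msg, signer),
    })
}

/// Mirrors "batch verification is enabled by doing XYZ".
fn start_batch_verify() {
    BATCH.with(|b| *b.borrow_mut() = Some(Vec::new()));
}

/// Mirrors "finished by calling XYZ"; `true` if every queued check passed
/// (or if batching was never enabled, since checks then ran eagerly).
fn finish_batch_verify() -> bool {
    BATCH.with(|b| b.borrow_mut().take().map_or(true, |q| q.into_iter().all(|ok| ok)))
}

fn main() {
    start_batch_verify();
    assert!(batch_verify(b"sig", b"msg", b"alice"));
    assert!(batch_verify(b"sig", b"msg", b"bob"));
    assert!(finish_batch_verify()); // the mandatory final check
}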

@NikVolf NikVolf self-assigned this Jul 13, 2020
@shawntabrizi
Member

@NikVolf are you working on this? @bkchr do we need this at high priority?

@bkchr
Member Author

bkchr commented Apr 8, 2021

IMHO no. It is only relevant on block import.

@burdges

burdges commented Apr 8, 2021

Adopt ed25519-zebra over ed25519-dalek (paritytech/substrate#8055) if you want batch verification of ed25519.

I implemented half-aggregation for sr25519 in w3f/schnorrkel#68, which works like sr25519 batch verification but reduces signature size from 64 bytes to an amortized 32 bytes; however, it requires that block producers include data specific to the aggregation, and it makes signatures non-relocatable. Is halving the signature storage size for signatures that do not require relocation worth the extra logic for you guys? I guess no, or not anytime soon.

@bkchr
Member Author

bkchr commented May 8, 2021

The general question with batch verification is still whether we want the fastest approach or the most energy-efficient one ;)

The fastest approach is to validate each signature in the background as it comes in, while the energy-efficient approach is the one that uses batch verification.

I'm still more in favor of validating each signature as it comes in, to get a faster sync ;)

@burdges

burdges commented May 8, 2021

I only envisioned batch signature verification within blocks, so in particular block seals would never be batch verified. And the PreparedBatches implemented in schnorrkel only make sense within blocks.

In effect, one 32·(1+n)-byte PreparedBatch replaces n 64-byte sr25519 signatures in the block (e.g. for n = 100: 3232 bytes instead of 6400), so all sr25519-signed extrinsics must be pushed into a verification pipe in the same deterministic order, which then runs in another thread after parsing the PoV block.

I doubt this adds latency like you describe, because we have so many storage Merkle proofs to check in other threads. I'd expect the STVF to return "structurally valid" very quickly, which then leaves the final validity result waiting upon num_cpu_cores crypto threads, and these mostly verify all the storage claims about Merkle proofs. We'd just add another couple of threads doing batch signature verification.

Now if our sr25519 verification time exceeds our Merkle proof verification time divided by num_cpu_cores - 1, then ideally the block producer should have placed multiple PreparedBatches into the block and distinguished the batches somehow. This is mildly annoying, because block producers could produce blocks in a way that performs better with specific numbers of CPU cores. It's fine though, so long as you impose a cap on prepared batch sizes, because the prepared batch verifier then terminates after a reasonable amount of time.
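
A minimal sketch of this verification-pipe idea, assuming std threads and a channel; verify_batch here is a hypothetical stand-in for a real cryptographic batch verifier (such as schnorrkel's), not its actual API:

use std::sync::mpsc;
use std::thread;

type Item = (Vec<u8>, Vec<u8>, Vec<u8>); // (signature, message, public key)

fn verify_batch(items: &[Item]) -> bool {
    // Real code would hand the whole slice to a batch verifier;
    // here we just pretend every signature checks out.
    !items.is_empty()
}

fn main() {
    let (tx, rx) = mpsc::channel::<Item>();

    // Verifier thread: collects everything the parser sends, then
    // verifies the whole batch once the channel closes.
    let verifier = thread::spawn(move || {
        let items: Vec<Item> = rx.iter().collect();
        verify_batch(&items)
    });

    // The parser pushes signed extrinsics in deterministic order.
    for i in 0..3u8 {
        tx.send((vec![i], vec![i], vec![i])).unwrap();
    }
    drop(tx); // closing the channel signals end-of-block

    assert!(verifier.join().unwrap());
}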

We could batch verify block seals too, which is maybe what you meant, and yes, that adds latency. Worse, this opens a denial-of-service vector and thus requires some fallback. I doubt the CPU savings warrant the complexity there.

@bkchr
Member Author

bkchr commented May 8, 2021

What storage Merkle proofs do we have to check in other threads? I'm confused. We don't use any threading at all from the runtime.

As the block producer cannot predict the number of extrinsics in the block, we cannot push any PreparedBatch into the block.

In general, we could probably rewrite the execute_block function to collect all signatures and batch them. That should be the fastest solution ;)
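
A rough sketch of that rewrite, assuming hypothetical Extrinsic and verify_all stand-ins rather than the real FRAME types:

struct Extrinsic {
    signature: Vec<u8>,
    payload: Vec<u8>,
}

// Stand-in for a cryptographic batch verifier over all collected pairs.
fn verify_all(batch: &[(&[u8], &[u8])]) -> bool {
    !batch.is_empty() // real code would verify the whole batch at once
}

fn execute_block(extrinsics: &[Extrinsic]) -> Result<(), &'static str> {
    // Pass 1: collect every (signature, payload) pair.
    let batch: Vec<(&[u8], &[u8])> = extrinsics
        .iter()
        .map(|xt| (xt.signature.as_slice(), xt.payload.as_slice()))
        .collect();

    // Verify everything in one go before any dispatch.
    if !verify_all(&batch) {
        return Err("invalid signature in block");
    }

    // Pass 2: dispatch the (now trusted) extrinsics.
    for (i, _xt) in extrinsics.iter().enumerate() {
        println!("dispatching extrinsic {i}");
    }
    Ok(())
}

fn main() {
    let block = vec![Extrinsic { signature: vec![1], payload: vec![2] }];
    assert!(execute_block(&block).is_ok());
}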

@burdges

burdges commented May 9, 2021

If I understand correctly, our runtime code has no idea whether its host mutates real storage or checks Merkle proofs.

In PoV verification, we have a minimized shadow of the storage attached, so parsing the block triggers another thread that checks all the attached hashes, and then get host calls return immediately with the desired value, because this other thread kills the block if they turn out wrong.

We could similarly make put calls return immediately too, but we must enqueue rehashing of their parents. We'd need logic that unifies put calls with other put calls and, in future, with other runtime threads, à la https://github.com/w3f/research-internal/issues/144 and paritytech/substrate#1459. All these hashes could still be checked in other threads, though.

It's easier to do signatures in another thread, of course. I'd previously thought Merkle proof verification should occupy more CPU than signature verification, but actually this sounds unlikely.
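
A toy illustration of that hash-checking thread, with a hypothetical node_hash standing in for the real trie hashing and an atomic flag standing in for "killing the block":

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Hypothetical stand-in for the trie node hash.
fn node_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

fn main() {
    // (claimed hash, node data) pairs attached to the PoV block.
    let attached: Arc<Vec<(u64, Vec<u8>)>> = Arc::new(vec![
        (node_hash(b"value-a"), b"value-a".to_vec()),
        (node_hash(b"value-b"), b"value-b".to_vec()),
    ]);
    let block_valid = Arc::new(AtomicBool::new(true));

    // Checker thread: verify every attached hash in the background and
    // mark the block invalid if any node was forged.
    let (nodes, valid) = (Arc::clone(&attached), Arc::clone(&block_valid));
    let checker = thread::spawn(move || {
        for (claimed, data) in nodes.iter() {
            if node_hash(data) != *claimed {
                valid.store(false, Ordering::SeqCst);
            }
        }
    });

    // Meanwhile, host `get` calls can return node data immediately,
    // without waiting for the hash checks to finish.
    let _value: &[u8] = &attached[0].1;

    // The final verdict waits on the checker thread.
    checker.join().unwrap();
    assert!(block_valid.load(Ordering::SeqCst));
}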

@stale

stale bot commented Jul 7, 2021

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the A5-stale label Jul 7, 2021
@bkchr bkchr removed the A5-stale label Nov 15, 2021
@librelois
Contributor

@bkchr Anything new on signature batch verification?
We would really like to have this for Moonbeam, as we think it would significantly reduce the execution time of a block.
If necessary, we (PureStake) are ready to submit a PR that implements the design proposed in the first post of this issue.

@burdges

burdges commented Nov 19, 2021

As an aside, I've added a "half-aggregation" PreparedBatch approach in schnorrkel. It does not speed up verification beyond regular batch verification, but it saves 32 bytes of block space per signature. Alone, I doubt this matters much, and it's quite a large change, but…

You cannot extract and replay individual signatures from blocks using PreparedBatch, which

  • creates problems if you must repeat individual signatures in dispute procedures, but also
  • prevents other collators from replaying signatures they were never authorized to post on-chain. It thus provides one component of a weak MEV defense system, with the later Sassafras phase of removing the mempools playing the other part.

All this is not immediately relevant, I think, but maybe it's worth discussing this option one day, especially once we get Sassafras back on track (it being off track is entirely my fault).

@bkchr
Member Author

bkchr commented Nov 19, 2021

@bkchr Anything new on signature batch verification? We would really like to have this for Moonbeam, as we think it would significantly reduce the execution time of a block. If necessary, we (PureStake) are ready to submit a PR that implements the design proposed in the first post of this issue.

Hey, we would be open to a PR for this. However, didn't you say that this doesn't work for your use case, because you need to reconstruct the sender?

@librelois
Contributor

Hey, we would be open to a PR for this.

Ok, so maybe we'll propose one :)

However, didn't you say that this doesn't work for your use case, because you need to reconstruct the sender?

Yes, that's right, but we can extract the sender on the client side and add this information to the Frontier extrinsic (made by the client from an Ethereum transaction), so it's not blocking.

@bkchr
Member Author

bkchr commented Nov 19, 2021

Okay fine :)

I think we should not use any ed25519-style batch signature verification mechanisms here. While they are faster for doing batched signature verification, they require that you have all signatures "ready". In our case, it would just be much better to do the verification in the background. Maybe we should rename this entire feature to "background signature verification". The advantage would be that we could execute the extrinsic and do the signature verification at the same time :)
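
A minimal sketch of such background verification, assuming one std worker thread per signature (a real implementation would presumably use a pool of host-side workers); verify_signature and execute_extrinsic are hypothetical stand-ins:

use std::sync::mpsc;
use std::thread;

fn verify_signature(_sig: &[u8], _msg: &[u8]) -> bool {
    true // real code would call sr25519/ed25519 verification here
}

fn execute_extrinsic(index: usize) {
    // Stand-in for actually dispatching the extrinsic.
    println!("executed extrinsic {index}");
}

fn main() {
    let (tx, rx) = mpsc::channel::<bool>();
    let mut pending = 0;

    for i in 0..3usize {
        // Kick off the signature check in the background...
        let tx = tx.clone();
        let (sig, msg) = (vec![i as u8], vec![i as u8]);
        thread::spawn(move || tx.send(verify_signature(&sig, &msg)).unwrap());
        pending += 1;

        // ...and execute the extrinsic without waiting for the result.
        execute_extrinsic(i);
    }

    // At the end of the block, every check must have succeeded.
    let all_valid = (0..pending).all(|_| rx.recv().unwrap());
    assert!(all_valid);
}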

@librelois
Contributor

librelois commented Nov 19, 2021

I definitely agree. Batch verification, in the cryptographic sense, is a different feature. Ideally, we would like to have both in the long run, but in the short run we propose to contribute the background verification only.

@librelois
Contributor

@bkchr here is a first draft for the PR. I wrote some questions in the description; I hope you can answer them: paritytech/substrate#10353

@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added the I5-enhancement and T1-FRAME labels and removed the J0-enhancement label Aug 25, 2023