Optimizing EIP-4844 transaction validation for mempool (using KZG proofs) #5088
Conversation
Great to see that this optimization was implemented
FWIW, I'm working on some explanation and code for the optimizations discussed at the end of the post. WIP.
Force-pushed from 6dae5c8 to b967f25
In the previous post of this PR, we mentioned a possible optimization that allows us to verify any number of blob commitments in approximately the same time it takes us to verify a single blob commitment. We just pushed two commits to this PR that implement the technique (603ab2a and 7d3b449), so let's dive into how it all works.

Spoiler: it's all about taking random linear combinations and KZG having a neat algebraic structure.

We will demonstrate the technique for a transaction with two blobs, but it can be generalized to an arbitrary number of blobs. Here is a transaction with two blobs (and hence two commitments and two proofs) arranged in matrix form:
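As a toy sketch of the aggregation step (illustrative names, and a small prime standing in for the BLS scalar field; the real scheme applies the same coefficients to the commitments as well):

```python
# Toy sketch: combine two blobs column-wise with powers of a random
# challenge r. Illustrative only -- the spec works over BLS_MODULUS and
# aggregates the commitments with the same coefficients.
MODULUS = 101  # toy prime standing in for BLS_MODULUS

def vector_lincomb(vectors, scalars):
    """Linear combination of each column of `vectors` with `scalars`."""
    result = [0] * len(vectors[0])
    for vector, scalar in zip(vectors, scalars):
        result = [(acc + scalar * v) % MODULUS for acc, v in zip(result, vector)]
    return result

blobs = [[1, 2, 3], [4, 5, 6]]  # two tiny "blobs"
r = 7                           # random challenge; coefficients are [1, r]
aggregated_blob = vector_lincomb(blobs, [1, r])  # -> [29, 37, 45]
```

A cheating prover would need to make the error terms of several bad blobs cancel under coefficients it cannot predict, which is why the challenge must be derived after the blobs are fixed.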
The previous post's verification logic would go over each blob and verify its commitment using the corresponding proof. The time to verify each proof is 2.5ms, and hence this would take us 2.5ms * 2 = 5ms. Let's dig a bit deeper now and see how we can minimize the verification cost.

The rough idea is that instead of verifying each blob individually, we combine all blobs into a single aggregated blob using a random linear combination. This new aggregated blob corresponds to a single polynomial (let's call it the aggregated polynomial).

In the diagram below, we show how we produce the aggregated blob and aggregated commitment via a random linear combination using a random scalar `r`.

So right now we have a single aggregated blob that we need to check against this new aggregated commitment. In terms of security, the fact that we used a random linear combination means that checking the aggregated commitment against the aggregated blob is practically equivalent to checking each individual blob against its individual commitment. This is a classic argument in cryptography when aggregating proofs.

The only missing part now is how to actually check the aggregated commitment against the aggregated blob. To do this verification, the transaction creator includes an aggregated proof in the transaction, which the verifier checks using the Barycentric formula technique of the previous post. There is no need for individual blob proofs anymore, and hence the transaction looks a bit like this:
To summarize, the verifier creates an aggregated commitment and an aggregated blob using a random linear combination and verifies them using a provided aggregated proof. Now let's analyze the computational cost of the above procedure:
The computational cost is dominated by the KZG proof verification, and hence we expect the total procedure to take about 3.5ms (benchmarks pending). This pretty much brings the cost of verifying EIP-4844 blocks and transactions to a near-optimal level: instead of verifying a linear number of KZG proofs, we verify a single KZG proof and instead perform a linear number of finite-field operations (which are cheap).

Post written with the invaluable help of @adietrichs. Code once again stolen from @dankrad's danksharding code with minor modifications.
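The coefficients of the linear combination are the successive powers of the challenge, computed with a simple field-multiplication loop (a minimal sketch; the function name is illustrative, though the loop mirrors the `current_power` snippet quoted later in this thread):

```python
# Sketch of computing [1, r, r^2, ...] for the random linear combination.
BLS_MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def compute_powers(x: int, n: int) -> list:
    """Return [x**0, x**1, ..., x**(n-1)], each reduced mod BLS_MODULUS."""
    powers = []
    current_power = 1
    for _ in range(n):
        powers.append(current_power)
        current_power = current_power * x % BLS_MODULUS
    return powers
```

This is the "linear finite field operations" part of the cost: one multiplication per blob, negligible next to a pairing.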
Force-pushed from 201a077 to 6df884c
All code pretty much straight up copied from ethereum/EIPs#5088
Added some comments to assist Dankrad with code review, by highlighting the differences between this code and the Danksharding PR.
EIPS/eip-4844.md
Outdated
```python
    Compute the modular inverse of x using the eGCD algorithm
    i.e. return y such that x * y % BLS_MODULUS == 1 and return 0 for x == 0
    """
    if x == 0:
```
Difference from danksharding PR: Check for x == 0
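A minimal sketch of an inverse helper with that zero check (the `bls_modular_inverse` name follows the spec code quoted below; this sketch uses Python's built-in `pow(x, -1, m)` from Python 3.8+ instead of an explicit eGCD):

```python
# Sketch: modular inverse with the x == 0 guard discussed above.
BLS_MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def bls_modular_inverse(x: int) -> int:
    """Return y such that x * y % BLS_MODULUS == 1, and 0 for x == 0."""
    if x == 0:
        return 0  # zero has no inverse; the spec returns 0 in this case
    return pow(x, -1, BLS_MODULUS)
```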
```python
    return x * inv(y) % MODULUS
```
```python
def evaluate_polynomial_in_evaluation_form(poly: List[BLSFieldElement], x: BLSFieldElement) -> BLSFieldElement:
```
Difference from danksharding PR: Switch barycentric code with this one from the research repo: https://github.com/ethereum/research/blob/master/verkle_trie/kzg_utils.py#L35
The barycentric formula code from the danksharding PR was not giving the right results based on some rudimentary tests.
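For reference, the barycentric formula in question evaluates a polynomial given in evaluation form as f(x) = (x^N - 1)/N * Σ d_i·w_i/(x - w_i), where the w_i are the N-th roots of unity. A toy sketch over a small prime (the modulus, width, and roots below are illustrative, not the spec's constants):

```python
# Toy illustration of barycentric evaluation in evaluation form.
MODULUS = 17
WIDTH = 4
ROOTS_OF_UNITY = [1, 4, 16, 13]  # powers of 4, which has order 4 mod 17

def inv(a: int) -> int:
    return pow(a % MODULUS, -1, MODULUS)

def evaluate_polynomial_in_evaluation_form(poly, x):
    """Evaluate at x (x must not itself be a root of unity)."""
    total = 0
    for d, w in zip(poly, ROOTS_OF_UNITY):
        total = (total + d * w * inv(x - w)) % MODULUS
    return total * (pow(x, WIDTH, MODULUS) - 1) * inv(WIDTH) % MODULUS

# poly is f(X) = X in evaluation form (its values at the roots of unity)
assert evaluate_polynomial_in_evaluation_form([1, 4, 16, 13], 2) == 2
```

Only field additions, multiplications, and inversions are needed, which is what makes this cheap compared to reconstructing the commitment.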
```diff
@@ -46,6 +46,7 @@ Compared to full data sharding, this EIP has a reduced cap on the number of thes
 | `BLS_MODULUS` | `52435875175126190479447740508185965837690552500527637822603658699938581184513` |
 | `KZG_SETUP_G2` | `Vector[G2Point, FIELD_ELEMENTS_PER_BLOB]`, contents TBD |
 | `KZG_SETUP_LAGRANGE` | `Vector[KZGCommitment, FIELD_ELEMENTS_PER_BLOB]`, contents TBD |
+| `ROOTS_OF_UNITY` | `Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB]` |
```
Difference from danksharding PR: Add the roots of unity list as a global constant instead of having explicit code that generates them on demand.
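A sketch of how such a table could be generated from a primitive root of unity (toy constants for illustration; the spec's table has `FIELD_ELEMENTS_PER_BLOB = 4096` entries over `BLS_MODULUS`):

```python
# Sketch: generating a roots-of-unity table from a primitive root.
MODULUS = 17
WIDTH = 4
PRIMITIVE_ROOT_OF_UNITY = 4  # has multiplicative order WIDTH mod 17

ROOTS_OF_UNITY = [pow(PRIMITIVE_ROOT_OF_UNITY, i, MODULUS) for i in range(WIDTH)]
# -> [1, 4, 16, 13]
```

Precomputing this as a constant trades a small amount of spec text for not having root-generation code on the verification path.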
Force-pushed from 3cb77a8 to 08e5f92
There is no `tx.message.blob_commitments` anymore, or `kzg_to_commitment()`
To validate a 4844 transaction in the mempool, the verifier checks that each provided KZG commitment matches the polynomial represented by the corresponding blob data:

| d_1 | d_2 | d_3 | ... | d_4096 | -> commitment

Before this patch, to do this validation, we reconstructed the commitment from the blob data (the d_i above) and checked it against the provided commitment. This was expensive because computing a commitment from blob data (even using the Lagrange basis) involves N scalar multiplications, where N is the number of field elements per blob. Initial benchmarking showed that this was about 40ms for N=4096, which was deemed too expensive. For more details see: https://hackmd.io/@protolambda/eip-4844-implementer-notes#Optimizations and protolambda/go-ethereum#4

In this patch, we speed this up by providing a KZG proof for each commitment. The verifier can check that proof to ensure that the KZG commitment matches the polynomial represented by the corresponding blob data:

| d_1 | d_2 | d_3 | ... | d_4096 | -> commitment, proof

To do so, we evaluate the blob data polynomial at a random point `x` to get a value `y`. We then use the KZG proof to ensure that the committed polynomial (i.e. the commitment) also evaluates to `y` at `x`. If the check passes, it means that the KZG commitment matches the polynomial represented by the blob data.

This is significantly faster since evaluating the blob data polynomial at a random point using the Barycentric formula can be done efficiently with only field operations (see https://hackmd.io/@vbuterin/barycentric_evaluation). Then, verifying a KZG proof takes two pairing operations (which take about 0.6ms each). This brings the total verification cost to about 2ms per blob. With some additional optimizations (using linear combination tricks like the ones linked above) we can batch all the blobs together into a single efficient verification, and hence verify the entire transaction in 2.5ms.
The same techniques can be used to efficiently verify blocks on the consensus side.
Also abstract `lincomb()` out of the `blob_to_kzg()` function to be used in the verification.
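The random point `x` above comes from a Fiat-Shamir hash over the prover-visible data. An illustrative sketch of the challenge derivation (the actual spec hashes SSZ-serialized objects; this just hashes raw bytes, and the input string is a placeholder):

```python
import hashlib

# Illustrative Fiat-Shamir challenge: hash the transcript into a field
# element so the prover cannot choose the evaluation point.
BLS_MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def hash_to_bls_field(data: bytes) -> int:
    """Map bytes to a scalar in [0, BLS_MODULUS)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "little") % BLS_MODULUS

x = hash_to_bls_field(b"blob data || commitments")  # placeholder transcript
```

Deriving `x` from everything the prover has already committed to is what makes the single evaluation check binding.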
Force-pushed from 08e5f92 to ea9ae70
All tests passed; auto-merging...(pass) eip-4844.md
Rebased and force pushed because of conflicts with #5106
Triggering bot.
Some implementation issues.
```python
    assert width == FIELD_ELEMENTS_PER_BLOB
    inverse_width = bls_modular_inverse(width)

    for i in range(width):
```
initialize r
```diff
-for i in range(width):
+r = 0
+for i in range(width):
```
### Helpers

Converts a blob to its corresponding KZG point:

```python
def lincomb(points: List[KZGCommitment], scalars: List[BLSFieldElement]) -> KZGCommitment:
```
Although it's from the EL's point of view, this EIP uses SSZ to define the parameters and constants. It would be better to use the more general `Sequence` type so that (i) it can accept all sequence types (basic Python sequences and SSZ sequences), (ii) there is less confusion, and (iii) it is similar to the CL specs.
```diff
-def lincomb(points: List[KZGCommitment], scalars: List[BLSFieldElement]) -> KZGCommitment:
+def lincomb(points: Sequence[KZGCommitment], scalars: Sequence[BLSFieldElement]) -> KZGCommitment:
```
```python
    return x * bls_modular_inverse(y) % BLS_MODULUS


def evaluate_polynomial_in_evaluation_form(poly: List[BLSFieldElement], x: BLSFieldElement) -> BLSFieldElement:
```
ditto
```diff
-def evaluate_polynomial_in_evaluation_form(poly: List[BLSFieldElement], x: BLSFieldElement) -> BLSFieldElement:
+def evaluate_polynomial_in_evaluation_form(poly: Sequence[BLSFieldElement], x: BLSFieldElement) -> BLSFieldElement:
```
```python
        current_power = current_power * int(x) % BLS_MODULUS
    return powers


def vector_lincomb(vectors: List[List[BLSFieldElement]], scalars: List[BLSFieldElement]) -> List[BLSFieldElement]:
```
```diff
-def vector_lincomb(vectors: List[List[BLSFieldElement]], scalars: List[BLSFieldElement]) -> List[BLSFieldElement]:
+def vector_lincomb(vectors: Sequence[Sequence[BLSFieldElement]], scalars: Sequence[BLSFieldElement]) -> Sequence[BLSFieldElement]:
```
```python
    Given a list of vectors, compute the linear combination of each column with `scalars`, and return the resulting
    vector.
    """
    r = [0]*len(vectors[0])
```
```diff
-    r = [0]*len(vectors[0])
+    r = [0] * len(vectors[0])
```
```python
    number_of_blobs = len(blobs)

    # Generate random linear combination challenges
    r = hash_to_bls_field([blobs, commitments])
```
`hash_to_bls_field` accepts a `Container`. I think it needs to define a Container and cast the type here:

```python
class BlobsAndCommitments(Container):
    blobs: List[Blob, MAX_BLOBS_PER_BLOCK]
    blob_kzgs: List[KZGCommitment, MAX_BLOBS_PER_BLOCK]
```

and do

```python
r = hash_to_bls_field(BlobsAndCommitments(blobs=blobs, blob_kzgs=commitments))
```
```python
    aggregated_poly = vector_lincomb(blobs, r_powers)

    # Generate challenge `x` and evaluate the aggregated polynomial at `x`
    x = hash_to_bls_field([aggregated_poly, aggregated_poly_commitment])
```
ditto, need casting
…ofs) (ethereum#5088)

* Fix missing variables/funcs in validate_blob_transaction_wrapper(): there is no `tx.message.blob_commitments` anymore, or `kzg_to_commitment()`
* Introduce KZGProof as its own type instead of using KZGCommitment
* Introduce high-level logic of new efficient transaction validation (the full commit message is reproduced earlier in this thread)
* Introduce polynomial helper functions for transaction validation
* Implement high-level logic of aggregated proof verification
* Add helper functions for aggregated proof verification; also abstract `lincomb()` out of the `blob_to_kzg()` function to be used in the verification
* Fixes after review on the consensus PR
Hello,
this PR uses KZG proofs to speed up the procedure of validating blobs of EIP-4844 transactions in the mempool.
With the current EIP-4844 proposal it takes about 40ms to validate the blobs, and we received concerns that it would be too slow for mempool validation. With this PR we can bring the verification time of the entire transaction to about 3.5 ms regardless of the number of blobs included (also see subsequent post on this PR).
Details
To validate a 4844 transaction in the mempool, the verifier checks that each provided KZG commitment matches the polynomial represented by the corresponding blob data (see `validate_blob_transaction_wrapper()`):

| d_1 | d_2 | d_3 | ... | d_4096 | -> commitment

Before this patch, to do this validation, we reconstructed the commitment from the blob data (the `d_i` above) and checked it against the provided commitment. This was expensive because computing a commitment from blob data (even using the Lagrange basis) involves N scalar multiplications, where N is the number of field elements per blob. Initial benchmarking showed that this was about 40ms for N=4096, which was deemed too expensive.
In this patch, we speed this up by providing a KZG proof for each commitment. The verifier can check the proof to ensure that the KZG commitment matches the polynomial represented by the corresponding blob data.
To do so, we evaluate the blob data polynomial at a random point `x` to get a value `y`. We then use the KZG proof to ensure that the committed polynomial (i.e. the commitment) also evaluates to `y` at `x`. If the check passes, it means that the KZG commitment matches the polynomial represented by the blob data.

This is significantly faster since evaluating the blob data polynomial at a random point using the Barycentric formula can be done efficiently using only field operations. Then, verifying a KZG proof takes two pairing operations (which take about 0.6ms each). This brings the total verification cost to about 2ms per blob.
Drawbacks
The main drawback of this technique is that it requires an implementation of the Barycentric formula. You can see in the PR that it's not that much code, but it's still an increase in required math. All the math code in this PR has been stolen from the Danksharding PR (ethereum/consensus-specs#2792) with a few simplifications and bug fixes.
It also very slightly increases the transaction size: 48 bytes per blob (each blob is 128kb, so the proof is 48 bytes of overhead).
Optimizations
There are a bunch of optimizations that can/should be done here:
We can aggregate and verify all blobs using random linear combinations to bring the validation time to 2.5ms per transaction (instead of per blob). See my next post in this PR on how this is done.
We can apply the same technique on the consensus side, which will allow verifying the blobs of the entire block (up to 16 blobs) much more efficiently.
Also, you can see that this patch removes a call to `blob_to_kzg()`. The only other use of `blob_to_kzg()` is in the blob verification precompile. This means that if that precompile gets removed (as discussed in the past, in favor of the point verification one), we can completely remove `blob_to_kzg()` and the structures associated with it.

Kudos to @dankrad for suggesting this approach and for the danksharding code.
Thanks to @adietrichs for proofreading and for catching a mistake in the Fiat-Shamir computation.