Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP4844: Add check to ensure Blobs are canonical #3057

Closed
kevaundray opened this issue Oct 23, 2022 · 8 comments
Closed

EIP4844: Add check to ensure Blobs are canonical #3057

kevaundray opened this issue Oct 23, 2022 · 8 comments

Comments

@kevaundray
Copy link
Contributor

Problem

When we receive a blob as a sequence of bytes and interpret it as a integer mod p, we do not check that the byte representation is canonical.

Example

Lets say p is 5.

If I have a byte array b1 which encodes the integer 2 and a byte array b2 which encodes the integer 7. When I convert both byte arrays to an integer mod 5, they will both produce the value 2.

This can be a problem because two different blobs will produce the same commitment.

Solution

Check that the integer mod p when converted back to a byte array, does indeed produce the original byte array.

@Inphi
Copy link
Contributor

Inphi commented Oct 24, 2022

This is sorta specified in the BLSFieldElement type description used to represent each point in a blob. While it should also be explicitly done in the executable spec, it'll depend on client implementations where to make such checks. Since the blobs are sourced from execution, clients can only trust that blobs are encoded correctly.

@kevaundray
Copy link
Contributor Author

This is sorta specified in the BLSFieldElement type description used to represent each point in a blob. While it should also be explicitly done in the executable spec, it'll depend on client implementations where to make such checks. Since the blobs are sourced from execution, clients can only trust that blobs are encoded correctly.

After 3038, is it possible that these checks are moved solely to the cryptography functions that create and verify proofs?

@Inphi
Copy link
Contributor

Inphi commented Oct 24, 2022

After 3038, is it possible that these checks are moved solely to the cryptography functions that create and verify proofs?

That won't do since the blob encoding is up to the user. For example, Blobs could be packed tightly where unused bits in the field element could be used to encode the next byte. This is a non-canonical but valid usecase.
Since encoding isn't context-free, only the user/encoder will be able to determine whether the blobs should be canonical.

@kevaundray
Copy link
Contributor Author

After 3038, is it possible that these checks are moved solely to the cryptography functions that create and verify proofs?

That won't do since the blob encoding is up to the user. For example, Blobs could be packed tightly where unused bits in the field element could be used to encode the next byte. This is a non-canonical but valid usecase.

Since encoding isn't context-free, only the user/encoder will be able to determine whether the blobs should be canonical.

Ah I didn't know that, so blobs do not need to be canonical and in fact this should not be checked by the cryptography code since it doesn't have the context to decide this.

Can you explain why it would not be a problem, if two different blobs, A and B, produced the same commitment, where in both cases the encoder is expecting a non-canonical blob?

In particular, what context can one use to determine that blob A is the correct non-canonical blob?

@Inphi
Copy link
Contributor

Inphi commented Oct 25, 2022

Yup, the cryptography code shouldn't check the encoding.

If two pieces of data, A and B, generate the same blob and thus commitment, then that's a problem with the encoding. I'd attribute that to User Error and not a real problem that can be solved by the specs.

@Inphi
Copy link
Contributor

Inphi commented Oct 25, 2022

What we're trying to solve with "Data Availability" is really "Blob Availability". (and maybe we should rename it for accuracy :-)

@protolambda
Copy link
Collaborator

I don't like the idea of quietly applying a modulus or bit truncation on input that's otherwise invalid, +1 on making the the input validation on crypto functions strict. The cryptography code maybe shouldn't have the responsibility to actually perform the checks if we can demand valid inputs from the user before the crypto function is called, but we should make the specs strict on input.

@kevaundray
Copy link
Contributor Author

Closing as #3038 has been merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants