Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MiMC Hashing #119

Closed
fredlacs opened this issue Apr 14, 2019 · 9 comments
Closed

MiMC Hashing #119

fredlacs opened this issue Apr 14, 2019 · 9 comments

Comments

@fredlacs
Copy link

Hey Harry,

Interesting work you've got going here!
I was looking at using your Solidity implementation of MiMC together with circomlib's javascript implementation, but it doesn't seem to give the output I expected, as seen here.

From what I've seen so far both use the same seed "mimc" and same number of rounds. I tried playing around with the inputs a bit but out of luck so far.

Could you help me out? :)

@HarryR
Copy link
Owner

HarryR commented Apr 17, 2019

Hi

There is possibly one difference between the implementations: MiMC.sol uses the 'Miyaguchi–Preneel' compression function construct, and the MiMC from https://github.com/iden3/circomlib/blob/77928872169b7179c9eee545afe0a972d15b1e64/src/mimc7.js#L47 uses Merkle-Damgard construct.

The difference being that with Merkle-Damgard construct the previous output is used for the next key.

And Miyaguchi-Preneel is an extension of this construct which mixes the previous output and current message into the output of the function.

Which one should we use? Because one of the projects needs to change it in order to be compatible with the other. I think either Miyaguchi-Preneel or Davies-Meyers should be used as they are more directly proved using the ideal cipher model (which MiMC seems to resemble very closely)

For more information, see:

Please note though, that padding isn't applied to the message, and special consideration needs to be taken.

@fredlacs
Copy link
Author

fredlacs commented Apr 20, 2019

Thank you for the great answer @HarryR . I believe the circom team can provide a much better answer than myself! I went ahead and opened a issue in their repo :)
iden3/circom#29

@barryWhiteHat
Copy link

Seems like this is a natural thing to standardize. Lets check with the circom team :)

@jbaylina
Copy link

I'm not qualified to analyse from the security perspective which option is better. Would be great to have the insights of an expert on the subject.

From the implementation point of view, Miyaguchi–Preneel requires an XOR and a key adoption (function g ) in each step. This makes absolutely no sense to have it inside the snark. We go from 1 or 2 constraints per cicle to more than 250 constraints per cicle in the best case.

We may change the XOR for a Field addition, but I believe it's an ERROR from the security perspective to do it. But in any case, it would be very convenient this to be decided by an expert on the subject.

@HarryR
Copy link
Owner

HarryR commented Apr 21, 2019

From the implementation point of view, Miyaguchi–Preneel requires an XOR and a key adoption (function g ) in each step. This makes absolutely no sense to have it inside the snark. We go from 1 or 2 constraints per cicle to more than 250 constraints per cicle in the best case.

The key-adaption function g is only necessary to transform keys of different sizes into the size required by the PRP, in the case of MiMC the output is the same size as the key.
Secondly, the XOR operation is equivalent to addition over the field. What do you mean by ERROR?

There are two relevant papers:

In PGV'93 (fig 1, pg 3), they describe the general model for a one-way compression function built upon a block cipher.

The construct used by Circom is:

  • P = X_i
  • K = H_{i-1}
  • FF = V

Where V is essentially zero, so it doesn't modify the output. That is, the previous output is used as the key for the next input, the output (C) is not modified as FF=V.

Subsequently in §4 Table 1, it describes the attacks based on the choice of FF, K and P. With P = X_i, K = H_{i-1} and FF=V it shows that the scheme (1) is susceptible to a direct attack. The attacks are described in §3 pg 3 (A Taxonomy of Attacks).

Direct Attack (D): given H_{i−1} and H_i, it is easy to find X_i. All schemes that are vulnerable to a direct attack can in principle be used for encryption, where the encryption of X_i is given by H_i. Of course the CBC and CFB mode belong to this class.
... The order of these attacks has some importance: the possibility of a direct attack
means that a forward and a backward attack are also feasible, but the converse does not hold.

That is to say, the mechanism used by Circom is susceptible to the Direct Attack and all subsequent attacks.

The Miyaguchi–Preneel construct used in Ethsnarks has the following parameters:

  • P = X_i
  • K = H_{i-1}
  • FF = X_i + H_{i-1}

Referring back to the table, this is scheme 10, which has the checkmark:

If none of these five attacks applies, a √ is put in the corresponding entry.

Which is to say, categorically, that the Miyaguchi-Preneel construct is not vulnerable to any of the attacks described in the PGV'93 paper, it is one of the 4 one-way-compression-functions analysed where none of the five attacks apply, with the remaining being 'insecure' to some degree. With 12 overall being generally considered secure.

The BRS'02 paper takes the PGV paper as a starting point for further analysis, providing explicit upper and lower bounds for both collision-resistance and inversion-resistance in a black-box model and a lot more insight.

Based on this analysis, my conclusion is that Circom needs to adopt the Miyaguchi-Preneel scheme, and there is no need to change the Ethsnarks implementation as it is optimal.

@jbaylina
Copy link

I'm saying ERROR because I fill that changing a cryptographic algorithm is an ERROR. An XOR is not the same that a field addition.

There is one paragraph in the paper that gives me some hope:
Theexoroperationwas chosenbecauseit has been usedin the proposalsthatare generalizedhere;one can show thatit can be replacedby any operationthatis an easy-to-invert permutationof one of its inputswhenthe secondinputis Øxed.Themainrestrictionsof thismodel are thatonly1 DESoperationisusedper roundfunctionand thatthe internalmemoryof the hashfunctionisrestrictedto a singlen-bit block

But I'm still not sure.

In any case, adapting the CIRCOM circuit to this configuration with the field addition instead of the XOR, it is easy. In this case, function g is not required.

@HarryR
Copy link
Owner

HarryR commented Apr 21, 2019

The XOR operation is equivalent to adding the coefficients in GF(2) together for two polynomials representing GF(2^n).

123 = 01111011
240 = 11110000
139 = 10001011
123^240 = 139

Where each n bits are coefficients in GF(2). You can see that adding each of the bits (coefficients) together in GF(2) is equivalent to the XOR operation:

a b =
0 0 0
1 0 1
0 1 1
1 1 0

This can be generalised to both GF(p) and GF(p^n), and behaves identically (e.g. it's commutative and associative).

Not sure if explained accurately, but they are equivalent.

@barryWhiteHat
Copy link

Hmmm is the circom implementation based upon literature recommendation?

@HarryR are your concerns related to what the literature recommends or to what jordi has implemented ?

@HarryR
Copy link
Owner

HarryR commented Apr 25, 2019

The concern is this:

The MiMC paper suggests using the Sponge construct to create a hashing function from the MiMC block cipher, however the sponge construct isn't feasible to do inside a zkSNARK.

I was searching for a different way to efficiently use MiMC as a hash function and came across the Davies-Meyer and Miyaguchi–Preneel constructs, and chose to use the Miyaguchi–Preneel construct.

Circom has implemented it slightly differently, in a way which doesn't mix the input or key with the output.

The most referenced/cited papers I can find regarding the security of one-way-compression functions state that the mode used by Circom is susceptible to attacks, whereas the Miyaguchi–Preneel construct (used by ethsnarks) isn't vulnerable to any of the attacks they analyse.

However, this is a black-box analysis in the ideal cipher model, and I don't have examples of a successful attack, but it's enough for me to raise an alarm after having read those papers.

I guess the debate is that we have two slightly differing implementations which are incompatible with each other due to the choice of compression function construct, and my suggestion is to align it with the Ethsnarks implementation which uses the Miyaguchi-Preneel construct rather than the other way round, given the analysis above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants