(Heads up) Switch to SHA256 #612

Closed
JustinDrake opened this issue Feb 12, 2019 · 12 comments

@JustinDrake
Collaborator

JustinDrake commented Feb 12, 2019

Hash function compatibility between Eth1 and Eth2 is important for several reasons:

  • Eth1 deposits into Eth2
  • Eth1 finalisation using Eth2 (unlocking two-way transfers of ETH)
  • Eth1 execution engines using Eth2 data availability

In December 2018 we ditched Blake2b on Eth2 because of incompatibility with Eth1. In doing so we fell back to Eth1's native Keccak256. It turns out Eth1 has a SHA256 precompile. This opens up the possibility to use SHA256 on Eth2. Below is a breakdown of the pros and cons of SHA256 vs Keccak256.

Pros

  • Interoperability: SHA256 is the de facto blockchain standard. It is adopted by Bitcoin, Filecoin, Algorand, Chia, Dfinity, Cosmos, Bitcoin Cash, Litecoin, EOS, Tron, etc. (A notable exception is Polkadot, which is planning to use Blake2b and xxHash. This is somewhat ironic, as a key selling point of Polkadot is interoperability.)
  • Speed: SHA256 is ~50% faster than Keccak256. As pointed out by @zmanian, hardware support for SHA256 is improving, with speeds approaching the performance of Blake2.
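
A rough way to check the speed claim on your own machine is sketched below. This is only an illustration: Python's `hashlib` has no Keccak256, so `sha3_256` stands in for it (same permutation, different padding), and the 1 MiB payload and repeat count are arbitrary assumptions. Results depend heavily on the CPU and on how the underlying library was built.

```python
import hashlib
import timeit

def time_hash(name: str, data: bytes, repeats: int = 50) -> float:
    """Return the average seconds per call when hashing `data` with hashlib's `name`."""
    fn = getattr(hashlib, name)
    total = timeit.timeit(lambda: fn(data).digest(), number=repeats)
    return total / repeats

if __name__ == "__main__":
    payload = b"\x00" * (1 << 20)  # 1 MiB of input, an arbitrary choice
    for name in ("sha256", "sha3_256"):  # sha3_256 is a stand-in for Keccak256
        per_call = time_hash(name, payload)
        print(f"{name}: {len(payload) / per_call / 1e6:.1f} MB/s")
```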

Cons

  • Gas: The SHA256 precompile is twice the gas cost of the Keccak256 opcode. (This is arguably a mispricing because SHA256 is faster than Keccak256.)
  • Length extension: Unlike Keccak256, SHA256 does not provide built-in protection against length extension attacks. Various protections are possible.
  • Strength reduction: As pointed out by @mratsim, SHA256 is classified as having a "minor weakness" (see here and here), unlike Keccak256.
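
For the length extension point, two commonly cited protections are sketched below: hashing the digest a second time, and using HMAC whenever a secret key is involved. Neither is taken from the Eth2 spec; the function names `sha256d` and `keyed_tag` are made up for this example.

```python
import hashlib
import hmac

def sha256d(message: bytes) -> bytes:
    """Double SHA256: the outer hash hides the inner state an attacker would extend."""
    return hashlib.sha256(hashlib.sha256(message).digest()).digest()

def keyed_tag(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256: the standard construction when SHA256 is used with a secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()
```

Length extension mainly matters for constructions like `H(secret || message)`. Hashing fixed-size inputs, as in a binary Merkle tree, leaves little room for it.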

The goal of this issue is to provide a heads-up and encourage discussion. My personal gut feel is that interoperability alone outweighs the cons.

@axic
Member

axic commented Feb 12, 2019

As a side note, I'd point out that "ewasm on eth1.x" proposes to introduce blake2 support on Eth1, which may happen before the release of Eth2. This might be a point to consider.

@mratsim
Contributor

mratsim commented Feb 12, 2019

I don't have a strong preference but if I had to vote it would be against.

  1. When looking into those timelines, I feel like SHA2-256 will have critical issues soon.

[Screenshot: hash function timelines, from the source below]

Source: http://valerieaurora.org/hash.html

  2. Keccak256 vs SHA3-256 is already confusing in Eth1 and now we're adding SHA2-256.
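
To make the naming confusion concrete, here is a small sketch comparing the three functions on the empty input. Python's `hashlib` does not expose the pre-standard Keccak padding that Eth1 uses, so the Keccak256 value is hard-coded (it is the well-known Eth1 hash of empty input). The helper name `digests_of_empty` is just for illustration.

```python
import hashlib

# Keccak256 of the empty string as used by Eth1. Hard-coded because hashlib
# only provides the standardised SHA3-256, which pads differently.
KECCAK256_EMPTY = "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"

def digests_of_empty() -> dict:
    """Hex digests of the empty input under three similarly named functions."""
    return {
        "SHA2-256": hashlib.sha256(b"").hexdigest(),
        "SHA3-256": hashlib.sha3_256(b"").hexdigest(),
        "Keccak256": KECCAK256_EMPTY,
    }

if __name__ == "__main__":
    for name, digest in digests_of_empty().items():
        print(f"{name:10} {digest}")
```

All three outputs are 256 bits long and all three differ, so mixing up the names produces mismatched hashes rather than an obvious error.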

@mkalinin
Collaborator

  • Speed: SHA256 is ~50% faster than Keccak256.

From #218:

The performance benefits of Blake cannot be relied upon because STARK/SNARK-friendly hashes will likely be no faster than SHA3.

Has that assumption changed at all?

@JustinDrake
Collaborator Author

Has that assumption changed at all?

Speed of hash functions can be evaluated in different contexts. In the "plain-text model" (i.e. naive execution) SHA256 is faster than Keccak256. In the context of MPCs/SNARKs/STARKs all binary hash functions (SHA2, SHA3, Blake) are pretty terrible.

@JustinDrake
Collaborator Author

When looking into those timelines, I feel like SHA2-256 will have critical issues

I'd say that SHA2 only needs to survive another 5-10 years. The reason is that we intend to migrate to a STARK-friendly hash function when we make the cryptographic primitives quantum-secure with STARKs. Are there cryptanalysts who believe SHA2-256 will be broken within 10 years?

@zmanian

zmanian commented Feb 13, 2019

I'd say most cryptographers think a break of SHA2 is less likely than significant improvements in classical solutions to the discrete log problem or to factoring composite numbers.

Here was our reasoning for choosing SHA256 in the Cosmos Merkle Tree.
cosmos/iavl#38

I'd say the right mix of hash functions in any blockchain protocol is SHA256 and a generic SPONGE function from the KECCAK family for Merlin. https://docs.rs/merlin/1.0.2/merlin/

@JustinDrake
Collaborator Author

a generic SPONGE function from the KECCAK family for Merlin

Why not use SHA256 for Merlin?

@zmanian

zmanian commented Feb 13, 2019

So Merkle-Damgård style hash functions accumulate input and then produce one output.

Sponge constructions allow you to put in some input and take out some output and then put in some more input and then get some more output etc.

What's interesting about the KECCAK family is not when you are using them in the same way as Merkle-Damgård hashes but when you are using the unique properties of its SPONGE construction.
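
A minimal sketch of the difference, using what Python's `hashlib` offers: SHA256 gives one fixed-size digest, while SHAKE256 (a sponge from the Keccak family) lets you squeeze as many output bytes as you ask for. Note that `hashlib` only exposes the squeeze side. It does not support feeding more input after output has been taken, which is the interleaved usage described above. The function names are made up for this example.

```python
import hashlib

def one_shot(data: bytes) -> bytes:
    """Merkle-Damgård style usage: absorb all input, emit one fixed-size digest."""
    return hashlib.sha256(data).digest()

def squeeze(data: bytes, n_bytes: int) -> bytes:
    """Sponge style usage via SHAKE256: squeeze out however many bytes are requested."""
    return hashlib.shake_256(data).digest(n_bytes)

if __name__ == "__main__":
    transcript = b"some protocol transcript"
    short = squeeze(transcript, 32)
    long = squeeze(transcript, 64)
    # A longer squeeze continues the same output stream.
    print(long[:32] == short)
```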

@zmanian

zmanian commented Feb 13, 2019

This is mentioned in the IAVL thread, but SHA256 is getting hardware support in future Intel processors, which makes it as fast as BLAKE2.

@benjaminion
Contributor

Interesting suggestion from my colleague, Nicolas Gailly: https://multiformats.io/multihash/

This wouldn't be supportable natively in Eth1: the deposit contract would have to prepend the metadata to the hashes it outputs for the Merkle path, but that's relatively easy. This would give us good agility around hash functions for the foreseeable future.

It doesn't necessarily give us interop with other chains out-of-the-box, but could make that realistic with a simple shim layer to insert the appropriate hash metadata. Then we could interoperate with any chain using any of the hashes we choose to implement in the client.
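
For readers unfamiliar with multihash, the "prepend the metadata" step can be sketched roughly as follows. A multihash is a varint function code, a varint digest length, then the digest itself. The code `0x12` is the multihash table entry for sha2-256; the helper names `wrap` and `unwrap` are made up for this example, and this is not how any Eth2 client or the deposit contract actually does it.

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash table code for sha2-256

def encode_varint(n: int) -> bytes:
    """Unsigned LEB128 varint, the integer encoding multihash uses."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf: bytes) -> tuple:
    """Return (value, number_of_bytes_consumed)."""
    value, shift = 0, 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")

def wrap(code: int, digest: bytes) -> bytes:
    """Prefix a raw digest with its function code and length."""
    return encode_varint(code) + encode_varint(len(digest)) + digest

def unwrap(mh: bytes) -> tuple:
    """Split a multihash back into (function_code, raw_digest)."""
    code, n = decode_varint(mh)
    length, m = decode_varint(mh[n:])
    digest = mh[n + m:]
    if len(digest) != length:
        raise ValueError("digest length mismatch")
    return code, digest
```

For SHA256 the overhead is two bytes per hash (`0x12 0x20`), so a 32-byte digest becomes 34 bytes on the wire.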

@spble
Contributor

spble commented Feb 15, 2019

Just thought I'd weigh in with some thoughts. It seems that this problem can be viewed via one of two lenses:

  • Compatibility
  • Future-proofing

Compatibility: The advantages to using SHA256 for the sake of compatibility are clear, as are the implementation/speed advantages over Keccak (e.g. Intel's instruction sets). Backwards compatibility is the reason blake2b has been decided against for Eth2.0; however, there are some indicators that suggest Eth1 will eventually be able to compute blake2b efficiently, at which point it would make a lot of sense for Eth2.0 to use blake2b.
While maintaining backwards compatibility is clearly essential, I believe one of the Eth2.0 project goals is to pave the way for a better Ethereum overall; which I think means that it should exert a positive influence on Eth1. If we think that blake2b is a better function, I would hope that Eth1 can adapt in due course.
From my understanding, the main problem with using a hashing algorithm which is inefficient in Eth1, such as blake2b, is that it inhibits the ability to move data/ether from Eth2.0 back into Eth1; i.e. it prevents Eth1 from reading/verifying the Eth2.0 state. I imagine that regardless of the hash function, Eth1 will require an update before it can perform verification in any case, and so introducing a more efficient hash function in the same update will be comparatively easier. Also, this would only need to happen once Eth2.0 is well established, and Eth1 decides to support it.

Future-Proofing: Since Bitcoin's PoW mechanism uses SHA256, the global potential/expected hashes-per-second rate for SHA256 is far higher than for any other hash function. There exist clear incentives for developing faster hashing of, and more sophisticated attacks on, SHA256; demonstrated so far by Bitcoin's historical hashrate. I think that this increases the likelihood of hash collisions in the long term. As such, I don't think it's a good long-term strategy to depend solely on SHA256. Keccak256, however, does not currently suffer from this issue, and so I would consider it slightly more future-proof than SHA256, but much less than blake2b.
I also don't believe the length-extension attack is a disadvantage of SHA256 in our use case, reasons for which have been explained by @zmanian.

Personally, I feel the goal of choosing the correct technology with good future-proofing is more important than maintaining compatibility. In this case, I think blake2b is the best choice, and I can understand why Polkadot made this choice. However, if we have collectively decided to not use blake2b, then I don't think the future-proofness of Keccak256 outweighs the compatibility advantages of SHA256.

I think @benjaminion's suggestion of using multihash is a fantastic one. The implementation overhead for supporting multiple hashes in this format is negligible and it means we get the ultimate flexibility in choosing hash functions on-the-fly. I imagine that this would mean particular blocks or shards could select a hash function according to their goals. Maybe, in the beginning, only certain blocks need to be verified by Eth1, and those blocks can simply choose SHA256, while others can choose blake2b, thereby having selective interoperability and allowing us to adjust the slider between compatibility/future-proofing as we go. It also means that if vulnerabilities are discovered in any hashing algorithm, deprecating a function would be considerably easier. Further, multihash is maintained by Protocol Labs, so I assume it would have good support in libp2p.

Summary:
I prefer:

  1. blake2b
  2. SHA256
  3. Keccak256

But I don't think we should choose now and we should instead support multihash and allow any secure hashing algorithm.

@JustinDrake
Collaborator Author

I asked Dan Boneh

Do you think SHA256 will plausibly remain secure until 2030? What about 2040? SHA256 seems to be the "blockchain standard" we are inclined to favour, but concerns around its security have been raised. In particular, we are aware of the length extension attack on SHA256, and this website suggests "minor weaknesses".

and he responded

I guess you are asking about the collision resistance of SHA-256. There is nothing known about the full SHA-256 that is better than the birthday bound. Assuming no algorithmic improvements, and assuming Moore's law continues (a big assumption) then one could expect a collision to be found in about 75 years, which seems fine for your applications. Quantum attacks also do not affect collision resistance.

However, the fact that NIST put out SHA-3 may suggest that there are non-public attacks against SHA-256 that are better than the birthday attack. This is just speculation. We have no information about this.

JustinDrake added a commit that referenced this issue Mar 15, 2019
SHA256 is the de facto blockchain standard. Standardisation of the hash function is a prerequisite for [full standardisation of BLS12-381 signatures](#605). Blockchain projects are likely to provide a cheap SHA256 opcode/precompile, and unlikely to provide a Keccak256 equivalent. (Even WASM-enabled blockchains are likely to provide a SHA256 opcode/precompile since WASM does *not* natively support optimised SHA256 CPU instructions.) With Ethereum 2.0 embracing SHA256 the wider industry is more likely to converge towards a unified cross-blockchain communication scheme via Merkle receipts.

There are no security blockers with SHA256 (see comments by Dan Boneh [here](#612 (comment))).
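
As a rough illustration of what checking a "Merkle receipt" involves, here is a minimal sketch of a SHA256 Merkle branch check over a power-of-two number of 32-byte leaves. The tree layout, leaf format and function names are assumptions made for this example, not the Eth2 spec's definitions.

```python
import hashlib

def hash_pair(left: bytes, right: bytes) -> bytes:
    """Parent node: SHA256 of the two children concatenated."""
    return hashlib.sha256(left + right).digest()

def next_layer(layer: list) -> list:
    return [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]

def merkle_root(leaves: list) -> bytes:
    """Root of a binary tree over a power-of-two number of leaves."""
    layer = list(leaves)
    while len(layer) > 1:
        layer = next_layer(layer)
    return layer[0]

def merkle_branch(leaves: list, index: int) -> list:
    """Sibling hashes from the leaf level up to just below the root."""
    branch, layer = [], list(leaves)
    while len(layer) > 1:
        branch.append(layer[index ^ 1])  # sibling of the current node
        layer = next_layer(layer)
        index //= 2
    return branch

def verify_branch(leaf: bytes, branch: list, index: int, root: bytes) -> bool:
    """Recompute the root from a leaf and its branch; this is all a verifier needs."""
    node = leaf
    for sibling in branch:
        node = hash_pair(node, sibling) if index % 2 == 0 else hash_pair(sibling, node)
        index //= 2
    return node == root
```

The verifier only ever calls the hash function on 64-byte inputs, which is why a cheap SHA256 opcode or precompile on the verifying chain is what matters for this kind of cross-chain check.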