Compile time SHA256 override? #702
Comments
Last time we discussed this on IRC some people said that providing 3 functions might be a bit much and having a single sha256 function instead is better. What are your thoughts on this?
I think you'll want this feature only if you seriously care about performance. And in this case, you probably have the more complex API with 3 functions anyway.
Also, in practice a single function is often harder to use - if you want to hash a series of things it's nice to just throw them all into a hash engine ... if you need to create a buffer first then (depending on your language) you may need to think about buffer allocation and indexing. Another alternative might be to just conditionally compile out the sha256 functions, and require the user to provide replacements with exactly the correct names in order for linking to succeed. This is what we do for the default error callbacks, for example.
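A minimal sketch of the "compile out and let the linker find replacements" idea, together with the hash-engine style of use. All names here (`demo_hash`, `DEMO_EXTERNAL_HASH`, and so on) are hypothetical and not part of the library, and a 32-bit FNV-1a checksum stands in for SHA256 to keep the example short:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context type; a 32-bit FNV-1a checksum stands in for SHA256. */
typedef struct { uint32_t acc; } demo_hash;

/* The library declares the three entry points unconditionally... */
void demo_hash_initialize(demo_hash *h);
void demo_hash_write(demo_hash *h, const unsigned char *data, size_t len);
void demo_hash_finalize(demo_hash *h, unsigned char out[4]);

/* ...but only defines them when the build does not ask for external ones.
 * With -DDEMO_EXTERNAL_HASH the user must supply definitions with exactly
 * these names, otherwise linking fails. */
#ifndef DEMO_EXTERNAL_HASH
void demo_hash_initialize(demo_hash *h) { h->acc = 2166136261u; }

void demo_hash_write(demo_hash *h, const unsigned char *data, size_t len) {
    size_t i;
    for (i = 0; i < len; i++) h->acc = (h->acc ^ data[i]) * 16777619u;
}

void demo_hash_finalize(demo_hash *h, unsigned char out[4]) {
    int i;
    /* Serialize the accumulator big-endian. */
    for (i = 0; i < 4; i++) out[i] = (unsigned char)(h->acc >> (24 - 8 * i));
}
#endif

/* Library-side caller: hashes two separate inputs without first
 * concatenating them into a temporary buffer. */
void demo_hash_pair(const unsigned char *a, size_t alen,
                    const unsigned char *b, size_t blen, unsigned char out[4]) {
    demo_hash h;
    demo_hash_initialize(&h);
    demo_hash_write(&h, a, alen);
    demo_hash_write(&h, b, blen);
    demo_hash_finalize(&h, out);
}
```

Note that in this sketch an external replacement would have to agree with the library on the layout of the context struct, which is likely the awkward part of this approach.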
I think the vast majority of users would be ones that don't want to have an extra 19 kilobytes in their binary.
That would be ducky too.
Unless anyone already works on this I'd like to give it a try :)
Could you look into this?
IIRC the problem I encountered was the SHA2 context.
Any feedback on which is best or maybe a different solution is appreciated.
One possibility is keeping the SHA256 padding/chopping into blocks on the secp256k1 side, and only require external implementations to provide a SHA256 transformation function. The API would look like:

    /** Update the state (array of 8 uint32_t) pointed to by state with 64*blocks bytes of input pointed to by data. */
    void sha256_transform(uint32_t *state, size_t blocks, unsigned char *data);

This means the external implementation may need to copy data from/to "array of 8 uint32_t" to their own internal representation - but I suspect almost any C-like implementation already uses that representation anyway.
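As a rough sketch of how that split could look, the code below keeps buffering and padding in hypothetical library-side functions (`demo_sha256_init`, `demo_sha256_write`, `demo_sha256_finalize`, all names invented here) and calls `sha256_transform` for whole blocks. A plain portable transform is included only as a stand-in for the external implementation, and `const` on `data` is my own addition to the proposed signature:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint32_t K[64] = {
    0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
    0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
    0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
    0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
    0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
    0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
    0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
    0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2};
#define ROTR(x, n) (((x) >> (n)) | ((x) << (32 - (n))))

/* Stand-in for the externally provided function: compresses `blocks`
 * consecutive 64-byte blocks from `data` into `state`. */
void sha256_transform(uint32_t *state, size_t blocks, const unsigned char *data) {
    while (blocks--) {
        uint32_t w[64], s[8], t1, t2;
        int i;
        for (i = 0; i < 16; i++)
            w[i] = ((uint32_t)data[4*i] << 24) | ((uint32_t)data[4*i+1] << 16) |
                   ((uint32_t)data[4*i+2] << 8) | (uint32_t)data[4*i+3];
        for (i = 16; i < 64; i++)
            w[i] = w[i-16] + w[i-7]
                 + (ROTR(w[i-15], 7) ^ ROTR(w[i-15], 18) ^ (w[i-15] >> 3))
                 + (ROTR(w[i-2], 17) ^ ROTR(w[i-2], 19) ^ (w[i-2] >> 10));
        memcpy(s, state, sizeof(s));
        for (i = 0; i < 64; i++) {
            t1 = s[7] + (ROTR(s[4], 6) ^ ROTR(s[4], 11) ^ ROTR(s[4], 25))
               + ((s[4] & s[5]) ^ (~s[4] & s[6])) + K[i] + w[i];
            t2 = (ROTR(s[0], 2) ^ ROTR(s[0], 13) ^ ROTR(s[0], 22))
               + ((s[0] & s[1]) ^ (s[0] & s[2]) ^ (s[1] & s[2]));
            memmove(s + 1, s, 7 * sizeof(s[0]));  /* shift a..g down to b..h */
            s[4] += t1;
            s[0] = t1 + t2;
        }
        for (i = 0; i < 8; i++) state[i] += s[i];
        data += 64;
    }
}

/* Library side (hypothetical names): buffering and padding only. */
typedef struct { uint32_t state[8]; unsigned char buf[64]; uint64_t bytes; } demo_sha256;

static void demo_sha256_init(demo_sha256 *h) {
    static const uint32_t iv[8] = {0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                                   0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19};
    memcpy(h->state, iv, sizeof(iv));
    h->bytes = 0;
}

static void demo_sha256_write(demo_sha256 *h, const unsigned char *data, size_t len) {
    size_t fill = (size_t)(h->bytes & 63);
    if (len == 0) return;
    h->bytes += len;
    if (fill > 0) {                      /* top up a partially filled buffer */
        size_t take = 64 - fill;
        if (take > len) take = len;
        memcpy(h->buf + fill, data, take);
        data += take;
        len -= take;
        if (fill + take < 64) return;
        sha256_transform(h->state, 1, h->buf);
    }
    if (len >= 64) {                     /* hand all whole blocks over in one call */
        sha256_transform(h->state, len / 64, data);
        data += len - (len & 63);
        len &= 63;
    }
    if (len > 0) memcpy(h->buf, data, len);
}

static void demo_sha256_finalize(demo_sha256 *h, unsigned char out[32]) {
    static const unsigned char pad[64] = {0x80};
    unsigned char lenbuf[8];
    uint64_t bits = h->bytes << 3;
    int i;
    for (i = 0; i < 8; i++) lenbuf[i] = (unsigned char)(bits >> (56 - 8 * i));
    /* Pad with 0x80 then zeros until the length is 56 mod 64. */
    demo_sha256_write(h, pad, 1 + ((119 - (size_t)(h->bytes & 63)) & 63));
    demo_sha256_write(h, lenbuf, 8);
    for (i = 0; i < 32; i++) out[i] = (unsigned char)(h->state[i / 4] >> (24 - 8 * (i % 4)));
}
```

In this sketch the `blocks` parameter lets the write path pass every whole block of a large input in a single call, so the external function is free to loop internally or process several blocks at once.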
That's an interesting idea. And is there any real advantage with passing a bunch of blocks at the same time? Won't it just call the transform function in a loop? Or is there some SIMD/SHA-NI magic this enables?
I played with this, and it works pretty nicely, but then I realized that one of the reasons was binary size, so I measured by counting asm lines in godbolt (I'm not sure if it's a good way to compare). Anyone who wants to play with it: https://godbolt.org/z/MPF3w9
For binary size you can really just look at the file size of the binary, see #700 (comment).
If at all, I think we would want the opposite. We'll need SHA256 for Schnorr sigs and ECDH, but HMAC/RFC6979 is only used for deriving nonces, and you can use SHA256, too.
I just had to expose some end result through godbolt to get asm. I used RFC6979 because it doesn't have any complex logic and it uses SHA256 in a complex way that prevents inlining everything and optimizing write together with finalize and initialize (so it's an example to show the asm code of SHA2, not of RFC6979).
This really depends on whether we have a compile-time override or a runtime override, but we should think about a self-test in the spirit of https://github.com/bitcoin/bitcoin/blob/99813a9745fe10a58bedd7a4cb721faf14f907a4/src/crypto/sha256.cpp#L465.
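A minimal sketch of what such a self-test could look like, assuming a hypothetical one-shot function-pointer type `demo_sha256_fn` for whatever implementation was plugged in. It checks only the standard `SHA256("abc")` test vector, and a deliberately broken stub is included to show the failure path:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical one-shot signature for the plugged-in implementation. */
typedef void (*demo_sha256_fn)(unsigned char out32[32],
                               const unsigned char *data, size_t len);

/* Returns 1 if fn reproduces the standard test vector SHA256("abc"), else 0. */
static int demo_sha256_selftest(demo_sha256_fn fn) {
    static const unsigned char expect[32] = {
        0xba,0x78,0x16,0xbf,0x8f,0x01,0xcf,0xea,0x41,0x41,0x40,0xde,0x5d,0xae,0x22,0x23,
        0xb0,0x03,0x61,0xa3,0x96,0x17,0x7a,0x9c,0xb4,0x10,0xff,0x61,0xf2,0x00,0x15,0xad};
    unsigned char out[32];
    memset(out, 0, sizeof(out));
    fn(out, (const unsigned char *)"abc", 3);
    return memcmp(out, expect, sizeof(expect)) == 0;
}

/* Deliberately wrong "implementation", used only to exercise the failure path. */
static void demo_broken_sha256(unsigned char out32[32],
                               const unsigned char *data, size_t len) {
    (void)data;
    (void)len;
    memset(out32, 0xff, 32);
}
```

With a runtime override, a check like this could run when the override is installed. With a compile-time override it would more naturally live in the test suite or run once at context creation.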
I think the main advantage is actually code size, because my tests don't show big advantages for optimized SHA2 implementations. I haven't tested with SHA-NI, but I don't think it's going to be a huge improvement in terms of performance, as SHA2 is really fast compared to anything EC related.
I'm making an embedded library containing libsecp256k1, see https://github.com/switck/libngu. Thoughts:
RE: sharing SHA256 code
@elichai Any update here? Maybe, if you currently don't have a lot of time, it would still be good to share your WIP branch, so someone else could adopt the PR.
Sadly I can't find it :(
Do you know if implementations would want to keep the state as a
Going a step back, I believe that these are separate concerns. As per the discussion here, an override helps mostly to save space. If we want to profit from hardware optimizations, this is somewhat orthogonal. An override would help here but only if the caller controls the compilation. This is true for Bitcoin Core but in general it's rather an exception. Also, if you look at Core's SHA256 implementation (https://github.com/bitcoin/bitcoin/blob/master/src/crypto/sha256.cpp), it's not clear what the override would look like. Depending on the available hardware, the essential function is either

This is an argument in favor of optimizing our SHA256 implementation (independent of the possibility to override), and I tend to believe that this is desirable. We didn't want to do this in the past because SHA256 was an implementation detail. But as it's an integral part of Schnorr verification and signing now, things have changed.

It's just somewhat messy: We'd need to duplicate code from Core (and worse, change it to C). Or move SHA256 (and maybe other stuff such as ChaCha20 for #760) to a separate library that's linked to secp256k1 and Core. Or expose the SHA256 functions here and make Core use them. None of this sounds great. Thoughts?
I'd very much like being able to plug in a different sha256 implementation to save binary space.
It's a bit silly that the library packs in its own sha256 even when the user has their own, and on small devices it's a costly waste of space. In places like Bitcoin Core the environment has a fast SIMD or SHA-NI sha256, ... it actually makes a measurable difference in signing time to use SHA-NI.
It would be nice if there were some -Dlibsecp256k1_sha256init=x -Dlibsecp256k1_sha256update=y -Dlibsecp256k1_sha256final=z settings that could be used to compile time substitute in another library and leave the internal one out.
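One way such `-D` settings could be consumed is sketched below. The macro names (`DEMO_SHA256_INIT` and friends) are hypothetical, a byte-counting stub stands in for the built-in SHA256, and the extra `DEMO_SHA256_CTX` setting for the context type is my assumption beyond the three settings proposed above:

```c
#include <stddef.h>

/* Built-in fallback, compiled only when no override was passed with -D.
 * A byte-counting stub stands in for SHA256; all names are hypothetical. */
#ifndef DEMO_SHA256_INIT
typedef struct { size_t total; } demo_builtin_ctx;

static void demo_builtin_init(demo_builtin_ctx *c) { c->total = 0; }

static void demo_builtin_update(demo_builtin_ctx *c, const unsigned char *d, size_t n) {
    (void)d;
    c->total += n;
}

static void demo_builtin_final(demo_builtin_ctx *c, unsigned char *out) {
    out[0] = (unsigned char)(c->total & 0xff);
}

#define DEMO_SHA256_CTX    demo_builtin_ctx
#define DEMO_SHA256_INIT   demo_builtin_init
#define DEMO_SHA256_UPDATE demo_builtin_update
#define DEMO_SHA256_FINAL  demo_builtin_final
#endif

/* Library code refers only to the macro names, so a build that defines all
 * four macros never compiles the built-in functions above. */
static void demo_hash_message(const unsigned char *msg, size_t len, unsigned char *out) {
    DEMO_SHA256_CTX ctx;
    DEMO_SHA256_INIT(&ctx);
    DEMO_SHA256_UPDATE(&ctx, msg, len);
    DEMO_SHA256_FINAL(&ctx, out);
}
```

In this sketch an overriding build would also need to make the external declarations visible before this point, for example through a header passed on the command line.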