-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import ML-KEM from mlkem-native/PQ code package #2041
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
7f66f23
to
274d30c
Compare
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
959c697
to
1eebf31
Compare
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, @bhess. I surely didn't check all 540 files but focused on the integration logic: Please see the single comments. In general, the patch is way too large in my opinion: Isn't it possible that the upstream uses fewer hard-coded include paths and also provides a YML documentation of their implementation? "copy_from_upstream" ideally should be easy to run to regularly follow the upstream without the need to always create new patches: the latter only creates unnecessary work for OQS and consequently reduces the motivation for keeping the code up-to-date. Of course, if there is no further development expected in PQCP (is it?) this point is moot.
Copy-from-upstream option to preserve folder stucture Smaller patch: no include paths fixing & meta-ymls available upstream Documenting ct-passes file Update dependencies for CBOM [full tests] [extended tests] Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
cb07260
to
d199bb3
Compare
xof_x4_ctx statex; | ||
unsigned int buflen; | ||
|
||
+ shake128x4_inc_init(&statex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm -- is the upstream API for shake so much different that it doesn't need initialization? Or is this a real functional patch that should be upstreamed? Anyway, thanks for streamlining the patch overall @bhess!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the upstream API for absorb
is one-shot and includes initialization.
But still, @bhess could this be addressed in the fips202x4.h
glue code instead? That might be a more natural place to bridge between one-shot and incremental API than a patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hanno-becker - I'll review that part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But still, @bhess could this be addressed in the
fips202x4.h
glue code instead? That might be a more natural place to bridge between one-shot and incremental API than a patch.
Our absorb function is also one-shot in the sense that it resets the context, performs absorb, and finalizes the context. However, we may want to reuse the context for another operation later.
The difference in our implementation compared to yours is that we allocate the structure on the heap. In this case, we explicitly perform this allocation once, before running absorb and squeeze.
Would it be possible to add this initialization step to the upstream code as well? For your implementation, it would effectively be a nop, while for us, it could handle the allocation without requiring a patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bhess Can you elaborate a bit further why this cannot be done in the glue code? It is expected that the FIPS202 API used by mlkem-native may not exactly match the one of a consuming application bringing its own FIPS202, but in that case, the expectation is that fips202[x4].h
contains shims mapping the (more specialized) mlkem-native calls to the (likely more general) underlying FIPS202 implementation.
We added examples/bring_your_own_fips202 which exemplifies this in the example of tiny_sha3. tiny_sha3's API is also not exactly what's needed for mlkem-native, and there's glue code mapping between them.
I'm not in principle opposed to making further changes, but would like to understand better why the API-difference cannot be handled like this in your case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SWilson4 Thank you for chiming in! How would a failure be raised, though? One would need to bubble it up the call chain to the top-level API, and change all function signatures along the way. For uniformity, one would also need to change all other places where hashing is used. Is that what you had in mind? I'm hesitant about such change on first thought, but will chat with Matthias. This is not specific to and not a requirement for this PR though, right? Is there already an issue / feature request?
@bhess As I understand, the only issue here is that at present a single shim is used for multiple implementations? I didn't expect that. WIth a shim merely for mlkem-native, you could just allocate in the shim for shake128_absorb_once
function. This would not affect the ability to do repeated absorb+squeeze
in other contexts. Could you confirm that understanding?
That said, we already have an explicit xxx_release()
function anyway, so it only makes the API more symmetric to also offer an init
. I opened pq-code-package/mlkem-native#686. Update: This is now merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only issue here is that at present a single shim is used for multiple implementations? I didn't expect that.
Can I please ask you to do this going forward then, @hanno-becker ? IMO this is exactly what OQS is and does at its core: A mechanism to bundle together all algorithms in the same way, both "from above" (for others to use) as well as "from below" (providing the same common code for all algorithms). Changing that (doing specific per-algorithm changes) voids the purpose of OQS or at least, renders it unmaintainable as the small team cannot possibly be specialists in all constituent algorithms "below" as well as support all kinds of user wishes "from above". Or phrased differently: It is absolutely clear that algorithm-specific integrations are much easier to maintain and more performant (as is being documented publicly right now by openssl
and less publicly by the many proprietary PQC implementations springing up), but that's not what OQS is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, we already have an explicit
xxx_release()
function anyway, so it only makes the API more symmetric to also offer aninit
. I opened pq-code-package/mlkem-native#686. Update: This is now merged.
Great, thank you very much!
As an add-on to this, it would be really nice if there were an option in mlkem-native to error-check SHA3 / RNG code. Our strategy in liboqs right now is to do a hard exit on a malloc failure, and it would be great if we could instead check a return value and exit gracefully.
Fully agree with that catching these errors gracefully would be desireable ! However, I believe error handling with randombytes would be a concern in the future (#1750), as our current API doesn’t support this.
One example is the MAYO implementation, which supports a compile flag, HAVE_RANDOMBYTES_NORETVAL
. This flag allows switching between randombytes implementations that either return a code or don’t. Perhaps adopting a similar approach could be beneficial for mlkem-native as well.
liboqs/src/sig/mayo/pqmayo_mayo-5_opt/mayo.c
Lines 320 to 326 in 64bceb3
#if defined(PQM4) || defined(HAVE_RANDOMBYTES_NORETVAL) | |
randombytes(tmp + param_digest_bytes, param_salt_bytes); | |
#else | |
if (randombytes(tmp + param_digest_bytes, param_salt_bytes) != MAYO_OK) { | |
ret = MAYO_ERR; | |
goto err; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve updated the PR to incorporate changes from upstream (pq-code-package/mlkem-native#686).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SWilson4 Thank you for chiming in! How would a failure be raised, though? One would need to bubble it up the call chain to the top-level API, and change all function signatures along the way. For uniformity, one would also need to change all other places where hashing is used. Is that what you had in mind? I'm hesitant about such change on first thought, but will chat with Matthias. This is not specific to and not a requirement for this PR though, right? Is there already an issue / feature request?
Nope, not a requirement for this PR, but IMO it would be a major improvement for liboqs. A couple of related issues in the backlog are #1456 and #1750.
Thanks for the review @baentsch. The patch size is now much reduced, basically only to adapt a few things to be able to use our fips202/sha3 implementation. For the upstream implementation it seems not straight-forward to move away from relative import paths. However, this is no longer an issue because I’ve added an option to copy_from_upstream that preserves the upstream folder structure. As a result, no further patching is required. |
Comments addressed. Discussion ongoing. Don't want to hinder other approvals moving things forward.
…ended tests] Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't attempted to review the code imported from PQCP (and I wouldn't have the expertise to do so anyhow), but the integration-related code looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's time to rename this file to "upstream_shims" or something similar to reflect the fact that it's no longer exclusive to PQClean?
The following measurements are on an Intel Xeon Gold 6338 CPU @ 2.00GHz, Turbo Boost turned off for consistent results:
1.1. Old implementation from main
1.2 mlkem-native implementation
-> We see a nice speedup in the generic code
2.1 Old implementation from main:
2.2 mlkem-native implementation
-> The key generation performance is very similar, but there's some performance degradation in encapsulation/decapsulation. This can likely be attributed to the additional key checks implemented in mlkem-native to meet FIPS203 requirements, which are more noticeable in the otherwise optimized code. Feedback from @mkannwischer would be appreciated to confirm if this aligns with your expectations. |
Thanks for the benchmarks. |
This PR tracks the integration of ML-KEM from the mlkem-native upstream repository.
It replaces the current ML-KEM implementation in liboqs, which was previously imported from pq-crystals, with the mlkem-native implementation from PQCP.
Some features of mlkem-native:
The upstream code recently had a v1.0.0-alpha release and is actively maintained. The goal is to synchronize the PR with an upcoming tagged release of mlkem-native.
Additionally, the upstream code includes enhanced key validation as defined by FIPS 203 by default, which resolves issue #1951.
Closes #1951.
TODOs: