Add scalar/Neon/Neon hybrid for Keccak-x4 #179
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
I haven't yet amended I'd prefer to do this in a follow-up, but could be convinced otherwise.
The scope of MLKEM_USE_NTT_ASM_CLEAN already went beyond the NTT, and will soon be further expanded to cover the choice of Keccak implementation.
Thanks @hanno-becker!!
It's getting rather hard to keep track of which implementation is used for each target. #141 should be addressed soon.
Code looks mostly good to me - some comments.
Here is the performance of keccak-f1600-x4, in cycles:

| platform | before | after |
|---|---|---|
| A72 | 6196 | 5407 |
| A76 | 3708 | 2607 |
| A55 | 6012 | |
| G2 | 3709 | 2608 |
| G3 | 2078 | 1604 |
That looks great except for the A55 - there, you just want to use a scalar 1x implementation throughout. That's in line with the paper.
We'll need a special case for the A55 I guess. Probably better to do that as a follow-up PR.
Thanks for introducing the profile for the special case A55.
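For readers following along, here is a minimal sketch of what "scalar 1x throughout" could look like behind an x4 batch interface: the four states are simply permuted one after another. All names (`keccak_f1600_x1`, `keccak_f1600_x4_scalar`) are hypothetical and not taken from this PR, and a portable C permutation stands in for the scalar assembly that a real A55 profile would presumably call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, n) (((x) << (n)) | ((x) >> (64 - (n))))

/* Keccak-f[1600] round constants (iota step). */
static const uint64_t RC[24] = {
    0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808aULL,
    0x8000000080008000ULL, 0x000000000000808bULL, 0x0000000080000001ULL,
    0x8000000080008081ULL, 0x8000000000008009ULL, 0x000000000000008aULL,
    0x0000000000000088ULL, 0x0000000080008009ULL, 0x000000008000000aULL,
    0x000000008000808bULL, 0x800000000000008bULL, 0x8000000000008089ULL,
    0x8000000000008003ULL, 0x8000000000008002ULL, 0x8000000000000080ULL,
    0x000000000000800aULL, 0x800000008000000aULL, 0x8000000080008081ULL,
    0x8000000000008080ULL, 0x0000000080000001ULL, 0x8000000080008008ULL};

/* Rotation offsets (rho) and lane permutation (pi), in traversal order. */
static const int RHO[24] = {1,  3,  6,  10, 15, 21, 28, 36, 45, 55, 2,  14,
                            27, 41, 56, 8,  25, 43, 62, 18, 39, 61, 20, 44};
static const int PI[24] = {10, 7,  11, 17, 18, 3, 5,  16, 8,  21, 24, 4,
                           15, 23, 19, 13, 12, 2, 20, 14, 22, 9,  6,  1};

/* Hypothetical 1x permutation; stand-in for a scalar assembly routine. */
static void keccak_f1600_x1(uint64_t st[25])
{
    uint64_t bc[5], t;
    for (int r = 0; r < 24; r++) {
        /* theta: mix column parities into every lane */
        for (int i = 0; i < 5; i++)
            bc[i] = st[i] ^ st[i + 5] ^ st[i + 10] ^ st[i + 15] ^ st[i + 20];
        for (int i = 0; i < 5; i++) {
            t = bc[(i + 4) % 5] ^ ROTL64(bc[(i + 1) % 5], 1);
            for (int j = 0; j < 25; j += 5)
                st[j + i] ^= t;
        }
        /* rho and pi: rotate each lane and move it to its new position */
        t = st[1];
        for (int i = 0; i < 24; i++) {
            int j = PI[i];
            uint64_t tmp = st[j];
            st[j] = ROTL64(t, RHO[i]);
            t = tmp;
        }
        /* chi: non-linear step, applied row by row */
        for (int j = 0; j < 25; j += 5) {
            for (int i = 0; i < 5; i++)
                bc[i] = st[j + i];
            for (int i = 0; i < 5; i++)
                st[j + i] ^= (~bc[(i + 1) % 5]) & bc[(i + 2) % 5];
        }
        /* iota: add the round constant */
        st[0] ^= RC[r];
    }
}

/* Hypothetical x4 entry point for a core where batching does not pay off:
 * the four states are independent, so four sequential 1x calls suffice. */
static void keccak_f1600_x4_scalar(uint64_t st[4][25])
{
    assert(st != NULL);
    for (int k = 0; k < 4; k++)
        keccak_f1600_x1(st[k]);
}
```

Called on four all-zero states, this should leave 0xF1258F7940E1DDE7 in the first lane of each state, which is the well-known result of permuting the zero state. The point of the sketch is only the shape of the fallback; it says nothing about how the profile in this PR is actually wired up.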
Thanks! Now A55 performance looks good again, too.
This PR adds hybrid scalar/Neon/Neon implementations of Keccak-x4. Those implementations were first described in https://eprint.iacr.org/2022/1243.
We don't consume the implementations of that paper as-is, however, but auto-generate the code from clean de-interleaved assembly. As it stands, only the interleaved versions are added here.