ARM64 implementation for poly.PackLe16 #563
bwesterb
left a comment
Some small nits.
    // length N/2.
    func (p *Poly) PackLe16(buf []byte) {
    	p.packLe16Generic(buf)
    	// early bounds so we don't have to in assembly code
I don't mind the check that much, but I dislike that you write that we have to check it. I think it's optional. Most internal functions have a bunch of prerequisites, which aren't always easy to check. One prerequisite you don't check here is that the coefficients of p are indeed less than 16. That's fine: inspecting the call sites, we see that it holds. The same goes for the length of the buffer passed.
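To make the prerequisites concrete, here is a minimal sketch of a generic 4-bit packing routine with an optional early bounds check. It is an illustration, not the code in this PR: the `Poly` type, `N = 256`, and the `PolyLe16Size` constant are assumptions for the example, and the bounds-check idiom is just one way to write it.

```go
package main

import "fmt"

// N is the number of coefficients per polynomial (assumed to be 256 here).
const N = 256

// PolyLe16Size is the packed size in bytes: two 4-bit coefficients per byte.
const PolyLe16Size = N / 2

// Poly is a hypothetical stand-in for the package's polynomial type.
type Poly [N]uint32

// packLe16Generic packs p into buf, two coefficients per byte.
// Prerequisites (not checked): every coefficient is less than 16
// and len(buf) >= PolyLe16Size.
func (p *Poly) packLe16Generic(buf []byte) {
	for i := 0; i < PolyLe16Size; i++ {
		// Low nibble holds the even coefficient, high nibble the odd one.
		buf[i] = byte(p[2*i]) | byte(p[2*i+1]<<4)
	}
}

// PackLe16 performs an optional early bounds check and then packs.
// The check panics if buf is too short, before any byte is written.
func (p *Poly) PackLe16(buf []byte) {
	_ = buf[PolyLe16Size-1] // early bounds check (optional)
	p.packLe16Generic(buf)
}

func main() {
	var p Poly
	p[0], p[1] = 3, 10
	buf := make([]byte, PolyLe16Size)
	p.PackLe16(buf)
	fmt.Printf("%#x\n", buf[0]) // prints 0xa3
}
```

Note that the sketch checks the buffer length but not the coefficient range, which mirrors the point above: some prerequisites are cheap to check, others are left to the call sites.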
I wanted to throw in one thing. I'm coming from C, where the ABI specifies which registers are caller-saved and which are callee-saved. For Go, I am not 100% sure what must be guaranteed. I've read somewhere that Go has no callee-saved registers. Is that still true, or are there caveats? My personal projects have always assumed this, and I have never run into issues.
There are caveats. They're documented here.
To test performance difference on arm64 chips: "go test -benchmem -run=^$ ./sign/internal/dilithium -bench=Le16"
On my machine (Apple M1 Max) on average:
Also keep in mind that these are microbenchmarks!