@elementrics
Contributor

To test the performance difference on arm64 chips, run: "go test -benchmem -run=^$ ./sign/internal/dilithium -bench=Le16"

Average results on my machine (Apple M1 Max):

BenchmarkPackLe16-10            69454038                17.62 ns/op            0 B/op          0 allocs/op
BenchmarkPackLe16Generic-10     18178948                66.44 ns/op            0 B/op          0 allocs/op

Also keep in mind that these are microbenchmarks!
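For context, a benchmark of this shape can be sketched roughly as follows. This is only an illustration, not the code from the PR: `packGeneric`, `benchPack` and the coefficient count `n = 256` are assumed names and values, and `testing.Benchmark` is used so that the sketch runs as a standalone program rather than through "go test".

```go
package main

import (
	"fmt"
	"testing"
)

// n is an assumed coefficient count; the thread itself does not state it.
const n = 256

// packGeneric is a hypothetical stand-in for the generic packing routine:
// it stores two 4-bit coefficients per output byte, low nibble first.
func packGeneric(coeffs *[n]uint32, buf []byte) {
	for i := 0; i < n/2; i++ {
		buf[i] = byte(coeffs[2*i]) | byte(coeffs[2*i+1]<<4)
	}
}

// benchPack times packGeneric and reports allocations, which is what
// produces the ns/op, B/op and allocs/op columns.
func benchPack(b *testing.B) {
	var coeffs [n]uint32
	buf := make([]byte, n/2)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		packGeneric(&coeffs, buf)
	}
}

func main() {
	testing.Init() // registers testing flags when not run via "go test"
	res := testing.Benchmark(benchPack)
	fmt.Println(res.String(), res.MemString())
}
```

A loop this small is sensitive to inlining, alignment and CPU frequency scaling, so the absolute numbers matter less than the ratio between the two variants.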

Member

@bwesterb left a comment

Some small nits.

// length N/2.
func (p *Poly) PackLe16(buf []byte) {
p.packLe16Generic(buf)
// early bounds so we don't have to in assembly code
Member

I don't mind the check that much, but I dislike that you write that we have to check it. I think it's optional. Most internal functions have a bunch of prerequisites, which aren't always easy to check. One prerequisite you don't check here is that the coefficients of p are less than 16. That's fine: inspecting the call sites, we see that it always holds. The same goes for the length of the buffer passed.
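To make the distinction concrete, here is a minimal sketch of a packing routine with one prerequisite checked early and another left to the callers. It is not the code under review: the type `poly`, the method `packLe16` and `n = 256` are hypothetical stand-ins, and the low-nibble-first byte layout is an assumption.

```go
package main

import "fmt"

// n is an assumed coefficient count; the thread does not state it.
const n = 256

// poly is a hypothetical stand-in for the Poly type under review.
type poly [n]uint32

// packLe16 packs the coefficients of p, two per byte, into buf.
//
// Prerequisites:
//   - len(buf) >= n/2 (checked early below; the check is optional)
//   - every coefficient is less than 16 (not checked; callers guarantee it)
func (p *poly) packLe16(buf []byte) {
	// Optional early bounds check: panics up front if buf is too short,
	// so the loop (or an assembly routine) need not check per element.
	_ = buf[n/2-1]

	for i := 0; i < n/2; i++ {
		buf[i] = byte(p[2*i]) | byte(p[2*i+1])<<4
	}
}

func main() {
	var p poly
	for i := range p {
		p[i] = uint32(i % 16)
	}
	buf := make([]byte, n/2)
	p.packLe16(buf)
	fmt.Printf("% x\n", buf[:4]) // prints "10 32 54 76"
}
```

The coefficient range is deliberately left unchecked here: verifying it would cost a pass over all n coefficients, whereas the buffer length costs a single comparison.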

@elementrics
Contributor Author

One more thing I wanted to raise. I come from C and similar languages, where the ABI defines which registers are caller-saved and which are callee-saved.

Regarding Go, I am not 100% sure what must be guaranteed. I've read somewhere that Go has no callee-saved registers. Is that still true, or are there caveats?

My personal projects have always relied on this assumption, and I have not run into any issues.

@bwesterb
Member

Regarding Go, I am not 100% sure what must be guaranteed. I've read somewhere that Go has no callee-saved registers. Is that still true, or are there caveats?

There are caveats. They're documented here.

@bwesterb merged commit 12bafce into cloudflare:main Aug 15, 2025
11 checks passed
@elementrics deleted the polyPackLe16 branch August 16, 2025 04:57