argon2: add workaround for big 64-byte aligned allocations? #573
Comments
Small correction: the issue is with allocations using alignment greater than 16 (or, more generally, greater than the maximum alignment the underlying allocator natively supports). In particular, the case we care about is allocating the Argon2 block memory with 64-byte alignment.
Oh, you are right. Fixed. BTW, do we really need the 64-byte alignment in the first place? IIUC this alignment is stricter than SIMD vectors require, and it looks like an optimization that accounts for cache line size.
Yes, (I think) it's more about cache line size and, specifically, preventing false sharing. It gives a ~5% improvement over 16-byte alignment, if I recall correctly. I also tried 128-byte alignment, which in theory makes sense for modern 64-bit architectures (including x86-64) and is the value adopted by most general solutions to false sharing (e.g. …).

It probably also matters that in Argon2 not every block can be falsely shared, only blocks on the boundaries of the slices, so false sharing here isn't as much of an issue as it can be in other cases.
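For context, the alignment under discussion amounts to pinning each 1 KiB Argon2 block to a cache-line boundary. A minimal sketch of such a block type (hypothetical definition, not necessarily the crate's actual `Block`):

```rust
// Sketch of a cache-line-aligned block type of the kind discussed above
// (hypothetical definition; the crate's actual block type may differ).
// `repr(align(64))` forces every 1 KiB Argon2 block to start on a 64-byte
// boundary, so a block never shares a cache line with its neighbours and
// threads working on adjacent blocks cannot falsely share one.
#[repr(align(64))]
struct Block([u64; 128]); // 1024-byte Argon2 block

fn main() {
    assert_eq!(core::mem::align_of::<Block>(), 64);
    assert_eq!(core::mem::size_of::<Block>(), 1024);
}
```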
Have you tried to directly …?
No, I haven't.
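For reference, "directly allocating `len` bytes with 64-byte alignment" can be expressed by encoding the alignment in the `Layout` handed to the global allocator. A minimal sketch, assuming `std::alloc` and a hypothetical helper name (not the crate's API):

```rust
use std::alloc::{alloc_zeroed, Layout};

/// Hypothetical helper: request `len` bytes that come back 64-byte aligned
/// by encoding the alignment in the `Layout` given to the global allocator.
fn alloc_aligned_64(len: usize) -> Option<(*mut u8, Layout)> {
    if len == 0 {
        return None; // the global allocator requires a non-zero-sized layout
    }
    let layout = Layout::from_size_align(len, 64).ok()?;
    // SAFETY: `layout` has non-zero size (checked above).
    let ptr = unsafe { alloc_zeroed(layout) };
    if ptr.is_null() {
        None
    } else {
        // The caller must eventually free the memory with
        // `std::alloc::dealloc(ptr, layout)`, passing this exact layout.
        Some((ptr, layout))
    }
}
```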
It was previously discussed in #566.
The claim is that for big allocations we can get higher performance by allocating `len + 64` bytes with 1-byte alignment (we then can manually construct a 64-byte aligned region of size `len` in the allocated memory) than by directly allocating `len` bytes with 64-byte alignment.

cc @jonasmalacofilho
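A minimal sketch of that workaround, with hypothetical names and not the crate's implementation: over-allocate `len + 64` bytes with 1-byte alignment, then round the returned pointer up to the next 64-byte boundary.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical wrapper around the over-allocation trick described above.
struct AlignedRegion {
    base: *mut u8,    // pointer actually returned by the allocator
    layout: Layout,   // layout of the over-sized (len + 64) allocation
    aligned: *mut u8, // 64-byte-aligned pointer to a region of size `len`
}

impl AlignedRegion {
    fn new(len: usize) -> Option<Self> {
        // Ask for `len + 64` bytes with only 1-byte alignment; the 64 extra
        // bytes guarantee a 64-byte boundary exists inside the allocation.
        let layout = Layout::from_size_align(len.checked_add(64)?, 1).ok()?;
        let base = unsafe { alloc(layout) };
        if base.is_null() {
            return None;
        }
        // Round `base` up to the next multiple of 64. The offset is in 0..64,
        // so `aligned + len` stays within the `len + 64` allocation.
        let addr = base as usize;
        let offset = (64 - (addr % 64)) % 64;
        let aligned = unsafe { base.add(offset) };
        Some(Self { base, layout, aligned })
    }

    /// 64-byte-aligned pointer to `len` usable bytes.
    fn as_mut_ptr(&mut self) -> *mut u8 {
        self.aligned
    }
}

impl Drop for AlignedRegion {
    fn drop(&mut self) {
        // Free with the exact layout used for the original allocation.
        unsafe { dealloc(self.base, self.layout) }
    }
}
```

The claimed benefit is that some allocators serve big low-alignment requests faster than big 64-byte-aligned ones; the cost is up to 64 wasted bytes and the manual pointer bookkeeping.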