blake256: Add _asm note about AVX2 attempts.
davecgh committed Jul 15, 2024
1 parent afedde2 commit 986d6e7
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions crypto/blake256/internal/_asm/gen_compress_asm_amd64.go
@@ -1090,6 +1090,16 @@ func blocksAVX() {
}

func main() {
// NOTE: Various attempts were made to optimize using the larger 256-bit
// registers provided by AVX2, but since only 4 columns can be computed in
// parallel, the extra overhead of shuffling data around offsets any gains
// in the few places where the larger registers do provide a speedup. The
// attempts included converting the message to big endian using 2x256-bit
// registers, as well as freeing up registers by packing more data into
// the larger registers and then using the freed registers to cache the
// results of xoring the message and constants for reuse in the final
// rounds where they are the same.

build.ConstraintExpr("!purego")
globalData()
blocksSSE2()
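The note mentions caching the results of xoring the message and constants for reuse in the final rounds. The commit does not include the attempted AVX2 code, so the following is only a scalar Go sketch of that idea, not the generator's implementation. It assumes the standard BLAKE-256 layout: 14 rounds, with round r using permutation r mod 10, so rounds 10 through 13 repeat the permutations of rounds 0 through 3. The names `loadWords`, `buildCache`, and `permCount` are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// c holds the 16 BLAKE-256 round constants (leading digits of pi).
var c = [16]uint32{
	0x243F6A88, 0x85A308D3, 0x13198A2E, 0x03707344,
	0xA4093822, 0x299F31D0, 0x082EFA98, 0xEC4E6C89,
	0x452821E6, 0x38D01377, 0xBE5466CF, 0x34E90C6C,
	0xC0AC29B7, 0xC97C50DD, 0x3F84D5B5, 0xB5470917,
}

// sigma lists only the first four of the ten BLAKE permutations, since
// those are the only ones that are used twice in a 14-round compression.
var sigma = [4][16]uint8{
	{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15},
	{14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3},
	{11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4},
	{7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8},
}

// permCount is the number of distinct permutations (hypothetical name).
const permCount = 10

// loadWords converts a 64-byte block into 16 big-endian 32-bit words,
// which is the scalar equivalent of the big endian conversion step.
func loadWords(block *[64]byte) [16]uint32 {
	var m [16]uint32
	for i := range m {
		m[i] = binary.BigEndian.Uint32(block[i*4:])
	}
	return m
}

// buildCache precomputes m[sigma[r][j]] ^ c[sigma[r][j^1]] for the first
// four permutations. Each G call consumes two adjacent entries of a row.
func buildCache(m *[16]uint32) [4][16]uint32 {
	var cache [4][16]uint32
	for r := range cache {
		for j := range cache[r] {
			cache[r][j] = m[sigma[r][j]] ^ c[sigma[r][j^1]]
		}
	}
	return cache
}

func main() {
	var block [64]byte
	for i := range block {
		block[i] = byte(i)
	}
	m := loadWords(&block)
	cache := buildCache(&m)

	// Rounds 10..13 map back onto rows 0..3, so their xor results are
	// already available from the first four rounds.
	for round := 10; round < 14; round++ {
		row := round % permCount
		fmt.Printf("round %d reuses cached row %d (first value %08x)\n",
			round, row, cache[row][0])
	}
}
```

In a scalar implementation this cache costs 64 extra words of storage per block. The note's point is that in the assembly version, keeping such values in registers requires packing and shuffling that ends up costing as much as the xors it saves.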
