Support SHA* intrinsics on Intel CPU #37

Merged
merged 2 commits into from
Jan 4, 2019


harshavardhana
Member

No description provided.

- optimise: select block function at init
- added dedicated padding function, optimised endian conversion
- add assembly for Intel SHA extensions
- update benchmarks
- streamline checksum function
- cleanup of sha assembly code
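
The first and third commits above (selecting the block function at init, and adding an Intel SHA extensions path) suggest a dispatch pattern along the following lines. This is only a minimal sketch, not the code from this PR: `cpuFeatures`, `detect`, `selectImpl`, and `blockStub` are hypothetical names, the feature flags are hard-coded rather than read via CPUID, and the preference order (SHA, AVX2, AVX, SSSE3, generic) is an assumption.

```go
package main

import "fmt"

// cpuFeatures is a hypothetical stand-in for real CPUID-based detection.
type cpuFeatures struct {
	sha, avx2, avx, ssse3 bool
}

// blockFunc processes whole 64-byte blocks of p and updates the state h.
type blockFunc func(h *[8]uint32, p []byte)

// impl pairs a human-readable name with the block function to call.
type impl struct {
	name  string
	block blockFunc
}

// blockStub is a placeholder; a real build would point at assembly routines.
func blockStub(h *[8]uint32, p []byte) {}

// selectImpl picks one implementation, most specialised first.
// The ordering here is an assumption made for illustration.
func selectImpl(f cpuFeatures) impl {
	switch {
	case f.sha:
		return impl{"SHA", blockStub}
	case f.avx2:
		return impl{"AVX2", blockStub}
	case f.avx:
		return impl{"AVX", blockStub}
	case f.ssse3:
		return impl{"SSSE3", blockStub}
	default:
		return impl{"generic", blockStub}
	}
}

// detect is hypothetical; it reports no features, so the generic path is used.
func detect() cpuFeatures { return cpuFeatures{} }

// active is resolved once during package initialisation, so the hot path
// makes a single indirect call instead of re-checking features per block.
var active = selectImpl(detect())

func main() {
	fmt.Println("selected:", active.name)
}
```

Resolving the choice once at init keeps feature checks out of the per-block loop, which is presumably the point of the "select block function at init" commit.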
@harshavardhana harshavardhana force-pushed the sha256-support branch 3 times, most recently from 49980ad to 771b9fb Compare January 4, 2019 22:28
Contributor

@fwessels fwessels left a comment

LGTM after review

@fwessels fwessels merged commit e529fa1 into minio:master Jan 4, 2019
@AudriusButkevicius

Any idea on performance gains?

@harshavardhana
Member Author

Not yet; we are still looking for a CPU to test on.

@harshavardhana harshavardhana deleted the sha256-support branch January 4, 2019 23:57
@AudriusButkevicius

AudriusButkevicius commented Jan 5, 2019

On my Ryzen 7, with Syncthing's benchmark.
Before:

[2ERMY] 00:06:04 INFO: Single thread SHA256 performance is 501 MB/s using minio/sha256-simd (444 MB/s using crypto/sha256).

After:

[2ERMY] 00:09:50 INFO: Single thread SHA256 performance is 1888 MB/s using minio/sha256-simd (444 MB/s using crypto/sha256).

@harshavardhana
Member Author

harshavardhana commented Jan 5, 2019

Oh wow, thanks. Would you be interested in submitting the go test -bench . numbers?

@AudriusButkevicius

Gimme a sec

@AudriusButkevicius

/cygdrive/c/Gohome/src/github.com/minio/sha256-simd ((HEAD detached at origin/master)) $ go test -bench .
goos: windows
goarch: amd64
pkg: github.com/minio/sha256-simd
BenchmarkHash/SHA_/8Bytes-16            20000000                68.8 ns/op       116.25 MB/s
BenchmarkHash/SHA_/1K-16                 3000000               587 ns/op        1741.83 MB/s
BenchmarkHash/SHA_/8K-16                  300000              4203 ns/op        1948.92 MB/s
BenchmarkHash/SHA_/1M-16                    3000            529852 ns/op        1979.00 MB/s
BenchmarkHash/SHA_/5M-16                     500           2654400 ns/op        1975.17 MB/s
BenchmarkHash/SHA_/10M-16                    300           5317723 ns/op        1971.85 MB/s
BenchmarkHash/AVX2/8Bytes-16            10000000               181 ns/op          43.98 MB/s
BenchmarkHash/AVX2/1K-16                 1000000              2210 ns/op         463.33 MB/s
BenchmarkHash/AVX2/8K-16                  100000             16194 ns/op         505.84 MB/s
BenchmarkHash/AVX2/1M-16                    1000           2038730 ns/op         514.33 MB/s
BenchmarkHash/AVX2/5M-16                     100          10192225 ns/op         514.40 MB/s
BenchmarkHash/AVX2/10M-16                    100          20547260 ns/op         510.32 MB/s
BenchmarkHash/AVX_/8Bytes-16            10000000               202 ns/op          39.47 MB/s
BenchmarkHash/AVX_/1K-16                  500000              2768 ns/op         369.86 MB/s
BenchmarkHash/AVX_/8K-16                  100000             20747 ns/op         394.84 MB/s
BenchmarkHash/AVX_/1M-16                     500           2606417 ns/op         402.31 MB/s
BenchmarkHash/AVX_/5M-16                     100          13104610 ns/op         400.08 MB/s
BenchmarkHash/AVX_/10M-16                     50          26053106 ns/op         402.48 MB/s
BenchmarkHash/SSSE/8Bytes-16            10000000               201 ns/op          39.63 MB/s
BenchmarkHash/SSSE/1K-16                  500000              2778 ns/op         368.52 MB/s
BenchmarkHash/SSSE/8K-16                  100000             20739 ns/op         395.00 MB/s
BenchmarkHash/SSSE/1M-16                     500           2611189 ns/op         401.57 MB/s
BenchmarkHash/SSSE/5M-16                     100          13064013 ns/op         401.32 MB/s
BenchmarkHash/SSSE/10M-16                     50          26194100 ns/op         400.31 MB/s
BenchmarkHash/GEN_/8Bytes-16            10000000               202 ns/op          39.54 MB/s
BenchmarkHash/GEN_/1K-16                  500000              2392 ns/op         428.05 MB/s
BenchmarkHash/GEN_/8K-16                  100000             17677 ns/op         463.42 MB/s
BenchmarkHash/GEN_/1M-16                    1000           2235606 ns/op         469.03 MB/s
BenchmarkHash/GEN_/5M-16                     100          11186153 ns/op         468.69 MB/s
BenchmarkHash/GEN_/10M-16                    100          22413662 ns/op         467.83 MB/s
PASS
ok      github.com/minio/sha256-simd    53.596s
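
For reading the output above: the MB/s column follows from the payload size and the ns/op figure, with 1 MB taken as 10^6 bytes. The sketch below reproduces that arithmetic. The helper names `throughputMBps` and `speedup` are hypothetical, and the payload sizes (1K = 1024 bytes, 1M = 1048576 bytes) are assumptions about the benchmark inputs, although they agree with the reported figures.

```go
package main

import "fmt"

// throughputMBps converts bytes processed per operation and nanoseconds
// per operation into MB/s, where 1 MB = 1e6 bytes.
// bytes/ns equals GB/s, so multiplying by 1e3 gives MB/s.
func throughputMBps(bytesPerOp int64, nsPerOp float64) float64 {
	return float64(bytesPerOp) / nsPerOp * 1e3
}

// speedup returns how many times faster one throughput is than another.
func speedup(fastMBps, slowMBps float64) float64 {
	return fastMBps / slowMBps
}

func main() {
	// SHA_/1M row: 529852 ns/op, assuming a 1048576-byte payload.
	sha := throughputMBps(1<<20, 529852)
	// AVX2/1M row: 2038730 ns/op, same assumed payload.
	avx2 := throughputMBps(1<<20, 2038730)

	fmt.Printf("SHA  1M: %.2f MB/s\n", sha)
	fmt.Printf("AVX2 1M: %.2f MB/s\n", avx2)
	fmt.Printf("ratio:   %.2fx\n", speedup(sha, avx2))
}
```

Small differences from the printed columns are expected, since ns/op is rounded in the benchmark output.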

@harshavardhana
Member Author

Thanks, this looks quite promising.

@harshavardhana
Member Author

@AudriusButkevicius can you provide the cpuinfo for your processor? I would like to add this under

https://github.com/minio/sha256-simd#performance

| Processor | SIMD | Speed (MB/s) |
|---|---|---|
| 3.0 GHz Intel Xeon Platinum 8124M | AVX512 | 3498 |
| 1.5 GHz AMD Ryzen | SHA* | 1888 |
| 1.2 GHz ARM Cortex-A53 | ARM64 | 638 |
| 3.0 GHz Intel Xeon Platinum 8124M | AVX2 | 449 |
| 3.1 GHz Intel Core i7 | AVX | 362 |
| 3.1 GHz Intel Core i7 | SSE | 299 |

@AudriusButkevicius

AudriusButkevicius commented Jan 6, 2019

@svenski123

@harshavardhana @fwessels
I am the author of the Intel SHA extensions based enhancement to sha256-simd which I inadvertently submitted in PR #36 (which was not merged).

The code in this PR #37 (which was merged) appears to be my work with minor changes.

While I am happy to license this code to your project under the Apache 2.0 open source license, I must insist on proper attribution and identification of copyright holders.

In particular, the assembly language source file sha256blockSha_amd64.s is my original work and I hold copyright in it. Please kindly correct the copyright notice that you have added to this file accordingly.

Please contact me directly on Gitter to discuss further.

@harshavardhana
Member Author

@svenski123 let me know what the copyright notice should be; it can definitely be added.

kwi-dk added a commit to kwi-dk/password-generator that referenced this pull request Sep 6, 2019
These days, a more realistic number of guesses per second per CPU is
10,000, corresponding e.g. to 5,000 round SHA-256 (e.g. Linux/glibc
crypt) using dedicated CPU SHA-256 instructions.[1] Also, studies are
not clear on the benefits of passphrases with regard to retention.[2]

[1]: minio/sha256-simd#37 (comment)
[2]: https://cups.cs.cmu.edu/soups/2012/proceedings/a7_Shay.pdf