Update blake3 to v1.5.1.bcr.1 #22017

Closed
fmeum wants to merge 5 commits


@fmeum
Collaborator

fmeum commented Apr 16, 2024

This brings AVX-512 support on Linux.

Also adds a JMH benchmark pitting BLAKE3 against SHA2-256.

Results with `-f 1` (single JVM fork) and for `hashBytesOneShot` only:

Intel Core i5-8520U, Linux: BLAKE3 has ~8x the throughput on large inputs
Benchmark                                      (size)    (type)   Mode  Cnt        Score        Error  Units
BazelHashFunctionsBenchmark.hashBytesOneShot        1    BLAKE3  thrpt    5  3897193.109 ± 104089.759  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot        1  SHA2_256  thrpt    5  9773250.840 ± 919565.969  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot       16    BLAKE3  thrpt    5  4058401.127 ±  69345.382  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot       16  SHA2_256  thrpt    5  9338184.696 ± 575903.627  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      128    BLAKE3  thrpt    5  3883335.405 ± 197131.021  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      128  SHA2_256  thrpt    5  3931746.804 ± 111963.068  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      512    BLAKE3  thrpt    5  3165886.130 ± 105001.405  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      512  SHA2_256  thrpt    5  1689377.092 ±  67006.025  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     1024    BLAKE3  thrpt    5  2137151.012 ±  71425.961  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     1024  SHA2_256  thrpt    5   971335.403 ±  43622.796  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     4096    BLAKE3  thrpt    5  1266551.855 ±  77312.865  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     4096  SHA2_256  thrpt    5   271217.035 ±  15770.310  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot    16384    BLAKE3  thrpt    5   562124.458 ±  47243.736  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot    16384  SHA2_256  thrpt    5    72281.652 ±  10734.186  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576    BLAKE3  thrpt    5     9800.524 ±    230.269  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576  SHA2_256  thrpt    5     1124.542 ±     40.938  ops/s
MacBook Pro with M3 Max, macOS: BLAKE3 has ~0.75x the throughput on large inputs
Benchmark                                      (size)    (type)   Mode  Cnt         Score        Error  Units
BazelHashFunctionsBenchmark.hashBytesOneShot        1    BLAKE3  thrpt    5   9262824.819 ±  12194.067  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot        1  SHA2_256  thrpt    5  76557346.275 ± 548738.127  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot       16    BLAKE3  thrpt    5   9254500.192 ±  22138.081  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot       16  SHA2_256  thrpt    5  81029076.629 ± 748425.519  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      128    BLAKE3  thrpt    5   8304084.839 ±  20398.724  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      128  SHA2_256  thrpt    5  41460273.256 ± 106648.234  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     1024    BLAKE3  thrpt    5   3092086.580 ±   1301.806  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     1024  SHA2_256  thrpt    5   9355426.285 ±   7352.032  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     4096    BLAKE3  thrpt    5   1670833.346 ±   1809.726  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     4096  SHA2_256  thrpt    5   2562509.914 ±  29303.110  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot    16384    BLAKE3  thrpt    5    484960.116 ±    146.961  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot    16384  SHA2_256  thrpt    5    658392.748 ±   3364.324  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576    BLAKE3  thrpt    5      7987.472 ±     19.194  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576  SHA2_256  thrpt    5     10380.444 ±      8.804  ops/s
AMD Ryzen 7 PRO 5850U, Windows: BLAKE3 has ~1.5x the throughput on large inputs
Benchmark                                      (size)    (type)   Mode  Cnt         Score        Error  Units
BazelHashFunctionsBenchmark.hashBytesOneShot        1    BLAKE3  thrpt    5   5569003,683 ± 125621,794  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot        1  SHA2_256  thrpt    5  21202138,257 ± 458127,205  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot       16    BLAKE3  thrpt    5   5539298,273 ±  77378,097  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot       16  SHA2_256  thrpt    5  21618815,496 ± 208338,556  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      128    BLAKE3  thrpt    5   5047579,827 ± 118690,537  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      128  SHA2_256  thrpt    5  15806244,512 ± 258848,826  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      512    BLAKE3  thrpt    5   3300538,392 ±  53754,778  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot      512  SHA2_256  thrpt    5   8353887,852 ±  47076,094  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     1024    BLAKE3  thrpt    5   2062144,084 ±  14557,116  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     1024  SHA2_256  thrpt    5   5120693,705 ±  30640,599  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     4096    BLAKE3  thrpt    5   1437595,889 ±  34088,637  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot     4096  SHA2_256  thrpt    5   1552307,356 ±  25584,819  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot    16384    BLAKE3  thrpt    5    558955,757 ±   8647,716  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot    16384  SHA2_256  thrpt    5    411619,868 ±   1179,203  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576    BLAKE3  thrpt    5      9576,940 ±    460,875  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576  SHA2_256  thrpt    5      6470,682 ±     41,223  ops/s
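
The "~8x", "~0.75x" and "~1.5x" summaries can be checked against the 1 MiB rows. The following is a minimal sketch, not part of the PR: the helper names `parse_scores` and `blake3_speedup` are hypothetical, and it assumes the JMH rows have exactly the column layout shown above, with either `.` or `,` as the decimal separator.

```python
import re

# 1 MiB rows copied from the three tables above.
# The Windows rows use a decimal comma (locale-dependent JMH output).
ROWS = {
    "Linux": """\
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576    BLAKE3  thrpt    5     9800.524 ±    230.269  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576  SHA2_256  thrpt    5     1124.542 ±     40.938  ops/s
""",
    "macOS": """\
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576    BLAKE3  thrpt    5      7987.472 ±     19.194  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576  SHA2_256  thrpt    5     10380.444 ±      8.804  ops/s
""",
    "Windows": """\
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576    BLAKE3  thrpt    5      9576,940 ±    460,875  ops/s
BazelHashFunctionsBenchmark.hashBytesOneShot  1048576  SHA2_256  thrpt    5      6470,682 ±     41,223  ops/s
""",
}

# Columns: benchmark name, (size), (type), mode, count, score, "±", error, units.
ROW = re.compile(
    r"^\S+\s+(?P<size>\d+)\s+(?P<type>\w+)\s+thrpt\s+\d+\s+(?P<score>[\d.,]+)\s+±"
)


def parse_scores(text):
    """Map input size -> {hash type -> throughput in ops/s}."""
    scores = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header lines and blank lines do not match
        # Assumption: a comma only ever appears as the decimal separator.
        score = float(match["score"].replace(",", "."))
        scores.setdefault(int(match["size"]), {})[match["type"]] = score
    return scores


def blake3_speedup(text):
    """Map input size -> BLAKE3 throughput divided by SHA2_256 throughput."""
    return {
        size: by_type["BLAKE3"] / by_type["SHA2_256"]
        for size, by_type in parse_scores(text).items()
    }


if __name__ == "__main__":
    for platform, text in ROWS.items():
        for size, ratio in blake3_speedup(text).items():
            print(f"{platform}: {size} bytes: BLAKE3/SHA2_256 = {ratio:.2f}")
```

For the 1 MiB rows this gives ratios of roughly 8.7 (Linux), 0.77 (macOS) and 1.48 (Windows), in line with the rounded summaries. Since each operation hashes 1 MiB, the ops/s score at this size can also be read directly as MiB/s.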

@@ -1,102 +0,0 @@
load("@rules_license//rules:license.bzl", "license")
Collaborator Author

Should I split this out?

Member

I think it's fine to migrate to BCR in one PR. However, can you create another PR for the benchmark code? Thanks!

Collaborator Author

This is not the migration; this file was unused even before this PR. I thought that I might as well remove it, but I remember third_party changes being subject to a special merge procedure.

I removed this from the diff and will submit as a separate PR.

@fmeum
Collaborator Author

fmeum commented Apr 16, 2024

CC @sluongng @tylerwilliams Not sure what's going on here, but while BLAKE3 is much faster on Linux, it is much slower on macOS.

@fmeum changed the title from "Update blake3 to v1.5.1" to "Update blake3 to v1.5.1.bcr.1" on Apr 16, 2024
@fmeum marked this pull request as ready for review on April 16, 2024 15:37
@fmeum requested a review from meteorcloudy on April 16, 2024 15:37
@github-actions bot added the awaiting-review label (PR is awaiting review from an assigned reviewer) on Apr 16, 2024
@iancha1992 added the team-Remote-Exec label (Issues and PRs for the Execution (Remote) team) on Apr 16, 2024
@meteorcloudy
Member

I'll let @coeuvre take a look since he helped add the original blake3 support.

@sluongng
Contributor

There is some more context that Fabian shared in #22011.

It would be nice if the benchmark used an exponential scale for the file size to have results from 1B -> 1GB. Some ecosystems, like iOS development, usually have to deal with larger artifacts regularly.

@fmeum
Collaborator Author

fmeum commented Apr 17, 2024

I simplified the benchmark to just do one-shot hashing for now and also changed the benchmarked file sizes to {1 byte, 1 KB, 1 MB, 1 GB}.
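
To illustrate the shape of such a benchmark (one-shot hashing over exponentially spaced input sizes), here is a rough Python sketch. It is not the JMH benchmark from this PR: the function names are hypothetical, sizes are assumed to be powers of 1024, and because BLAKE3 is not in Python's standard library, `blake2b` is used purely as a placeholder second algorithm.

```python
import hashlib
import time


def exponential_sizes(max_size=1024**3, factor=1024):
    """Return input sizes 1, factor, factor**2, ... up to max_size bytes."""
    sizes = []
    size = 1
    while size <= max_size:
        sizes.append(size)
        size *= factor
    return sizes


def one_shot_throughput(algorithm, size, min_seconds=0.05):
    """Hash a zero-filled buffer of `size` bytes repeatedly; return ops/s.

    This is a crude timing loop without warm-up or forking, so its numbers
    are only indicative and not comparable to JMH results.
    """
    data = bytes(size)
    ops = 0
    start = time.perf_counter()
    while True:
        # One-shot: construct the hasher, feed all bytes, and finish.
        hashlib.new(algorithm, data).digest()
        ops += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return ops / elapsed


if __name__ == "__main__":
    # Stop at 1 MiB by default so the sketch runs quickly; raise max_size
    # to 1024**3 to include the 1 GiB case.
    for size in exponential_sizes(max_size=1024**2):
        for algorithm in ("sha256", "blake2b"):
            rate = one_shot_throughput(algorithm, size)
            print(f"{algorithm:>8} {size:>10} bytes: {rate:12.1f} ops/s")
```

Passing `factor=2` instead yields every power of two between 1 byte and 1 GiB, which gives a finer-grained view of where one algorithm overtakes the other.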

@coeuvre added the awaiting-PR-merge label (PR has been approved by a reviewer and is ready to be merge internally) and removed the awaiting-review label (PR is awaiting review from an assigned reviewer) on Apr 18, 2024
@fmeum
Collaborator Author

fmeum commented Apr 18, 2024

@bazel-io fork 7.2.0

@iancha1992
Member

@fmeum There is no doc string for file java_opt_binary.bzl. Could you please create one? Thanks!

@fmeum
Collaborator Author

fmeum commented Apr 22, 2024

> @fmeum There is no doc string for file java_opt_binary.bzl. Could you please create one? Thanks!

Added

@iancha1992
Member

@fmeum could you please take a look at the conflicts? Thank you!

@fmeum
Collaborator Author

fmeum commented Apr 25, 2024

@iancha1992 I resolved them.

@fmeum deleted the blake-benchmark branch on April 26, 2024 09:53
iancha1992 pushed a commit to iancha1992/bazel that referenced this pull request Apr 26, 2024
Closes bazelbuild#22017.

PiperOrigin-RevId: 628330908
Change-Id: Ic635027d020d60b79d2e498fcebb0cc42fae712b
@sgowroji removed the awaiting-PR-merge label (PR has been approved by a reviewer and is ready to be merge internally) on Apr 30, 2024
Wyverald pushed a commit that referenced this pull request May 8, 2024
github-merge-queue bot pushed a commit that referenced this pull request May 9, 2024
Original PRs/commits:

* #22017
* #22213
* 81117aa

---------

Co-authored-by: Mark Elliot <123787712+mark-thm@users.noreply.github.com>
Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
Kila2 pushed a commit to Kila2/bazel that referenced this pull request May 13, 2024