[Performance Improvement] Poseidon with AVX2 (x86-64) #1621

Open. dloghin wants to merge 2 commits into base: main


@dloghin dloghin commented Aug 30, 2024

This PR improves Poseidon hashing performance on x86-64 systems with AVX2 support.

  1. Target bottleneck
    Plonky2 currently has no working AVX2 implementation of Poseidon hashing. AVX2 can speed up Poseidon by executing up to four 64-bit Goldilocks field operations in parallel, since one 256-bit AVX2 register holds four 64-bit lanes.
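
To illustrate what "four Goldilocks operations at the same time" means, here is a minimal, portable sketch of lane-wise modular addition over four field elements. It is not the code from this PR and uses no AVX2 intrinsics; the names `add_mod` and `add_mod_x4` are hypothetical, and inputs are assumed to be already reduced below the Goldilocks modulus 2^64 - 2^32 + 1. An AVX2 version would apply the same per-lane logic to one 256-bit register.

```rust
/// Goldilocks prime: 2^64 - 2^32 + 1.
const P: u64 = 0xFFFF_FFFF_0000_0001;
/// 2^64 mod P, used to fold a carry out of bit 63 back into the sum.
const EPSILON: u64 = 0xFFFF_FFFF;

/// Adds two reduced field elements (both assumed < P) modulo P.
fn add_mod(a: u64, b: u64) -> u64 {
    let (sum, carry) = a.overflowing_add(b);
    // On overflow the true sum is `sum + 2^64`, and 2^64 is congruent to EPSILON.
    // For reduced inputs this addition cannot overflow and stays below P.
    let sum = if carry { sum + EPSILON } else { sum };
    if sum >= P { sum - P } else { sum }
}

/// Applies `add_mod` independently to four lanes, mirroring the shape of
/// a 4 x 64-bit SIMD operation (hypothetical helper, scalar code).
fn add_mod_x4(a: [u64; 4], b: [u64; 4]) -> [u64; 4] {
    std::array::from_fn(|i| add_mod(a[i], b[i]))
}
```

For example, `add_mod_x4([1, P - 1, P - 1, 0], [2, 1, P - 1, 0])` evaluates to `[3, 0, P - 2, 0]`: each lane wraps around the modulus independently of the others.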

  2. How performance is measured
    We use the existing benchmarks in plonky2/benches:

cargo bench --bench=hashing
cargo bench --bench=merkle

To enable AVX2, we run:

RUSTFLAGS="-C target-feature=+avx2" cargo bench --bench=hashing
RUSTFLAGS="-C target-feature=+avx2" cargo bench --bench=merkle
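
When comparing runs with and without the flag, it can help to confirm that the binary was actually built with AVX2 and that the host CPU supports it. The following sketch is an assumption-level helper, not part of the PR; the function names are hypothetical, while `cfg!(target_feature = "avx2")` and `is_x86_feature_detected!` are standard Rust facilities.

```rust
/// True if this binary was compiled with AVX2 enabled,
/// e.g. via RUSTFLAGS="-C target-feature=+avx2".
fn avx2_compiled_in() -> bool {
    cfg!(target_feature = "avx2")
}

/// True if the CPU running this binary reports AVX2 support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx2_available_at_runtime() -> bool {
    std::is_x86_feature_detected!("avx2")
}

/// On non-x86 targets AVX2 does not exist, so report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx2_available_at_runtime() -> bool {
    false
}
```

Printing both values at the start of a benchmark run makes it clear which code path the reported timings refer to.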
  3. Characteristics of the machine used (CPU, OS, #threads if appropriate)
    We measured performance on a c7i.2xlarge AWS instance with an AMD CPU and the following specs:

CPU: AMD EPYC 9R14 (16 vCPU)
RAM: 16 GB
OS: Ubuntu 24.04
Compilers: GCC 13
Rust: rustc 1.82.0-nightly

  4. Performance before and after the PR
    The performance improvement is 10-12%. For example, the output of RUSTFLAGS="-C target-feature=+avx2" cargo bench --bench=merkle is:
merkle-tree<GoldilocksField, PoseidonHash>/8192
                        time:   [25.929 ms 25.984 ms 26.017 ms]
                        change: [-11.076% -10.767% -10.460%] (p = 0.00 < 0.05)
                        Performance has improved.
merkle-tree<GoldilocksField, PoseidonHash>/16384
                        time:   [53.447 ms 53.596 ms 53.724 ms]
                        change: [-10.998% -10.679% -10.362%] (p = 0.00 < 0.05)
                        Performance has improved.
merkle-tree<GoldilocksField, PoseidonHash>/32768
                        time:   [107.41 ms 107.64 ms 107.83 ms]
                        change: [-10.545% -10.229% -9.8422%] (p = 0.00 < 0.05)
                        Performance has improved.

@Nashtare Nashtare added this to the Performance Tuning milestone Aug 30, 2024
@Nashtare Nashtare added the optimization (Performance related changes) label Aug 30, 2024
Labels
optimization Performance related changes
Projects
Status: Ready to Review