Tasks are defined in tasks/*.py. The Conda environments, which specify the different BLAS configurations, are defined in results/*/environment.yml. Python versions and numbers of threads are defined in profiles.yml.
To determine the runtime of a task, the task is run repeatedly for at least 10 seconds and the average runtime per run is computed. This repeat-and-average procedure is carried out 3 times, and the best (lowest) average is used.
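The timing procedure described above can be sketched as follows. This is a minimal illustration, not the benchmark's actual implementation: the function names `average_runtime` and `best_runtime` and their parameters are hypothetical, and the real code may differ in how it counts repetitions or measures time.

```python
import time


def average_runtime(task, min_seconds=10.0):
    """Call task() repeatedly until at least min_seconds have elapsed,
    then return the mean wall-clock time per call."""
    calls = 0
    start = time.perf_counter()
    while True:
        task()
        calls += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return elapsed / calls


def best_runtime(task, min_seconds=10.0, rounds=3):
    """Run the averaging procedure `rounds` times and keep the lowest average."""
    return min(average_runtime(task, min_seconds) for _ in range(rounds))
```

For a quick trial, a much shorter window can be used, for example `best_runtime(lambda: sum(range(1000)), min_seconds=0.1)`. Taking the minimum over the rounds reduces the influence of transient system noise on the reported runtime.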
The configurations mkl2020.0_debug and mkl2020.1_fakeintel perform best overall:
Configuration | AMD Ryzen Threadripper 3970X, 2 threads | AMD Ryzen Threadripper 3970X, 16 threads | AMD EPYC 7763, 2 threads
---|---|---|---
openblas | 1.003829 | 1.019895 | 1.016262
mkl2024.0 | 1.128423 | 1.213864 | 1.065984
mkl2020.0_debug | 1.156737 | 1.261223 | 1.162273
mkl2020.1_fakeintel | 1.144065 | 1.281782 | 1.164156
The score of a configuration is the geometric mean of the best possible speed-up in comparison to the other configurations. See reports/*.ipynb for details.
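As a rough illustration of how a geometric mean of speed-ups is computed, consider the sketch below. The runtimes are made-up numbers, not measured data, and the names `geometric_mean`, `reference`, and `candidate` are hypothetical. How the benchmark selects the runtimes each configuration is compared against is defined in the notebooks and is not reproduced here.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via the mean of their logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical runtimes in seconds for three tasks (not measured data).
reference = [2.0, 1.0, 4.0]  # runtimes of the configuration compared against
candidate = [1.0, 1.0, 2.0]  # runtimes of the configuration being scored

# Per-task speed-up: values above 1 mean the candidate is faster.
speedups = [ref / own for ref, own in zip(reference, candidate)]
score = geometric_mean(speedups)
```

The geometric mean is the usual choice for averaging ratios, because a speed-up of 2 on one task and 0.5 on another average out to exactly 1.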
Run the benchmark on your CPU:
python -m benchmark.cli --profiles py38_2threads py38_16threads --run
Or only update the reports:
python -m benchmark.cli