Following some guidelines of the Rust Performance Book, here are some things we can try to improve performance:

- Add `codegen-units = 1` to the release build
- Use a faster allocator, e.g. mimalloc, which works on all operating systems

Not so easy:

- Properly profile to identify hot parts
- Remove clones/allocations where not needed
- Use profile-guided optimization (e.g. via cargo-pgo)
  - Unfortunately, this currently does not work with LTO, and the PGO version is 10-20% slower than LTO
  - Might be available in maturin directly in the future, see here
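As a rough illustration of the allocator and clone items, here is a minimal sketch that uses only the standard library. `CountingAlloc`, `allocation_count`, `sum_cloned` and `sum_borrowed` are hypothetical names made up for this example, not code from the repository. The `#[global_allocator]` attribute is the same hook the mimalloc crate uses (there one would write `static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;`). Here it wraps the system allocator to count allocations, which can help spot clones that are not needed.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts allocations.
pub struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Forward the request to the default system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Swapping the allocator for the whole program is a single static item.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of heap allocations made so far (by any thread).
pub fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

/// Sums the values after cloning them: one avoidable heap allocation.
pub fn sum_cloned(values: &[f64]) -> f64 {
    // black_box keeps the optimizer from eliding the clone in this toy example.
    let copy = std::hint::black_box(values.to_vec());
    copy.iter().sum()
}

/// Sums the values through a borrow: no heap allocation needed.
pub fn sum_borrowed(values: &[f64]) -> f64 {
    values.iter().sum()
}
```

Reading `allocation_count()` before and after a call gives a crude per-call allocation count. A real profiler is still the better tool for finding hot parts.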
Quick tests with `codegen-units = 1` added to `release-lto` (see here) show performance improvements of up to 12% across the benchmarks (mean is about 7%), while for `dual_numbers` the changes are a bit smaller (see below).

Proper benchmarks (across all benchmarks) with a comparison to the current release workflow are still needed, but this might be an easy-to-get improvement if it turns out to be faster in all cases.
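For reference, the change under test could look roughly like the following `Cargo.toml` fragment. This is a sketch: the `inherits` and `lto` lines are assumptions about what the existing `release-lto` profile contains, and only `codegen-units = 1` is the proposed addition.

```toml
# Sketch only: existing profile settings are assumed, not copied from the repository.
[profile.release-lto]
inherits = "release"   # assumed: profile derived from the default release profile
lto = true             # assumed: existing LTO setting
codegen-units = 1      # proposed addition: a single codegen unit per crate
```

A custom profile like this is selected with `cargo build --profile release-lto`. Fewer codegen units give the compiler more room to optimize, at the cost of longer compile times.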
Benchmark: `dual_numbers`

System: methane/CO2

- `main`: main branch + lto
- `main_codegen`: main branch + lto + `codegen-units = 1`
- `develop`, `develop_codegen`: like `main` and `main_codegen`, but for the develop branch
Execution times in µs

| name | f64 | dual | dual2 | hyperdual | dual3 |
|---|---|---|---|---|---|
| main | 1.1382 | 1.2325 | 1.4539 | 1.6267 | 1.7563 |
| main_codegen | 1.0229 | 1.1741 | 1.3708 | 1.5777 | 1.6316 |
| develop | 1.0138 | 1.1989 | 1.4465 | 1.589 | 1.7549 |
| develop_codegen | 0.9761 | 1.1681 | 1.4195 | 1.5446 | 1.6304 |
Slowdown relative to f64, t_d / t_f64, for each branch/option

| name | f64 | dual | dual2 | hyperdual | dual3 |
|---|---|---|---|---|---|
| main | 1 | 1.08285 | 1.27737 | 1.42919 | 1.54305 |
| main_codegen | 1 | 1.14782 | 1.34011 | 1.54238 | 1.59507 |
| develop | 1 | 1.18258 | 1.42681 | 1.56737 | 1.73101 |
| develop_codegen | 1 | 1.1967 | 1.45426 | 1.58242 | 1.67032 |
Relative difference in % w.r.t. main + lto for each dual number: (t_d_branch - t_d_main) / t_d_main * 100

| name | f64 | dual | dual2 | hyperdual | dual3 |
|---|---|---|---|---|---|
| main_codegen | -10.13 | -4.74 | -5.72 | -3.01 | -7.10 |
| develop | -10.93 | -2.73 | -0.51 | -2.32 | -0.08 |
| develop_codegen | -14.24 | -5.23 | -2.37 | -5.05 | -7.17 |
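
The derived numbers can be recomputed from the execution times with a few lines of Rust. This is a sketch with hypothetical helper names, not code from the benchmark suite. The slowdown values reported above correspond to t_d / t_f64 within each row, and the relative difference uses the formula stated above with `main` as the reference.

```rust
/// Execution times in µs from the timing table, ordered as
/// [f64, dual, dual2, hyperdual, dual3].
pub const MAIN: [f64; 5] = [1.1382, 1.2325, 1.4539, 1.6267, 1.7563];
pub const MAIN_CODEGEN: [f64; 5] = [1.0229, 1.1741, 1.3708, 1.5777, 1.6316];

/// Slowdown of a dual-number type relative to plain f64: t_d / t_f64.
pub fn slowdown(t_d: f64, t_f64: f64) -> f64 {
    t_d / t_f64
}

/// Relative difference in percent: (t_branch - t_main) / t_main * 100.
/// Negative values mean the branch is faster than the reference.
pub fn relative_difference_percent(t_branch: f64, t_main: f64) -> f64 {
    (t_branch - t_main) / t_main * 100.0
}

/// Slowdown for every column of one row, using the row's own f64 time.
pub fn slowdown_row(times: [f64; 5]) -> [f64; 5] {
    let t_f64 = times[0];
    times.map(|t_d| slowdown(t_d, t_f64))
}

/// Column-wise relative difference of one row against a reference row.
pub fn relative_row(branch: [f64; 5], reference: [f64; 5]) -> [f64; 5] {
    std::array::from_fn(|i| relative_difference_percent(branch[i], reference[i]))
}
```

For example, `relative_row(MAIN_CODEGEN, MAIN)` reproduces the `main_codegen` row of the relative-difference table up to rounding.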