Profile-Guided Optimization (PGO) benchmark report #754
zamazan4ik started this conversation in Ideas
Replies: 1 comment
-
Interesting, those are some fairly large differences. It'd be nice to test on a real workload instead of micro benchmarks (e.g. https://github.com/gimli-rs/addr2line/blob/master/scripts/benchmark-addr2line.sh).
cargo-benchcmp comparison:
-
Hi!
As I have done many times before, I decided to test Profile-Guided Optimization (PGO) to see how it affects the library's performance. For reference, see the results for other projects: https://github.com/zamazan4ik/awesome-pgo
Since PGO has helped many libraries a lot, I decided to apply it to gimli to see whether a performance win (or loss) can be achieved. Here are my benchmark results. This information may be interesting for anyone who wants to get more performance out of the library in their use cases.
Test environment
gimli version: master branch on commit 7e9d923a98c5eeed4d7a8b8cb32475d1ce16ced2
Benchmark
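The procedure in this section can be condensed into a small script. This is only a sketch: it assumes cargo-pgo and a nightly toolchain are already installed and that it is run from a gimli checkout. The run_step helper and the DRY_RUN variable are illustrative names added here, not part of cargo-pgo.

```shell
#!/bin/sh
# Sketch of the three measurement steps used in this report.
# Assumptions (not stated in the post): cargo-pgo and a nightly toolchain are
# installed, and the script is started from a gimli checkout.
# run_step and DRY_RUN are hypothetical helpers for illustration only.
set -eu

# Print the command; execute it only when DRY_RUN=0 (default: print only).
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

pgo_benchmark_steps() {
    # 1. Baseline: regular Release benchmarks, pinned to CPU core 0.
    run_step taskset -c 0 cargo +nightly bench
    # 2. PGO training: instrumented benchmarks that collect profiles.
    run_step taskset -c 0 cargo +nightly pgo bench
    # 3. PGO optimization: rebuild with the collected profiles and benchmark again.
    run_step taskset -c 0 cargo +nightly pgo optimize bench
}

pgo_benchmark_steps
```

By default the script only prints the three commands; setting DRY_RUN=0 in the environment actually runs them in order, so the training step always precedes the optimization step.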
For benchmarking, I use the project's built-in benchmarks. For PGO, I use the cargo-pgo tool. I got the Release bench results with the taskset -c 0 cargo +nightly bench command. The PGO training phase is done with taskset -c 0 cargo +nightly pgo bench, and the PGO optimization phase with taskset -c 0 cargo +nightly pgo optimize bench. taskset -c 0 is used to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee).
Results
I got the following results.
Release:
PGO optimized:
(just for reference) PGO instrumented:
According to the results, we see improvements in the library's performance.
Further steps
At the very least, the library's users can find this performance report and decide to enable PGO for their applications if they care about gimli performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this work.
Also, Post-Link Optimization (PLO) can be tested after PGO. It can be done by applying tools like LLVM BOLT to applications that use gimli. However, it's a much less mature optimization technique compared to PGO.
Thank you.