Evaluate using more optimizations: LTO, PGO, PLO #274
zamazan4ik started this conversation in Ideas
Hi!
I want to discuss several compiler optimizations that could be helpful for the project. Maybe you have already tested them and have some benchmark numbers to share publicly ;)
First, did you test Link-Time Optimization (LTO) for `grex`? If yes, do you have any numbers regarding performance and the resulting binary size? It would be nice to test, since enabling LTO is usually not too time-consuming. LTO can help achieve better performance, and the binary will very likely be smaller.

I suggest enabling LTO only for Release builds so as not to sacrifice the developer experience while working on the project, since LTO adds time to the compilation. If you think that a regular Release build should not be affected by such a change either, then I suggest adding a separate `release-lto` (or `dist`) profile where LTO is added on top of the regular `release` optimizations. Such a change simplifies life for maintainers and anyone else interested in the project who wants to build the most performant version of the application. Using ThinLTO should also help. Another benefit: users who install the tool with `cargo install` will get LTO-optimized `grex` "automatically".

As for more advanced techniques, I recently started evaluating Profile-Guided Optimization (PGO) for various applications and workloads (including compilers, parsers, databases, and many others); all the results are available at https://github.com/zamazan4ik/awesome-pgo . Since PGO helped a lot with many other tools, I decided to apply it to `grex` to see whether a performance win (or loss) can be achieved. Here are my benchmark results.

Test environment
`grex` version: `main` branch on commit `7fe879d5e02927a0d310eec90dc7fae11c458e12`
Benchmark
For benchmarking, I use the benchmarks built into the project. For PGO, I use the cargo-pgo tool. I got the Release bench results with the `taskset -c 0 cargo bench` command. The PGO training phase is done with `taskset -c 0 cargo pgo bench`, and the PGO optimization phase with `taskset -c 0 cargo pgo optimize bench`. `taskset -c 0` is used to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as far as I can guarantee).

Results
I got the following results:
According to the results, PGO measurably improves the project's performance.
Further steps
Post-Link Optimization (PLO) is similar to PGO but uses a slightly different approach. You can take a look at LLVM BOLT for more information. However, I recommend experimenting with PLO only after applying PGO, since PGO is a much more stable and time-proven technology than the PLO tools.
It would be nice to integrate all the optimizations above into the project to deliver a truly "blazing fast" experience to users ;) I would be happy to answer any questions about the optimizations above.
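As a possible starting point for the LTO part, here is a minimal sketch of what a dedicated profile in `Cargo.toml` could look like. The profile name `release-lto` is just the one suggested earlier in this post, and choosing `"fat"` over `"thin"` is an assumption on my side, not something I benchmarked for this project.

```toml
# Sketch only: a custom profile that adds LTO on top of the regular release settings.
# The name "release-lto" is illustrative; "dist" would work the same way.
[profile.release-lto]
inherits = "release"   # custom profiles must name the profile they inherit from
lto = "fat"            # full LTO; use "thin" for ThinLTO if link time matters more
```

With such a profile, the optimized build would be requested explicitly with `cargo build --profile release-lto`. Note that `cargo install` uses the `release` profile by default, so users get LTO "automatically" only if `lto` is set in `[profile.release]` itself; with a separate profile they would need to pass `--profile release-lto`.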
Thank you.