Profile-Guided Optimization (PGO) benchmark report #3

Open
zamazan4ik opened this issue Oct 20, 2024 · 0 comments
Labels
documentation Improvements or additions to documentation

Hi!

As I have done many times before, I decided to test Profile-Guided Optimization (PGO) to see how it affects the library's performance. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo . Since PGO has helped many libraries and projects in the database domain, I applied it to canopydb to see whether a performance win (or loss) can be achieved. Here are my benchmark results.

This information may be useful to anyone who wants to get more performance out of the library in their use cases.

Test environment

  • Fedora 40
  • Linux kernel 6.10.12
  • AMD Ryzen 9 5900X
  • 48 GiB RAM
  • SSD Samsung 980 Pro 2 TiB
  • Compiler - rustc 1.82.0
  • canopydb version: used in the benchmark (see details below)
  • Disabled Turbo boost

Benchmark

For benchmark purposes, I use this benchmark method. For PGO, I use the cargo-pgo tool. The PGO training dataset for each benchmark is the benchmark itself (so three training datasets, since I ran three benchmarks). All measurements were done on the same machine, with the same background "noise" (as far as I can guarantee).

A note about the RocksDB results: by default, cargo-pgo doesn't apply PGO to C++ dependencies, so the RocksDB column is not expected to change much between the Release and PGO optimized runs. However, PGO results for RocksDB are also available in the "awesome-pgo" repo.

Results

Here we go - the results!

large_values_benchmark

Release:

+------------------------+---------+----------+---------+---------+---------+
|                        | redb    | canopydb | sled    | lmdb    | rocksdb |
+===========================================================================+
| bulk load (2MB values) | 40686ms | 36481ms  | 37416ms | 11240ms | 27688ms |
+------------------------+---------+----------+---------+---------+---------+

PGO optimized:

+------------------------+---------+----------+---------+---------+---------+
|                        | redb    | canopydb | sled    | lmdb    | rocksdb |
+===========================================================================+
| bulk load (2MB values) | 42336ms | 36634ms  | 38960ms | 11261ms | 27614ms |
+------------------------+---------+----------+---------+---------+---------+

(just for reference) PGO instrumented:

+------------------------+----------+----------+---------+---------+---------+
|                        | redb     | canopydb | sled    | lmdb    | rocksdb |
+============================================================================+
| bulk load (2MB values) | 118375ms | 36179ms  | 39517ms | 11978ms | 27685ms |
+------------------------+----------+----------+---------+---------+---------+

int_benchmark

Release:

+-----------+--------+----------+--------+-----------+-------+---------+
|           | redb   | canopydb | sled   | sanakirja | lmdb  | rocksdb |
+======================================================================+
| bulk load | 2007ms | 546ms    | 4144ms | 562ms     | 527ms | 4904ms  |
+-----------+--------+----------+--------+-----------+-------+---------+

PGO optimized:

+-----------+--------+----------+--------+-----------+-------+---------+
|           | redb   | canopydb | sled   | sanakirja | lmdb  | rocksdb |
+======================================================================+
| bulk load | 1537ms | 460ms    | 3653ms | 426ms     | 524ms | 4994ms  |
+-----------+--------+----------+--------+-----------+-------+---------+

(just for reference) PGO instrumented:

+-----------+--------+----------+--------+-----------+-------+---------+
|           | redb   | canopydb | sled   | sanakirja | lmdb  | rocksdb |
+======================================================================+
| bulk load | 4589ms | 817ms    | 5264ms | 607ms     | 527ms | 4923ms  |
+-----------+--------+----------+--------+-----------+-------+---------+

lmdb_benchmark

Release:

|                           | redb        | canopydb       | sled         | sanakirja   | lmdb         | rocksdb        |
|---------------------------|-------------|----------------|--------------|-------------|--------------|----------------|
| bulk load                 | *2.87s*     | 8.43s          | 6.12s        | 4.37s       | **1.29s**    | 6.71s          |
| individual writes         | 11.95s      | 9.32s          | *6.52s*      | 13.59s      | 12.73s       | **6.28s**      |
| batch writes              | 7.01s       | *1.36s*        | 1.53s        | 10.12s      | 7.18s        | **959.84ms**   |
| len()                     | 3.41µs      | *1.57µs*       | 519.55ms     | 65.47ms     | **201.00ns** | 269.06ms       |
| random reads              | 1.19s       | *780.65ms*     | 1.81s        | 1.20s       | **739.24ms** | 2.82s          |
| random reads              | 1.08s       | ***728.52ms*** | 1.82s        | 1.19s       | 732.14ms     | 2.82s          |
| random range reads        | 2.99s       | 2.24s          | 6.20s        | *1.81s*     | **1.46s**    | 5.68s          |
| random range reads        | 2.94s       | 2.25s          | 6.17s        | *1.81s*     | **1.47s**    | 5.67s          |
| random reads (4 threads)  | 820.48ms    | *565.46ms*     | 1.12s        | 1.06s       | **377.26ms** | 1.58s          |
| random reads (8 threads)  | 455.59ms    | *306.07ms*     | 615.56ms     | 2.26s       | **195.99ms** | 819.14ms       |
| random reads (16 threads) | 261.04ms    | *185.63ms*     | 371.16ms     | 7.11s       | **116.83ms** | 556.70ms       |
| random reads (32 threads) | 231.43ms    | *167.44ms*     | 316.97ms     | 11.30s      | **105.13ms** | 400.28ms       |
| bulk removals             | 2.27s       | 6.80s          | 2.79s        | *2.18s*     | **1.07s**    | 3.34s          |
| size pre-compact          | 771.51 MiB  | 647.41 MiB     | *455.50 MiB* | 1020.00 MiB | 583.26 MiB   | **207.81 MiB** |
| compaction                | ***1.95s*** | 2.71s          | N/A          | N/A         | N/A          | N/A            |
| size after bench          | 341.20 MiB  | *303.11 MiB*   | 455.50 MiB   | 1020.00 MiB | 583.26 MiB   | **207.81 MiB** |

PGO optimized:

|                           | redb        | canopydb       | sled         | sanakirja   | lmdb         | rocksdb        |
|---------------------------|-------------|----------------|--------------|-------------|--------------|----------------|
| bulk load                 | *2.53s*     | 8.29s          | 5.70s        | 4.25s       | **1.30s**    | 6.85s          |
| individual writes         | 6.82s       | 6.55s          | *6.49s*      | 13.28s      | 13.04s       | **6.30s**      |
| batch writes              | 6.74s       | *1.25s*        | 1.44s        | 9.84s       | 7.31s        | **1.00s**      |
| len()                     | 2.70µs      | *1.37µs*       | 448.11ms     | 51.64ms     | **781.00ns** | 264.90ms       |
| random reads              | 989.37ms    | *728.04ms*     | 1.65s        | 1.18s       | **720.88ms** | 2.78s          |
| random reads              | 937.73ms    | ***687.52ms*** | 1.64s        | 1.16s       | 743.17ms     | 2.76s          |
| random range reads        | 2.52s       | 1.76s          | 5.37s        | *1.60s*     | **1.41s**    | 5.61s          |
| random range reads        | 2.50s       | 1.76s          | 5.33s        | *1.61s*     | **1.42s**    | 5.64s          |
| random reads (4 threads)  | 718.01ms    | *507.22ms*     | 1.03s        | 958.28ms    | **363.51ms** | 1.55s          |
| random reads (8 threads)  | 377.45ms    | *278.90ms*     | 549.95ms     | 2.29s       | **186.32ms** | 816.05ms       |
| random reads (16 threads) | 234.57ms    | *177.09ms*     | 357.89ms     | 6.95s       | **116.86ms** | 539.05ms       |
| random reads (32 threads) | 200.31ms    | *155.16ms*     | 275.21ms     | 11.41s      | **93.51ms**  | 396.12ms       |
| bulk removals             | *2.08s*     | 6.33s          | 2.68s        | 2.18s       | **1.04s**    | 3.31s          |
| size pre-compact          | 771.51 MiB  | 629.89 MiB     | *458.00 MiB* | 1020.00 MiB | 583.26 MiB   | **207.81 MiB** |
| compaction                | ***1.66s*** | 2.50s          | N/A          | N/A         | N/A          | N/A            |
| size after bench          | 341.20 MiB  | *303.13 MiB*   | 458.00 MiB   | 1020.00 MiB | 583.26 MiB   | **207.81 MiB** |

(just for reference) PGO instrumented:

|                           | redb        | canopydb     | sled         | sanakirja   | lmdb         | rocksdb        |
|---------------------------|-------------|--------------|--------------|-------------|--------------|----------------|
| bulk load                 | *3.94s*     | 9.09s        | 8.29s        | 4.54s       | **1.44s**    | 7.06s          |
| individual writes         | 10.95s      | 9.20s        | *6.49s*      | 13.18s      | 20.52s       | **6.28s**      |
| batch writes              | 7.46s       | *1.62s*      | 1.69s        | 15.39s      | 6.62s        | **1.03s**      |
| len()                     | 3.77µs      | *2.28µs*     | 746.49ms     | 78.17ms     | **371.00ns** | 280.94ms       |
| random reads              | 1.38s       | *1.03s*      | 2.78s        | 1.54s       | **876.71ms** | 3.03s          |
| random reads              | 1.34s       | *960.12ms*   | 2.81s        | 1.52s       | **877.80ms** | 3.02s          |
| random range reads        | 3.63s       | 2.74s        | 9.25s        | *2.25s*     | **1.70s**    | 6.19s          |
| random range reads        | 3.63s       | 2.73s        | 9.24s        | *2.28s*     | **1.71s**    | 6.21s          |
| random reads (4 threads)  | 6.03s       | *3.71s*      | 12.22s       | 10.71s      | **1.74s**    | 2.11s          |
| random reads (8 threads)  | 5.28s       | *3.84s*      | 10.19s       | 8.83s       | 2.64s        | **2.20s**      |
| random reads (16 threads) | 4.11s       | *2.91s*      | 7.45s        | 6.72s       | **2.23s**    | 2.32s          |
| random reads (32 threads) | 3.36s       | *2.32s*      | 5.92s        | 14.49s      | **1.76s**    | 1.88s          |
| bulk removals             | 2.92s       | 7.08s        | 3.68s        | *2.57s*     | **1.13s**    | 3.45s          |
| size pre-compact          | 771.51 MiB  | 645.54 MiB   | *449.50 MiB* | 1020.00 MiB | 583.26 MiB   | **207.81 MiB** |
| compaction                | ***1.52s*** | 2.91s        | N/A          | N/A         | N/A          | N/A            |
| size after bench          | 341.20 MiB  | *302.92 MiB* | 449.50 MiB   | 1020.00 MiB | 583.26 MiB   | **207.81 MiB** |

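According to the benchmarks above, PGO improves performance in many cases, and not only for canopydb. For canopydb, for example, individual writes in lmdb_benchmark drop from 9.32s to 6.55s, random range reads from 2.24s to 1.76s, and the int_benchmark bulk load from 546ms to 460ms, while large_values_benchmark shows no meaningful change.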
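
To put rough numbers on the tables, here is a small Rust sketch that parses the duration strings as they are printed above and computes the relative change between the Release and PGO optimized runs for a few canopydb rows. It is not part of canopydb or the benchmark harness; the helper names `parse_nanos` and `percent_change` are made up for illustration, and the input values are copied by hand from the canopydb column.

```rust
/// Parse a duration as printed in the tables ("8.43s", "780.65ms", "1.57µs",
/// "201.00ns") into nanoseconds. Returns None for anything else, e.g. "N/A".
/// Hypothetical helper, written only for this report.
fn parse_nanos(text: &str) -> Option<f64> {
    // Longer suffixes come first so that "ms" is not mistaken for "s".
    const UNITS: [(&str, f64); 4] = [("ns", 1.0), ("µs", 1e3), ("ms", 1e6), ("s", 1e9)];
    let text = text.trim();
    for (suffix, scale) in UNITS {
        if let Some(number) = text.strip_suffix(suffix) {
            return number.trim().parse::<f64>().ok().map(|value| value * scale);
        }
    }
    None
}

/// Relative change in percent going from `release` to `pgo`.
/// A negative result means the PGO optimized run was faster.
fn percent_change(release: &str, pgo: &str) -> Option<f64> {
    let before = parse_nanos(release)?;
    let after = parse_nanos(pgo)?;
    Some((after - before) / before * 100.0)
}

fn main() {
    // canopydb column, Release vs. PGO optimized, copied from the tables above.
    let rows = [
        ("int_benchmark: bulk load", "546ms", "460ms"),
        ("lmdb_benchmark: individual writes", "9.32s", "6.55s"),
        ("lmdb_benchmark: random range reads", "2.24s", "1.76s"),
        ("lmdb_benchmark: bulk removals", "6.80s", "6.33s"),
    ];
    for (name, release, pgo) in rows {
        match percent_change(release, pgo) {
            Some(change) => println!("{}: {:+.1}%", name, change),
            None => println!("{}: not comparable", name),
        }
    }
}
```

For these four rows the change works out to roughly -16%, -30%, -21% and -7% respectively. Keep in mind that these are single runs, so small differences may well be within measurement noise.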

Further steps

At the very least, the library's users can find this performance report and decide to enable PGO for their applications if they care about canopydb performance in their workloads. A small note somewhere in the documentation (the README file?) may be enough to raise awareness of this work.

Also, Post-Link Optimization (PLO) can be tested after PGO. This can be done by applying tools like LLVM BOLT to applications that use canopydb. However, it's a much less mature optimization technique than PGO.

Thank you.

P.S. It's just a benchmark report, not a bug. Discussions would probably be a better place for such things, but they are currently disabled for the repo.

@arthurprs arthurprs added the documentation Improvements or additions to documentation label Oct 22, 2024