
Performance Benchmarks

mrambacher edited this page Aug 12, 2020 · 75 revisions

These benchmarks measure RocksDB performance when data resides on flash storage. Unless otherwise noted, the results on this page were generated in June 2020 with RocksDB 6.10.0.

Setup

All of the benchmarks are run on the same AWS instance. Here are the details of the test setup:

  • Instance type: m5d.2xlarge 8 CPU, 32 GB Memory, 1 x 300 GB NVMe SSD.
  • Kernel version: Linux 4.14.177-139.253.amzn2.x86_64
  • File System: XFS with discard enabled

To understand the performance of the SSD card, we ran an fio test and observed 117K IOPS of 4KB reads (see the fio test results in the Appendix below for full output).
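As a quick cross-check, 117K random-read IOPS at 4 KiB per read corresponds closely to the aggregate bandwidth fio reports in the appendix (aggrb=469326KB/s); the one-liner below is only an arithmetic illustration:

```shell
# 117331 read IOPS (from the fio run in the appendix) times 4 KiB per read
# should roughly match fio's reported aggregate bandwidth of aggrb=469326KB/s.
awk 'BEGIN { printf "%d KB/s\n", 117331 * 4 }'
```

The small discrepancy from fio's own figure comes from rounding in the reported IOPS value.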

All tests were executed by running benchmark.sh with the following parameters (unless otherwise specified): NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944. Long-running tests were executed with a duration of 5400 seconds (DURATION=5400).

All other parameters used the default values, unless explicitly mentioned here. Tests were executed sequentially against the same database instance. The db_bench tool was generated via "make release".
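For reference, the run can be driven by a small wrapper along the lines of the sketch below. The paths are assumptions (benchmark.sh lives in tools/ of a RocksDB checkout built with "make release"; DB_DIR is a hypothetical mount point on the NVMe SSD), and the benchmark invocations are shown commented out:

```shell
#!/bin/sh
# Sketch of a driver for the test sequence described above.
export NUM_KEYS=900000000
export NUM_THREADS=32
export CACHE_SIZE=$((6 * 1024 * 1024 * 1024))   # 6442450944 bytes = 6 GiB
export DB_DIR=/data/rocksdb-bench               # hypothetical path on the XFS-mounted SSD

# The four tests, run sequentially against the same database instance:
# tools/benchmark.sh bulkload
# DURATION=5400 tools/benchmark.sh overwrite
# DURATION=5400 tools/benchmark.sh readwhilewriting
# DURATION=5400 tools/benchmark.sh randomread
echo "$CACHE_SIZE"
```

Note that CACHE_SIZE=6442450944 is exactly 6 GiB, i.e. roughly a fifth of the instance's 32 GB of memory.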

The following test sequence was executed:

Test 1. Bulk Load of keys in Random Order (benchmark.sh bulkload)

NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 benchmark.sh bulkload

Measure performance to load 900 million keys into the database. The keys are inserted in random order. The database is empty at the beginning of this benchmark run and gradually fills up. No data is being read when the data load is in progress.

Test 2. Random Write (benchmark.sh overwrite)

NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite

Measure performance of randomly overwriting keys in the database. The database was first created by the previous benchmark.

Test 3. Multi-threaded read and single-threaded write (benchmark.sh readwhilewriting)

NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readwhilewriting

Measure performance of random reads running concurrently with ongoing updates to existing keys. The database from Test #2 was used as the starting point.

Test 4. Random Read (benchmark.sh randomread)

NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh randomread

Measure random read performance of a database.

The following shows results of these tests using various releases and parameters.

Scenario 1: RocksDB 6.10, Different Block Sizes

The test cases were executed with various block sizes. The Direct I/O (DIO) test was executed with an 8K block size. In the "RL" tests, a timed rate-limited operation was placed before the reported operation. For example, between the "bulkload" and "overwrite" operations, a 30-minute rate-limited overwrite (limited to 2MB/sec) was conducted. This timed operation helped ensure that any flush or other background operation completed before the reported operation, making the percentile performance numbers more predictable.

Test Case 1 : benchmark.sh bulkload

  • 8K: Complete bulkload in 4560 seconds
  • 4K: Complete bulkload in 5215 seconds
  • 16K: Complete bulkload in 3996 seconds
  • DIO: Complete bulkload in 4547 seconds
  • 8K RL: Complete bulkload in 4388 seconds

| Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
|-------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|--------|------------|--------|----------|
| 8K | 924468 | 370.3 | 0.2 | 157.1 | 157.1 | 1.0 | 167.5 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1119 | 960 | 00:03:45.193 | 23.5 | 101411592 |
| 4K | 853217 | 341.8 | 0.2 | 165.3 | 165.3 | 1.0 | 165.9 | 1.2 | 0.5 | 0.8 | 2 | 4 | 1159 | 1020 | 00:04:41.465 | 27.6 | 108748512 |
| 16K | 1027567 | 411.6 | 0.1 | 149.0 | 149.0 | 1.0 | 181.6 | 1.0 | 0.5 | 0.8 | 2 | 3 | 1021 | 840 | 00:02:23.600 | 17.1 | 99070240 |
| DIO | 921342 | 369.0 | 0.2 | 156.6 | 156.6 | 1.0 | 167.0 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1104 | 960 | 00:03:27.280 | 21.6 | 101412440 |
| 8K RL | 989786 | 396.5 | 0.2 | 159.4 | 159.4 | 1.0 | 179.5 | 1.0 | 0.5 | 0.8 | 2 | 4 | 1043 | 909 | 00:02:41.514 | 17.8 | 101406496 |
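The 4560-second total for the 8K bulkload is far longer than the 960-second Uptime in the table: the table covers only the load phase, with the remainder presumably going to the compaction that bulkload performs after loading. A quick check that the load phase alone accounts for roughly the reported throughput:

```shell
# 900M keys over the ~960-second load phase (table Uptime, rounded) is
# close to the reported 924468 ops/sec for the 8K bulkload row.
awk 'BEGIN { printf "%d ops/sec\n", 900000000 / 960 }'
```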

Test Case 2 : benchmark.sh overwrite

| Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
|-------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|--------|------------|--------|----------|
| 8K | 85756 | 34.3 | 0.1 | 161.4 | 739.9 | 4.5 | 142.2 | 373.1 | 9.7 | 274.1 | 5613 | 25620 | 47726 | | 00:20:18.388 | 22.9 | 159903832 |
| 4K | 79856 | 32.0 | 0.2 | 166.0 | 716.9 | 4.3 | 136.3 | 400.7 | 9.7 | 268.9 | 5914 | 25394 | 47296 | | 00:25:37.183 | 28.5 | 168094916 |
| 16K | 93678 | 37.5 | 0.1 | 174.4 | 825.0 | 4.7 | 156.8 | 341.6 | 9.4 | 279.2 | 4453 | 24796 | 47038 | | 00:16:24.878 | 18.3 | 155953232 |
| DIO | 85655 | 34.3 | 0.1 | 163.9 | 734.9 | 4.4 | 140.7 | 373.6 | 9.7 | 263.1 | 6250 | 25807 | 47678 | | 00:18:51.145 | 21.2 | 159470752 |
| 8K RL | 85542 | 34.3 | 0.1 | 161.2 | 757.8 | 4.7 | 143.6 | 748.1 | 340.5 | 735.8 | 11852 | 30851 | 59137 | 5401 | 00:08:18.359 | 9.2 | |
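The W-Amp column appears to be Sum_GB divided by L0_GB, i.e. total bytes written by flushes plus compactions over bytes flushed into L0; truncating the quotient to one decimal place reproduces the table values:

```shell
# W-Amp sanity check against the 8K and 16K overwrite rows above
# (739.9/161.4 and 825.0/174.4, truncated to one decimal place).
awk 'BEGIN {
  printf "8K:  %.1f\n", int(739.9 / 161.4 * 10) / 10;
  printf "16K: %.1f\n", int(825.0 / 174.4 * 10) / 10;
}'
```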

Test Case 3 : benchmark.sh readwhilewriting

| Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
|-------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|--------|------------|--------|----------|
| 8K | 89285 | 28.0 | 0.1 | 4.2 | 199.6 | 47.5 | 37.9 | 358.4 | 281.1 | 427.9 | 2935 | 7587 | 19029 | | 00:13:7.325 | 14.6 | 139287936 |
| 4K | 116759 | 36.2 | 0.1 | 3.6 | 203.8 | 56.6 | 38.9 | 274.1 | 224.4 | 328.0 | 2534 | 6131 | 13678 | | 00:20:58.789 | 23.5 | 147504716 |
| 16K | 64393 | 20.4 | 0.1 | 4.1 | 194.0 | 47.3 | 36.8 | 496.9 | 402.3 | 642.7 | 3488 | 7251 | 8880 | | 00:10:58.906 | 12.2 | 138132068 |
| DIO | 98698 | 30.9 | 0.1 | 3.9 | 197.4 | 50.6 | 37.6 | 324.2 | 257.7 | 353.7 | 2764 | 6583 | 13742 | | 00:16:47.979 | 18.8 | 139319040 |
| 8K RL | 101598 | 31.9 | 0.1 | 3.2 | 97.2 | 30.3 | 18.4 | 629.9 | 587.5 | 805.9 | 3922 | 6881 | 19699 | 5402 | 00:00:0.054 | 0.0 | |

Test Case 4 : benchmark.sh randomread

| Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | du -s -k |
|-------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|--------|----------|
| 8K | 101647 | 32.0 | 0.1 | 0.0 | 3.9 | 0 | 0.7 | 314.8 | 410.7 | 498.8 | 761 | 1247 | 3092 | | 139119060 |
| 4K | 130846 | 40.7 | 0.1 | 0.0 | 1.0 | 0 | 0.1 | 244.6 | 291.7 | 347.5 | 663 | 865 | 2626 | | 147417776 |
| 16K | 70884 | 22.6 | 0.1 | 0.0 | 1.3 | 0 | 0.2 | 451.4 | 547.5 | 715.0 | 1039 | 1397 | 2598 | | 138040824 |
| DIO | 144737 | 45.5 | 0.1 | 0.1 | 0.7 | 7.0 | 0.1 | 221.1 | 239.8 | 320.9 | 578 | 866 | 2133 | | 139239620 |
| 8K RL | 105790 | 33.4 | 0.1 | 0.0 | 0.0 | 0 | | 605.0 | 683.0 | 807.9 | 1579 | 3133 | 6152 | 5403 | 139681920 |

Scenario 2: RocksDB 6.10, 2K Value size, 100M Keys.

The test cases were executed with the default block size and a value size of 2K. Only 100M keys were written to the database. The bulkload completed in 2018 seconds.
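Assuming the 2K value size means 2048 bytes, 100M keys amount to roughly 205 GB of raw value data before compression, about twice the ~110 GB that "du -s -k" reports on disk after the overwrite run:

```shell
# 100M keys x 2 KiB values, expressed in GB (raw, before compression).
awk 'BEGIN { printf "%.1f GB\n", 100000000 * 2048 / 1e9 }'
```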

| Test | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
|------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|--------|------------|--------|----------|
| bulkload | 272448 | 537.3 | 0.1 | 85.3 | 85.3 | 1.0 | 242.6 | 3.7 | 0.7 | 1.1 | 3 | 1105 | 1285 | 360 | 00:03:52.679 | 64.6 | 57285876 |
| overwrite | 22940 | 45.2 | 0.1 | 229.3 | 879.4 | 3.8 | 169.0 | 1394.9 | 212.9 | 350.7 | 7603 | 26151 | 160352 | 5328 | 01:06:21.977 | 74.7 | 110458852 |
| readwhilewriting | 87093 | 154.2 | 0.1 | 5.4 | 162.6 | 30.1 | 31.0 | 367.4 | 369.2 | 491.9 | 2209 | 6302 | 13544 | 5360 | 00:00:1.160 | 0.0 | 92081776 |
| readrandom | 95666 | 169.9 | 0.1 | 0.0 | 0.0 | 0 | 0 | 334.5 | 411.1 | 498.7 | 742 | 1214 | 2789 | 5358 | 00:00:0.000 | 0.0 | 92092164 |

Scenario 3: Different Versions of RocksDB

These tests were executed against different versions of RocksDB, by checking out the corresponding branch and doing a "make release".

Test Case 1 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 benchmark.sh bulkload

  • 6.10.0: Complete bulkload in 4560 seconds
  • 6.3.6: Complete bulkload in 4584 seconds
  • 6.0.2: Complete bulkload in 4668 seconds

| Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
|---------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|--------|------------|--------|----------|
| 6.10.0 | 924468 | 370.3 | 0.2 | 157.1 | 157.1 | 1.0 | 167.5 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1119 | 960 | 00:03:45.193 | 23.5 | 101411592 |
| 6.3.6 | 921714 | 369.2 | 0.2 | 156.7 | 156.7 | 1.0 | 167.1 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1133 | 960 | 00:04:2.070 | 25.2 | 101437836 |
| 6.0.2 | 933665 | 374.0 | 0.2 | 158.7 | 158.7 | 1.0 | 169.2 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1105 | 960 | 00:03:31.627 | 22.0 | 101434096 |

Test Case 2 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite

| Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
|---------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|------------|--------|----------|
| 6.10.0 | 85756 | 34.3 | 0.1 | 161.4 | 739.9 | 4.5 | 142.2 | 373.1 | 9.7 | 274.1 | 5613 | 25620 | 47726 | 00:20:18.388 | 22.9 | 159903832 |
| 6.3.6 | 92328 | 37.0 | 0.2 | 174.0 | 818.4 | 4.7 | 155.4 | 346.6 | 8.9 | 263.8 | 4432 | 24581 | 46753 | 00:20:24.697 | 22.7 | 162288400 |
| 6.0.2 | 86767 | 34.8 | 0.2 | 164.8 | 740.4 | 4.4 | 141.4 | 368.8 | 9.8 | 294.7 | 5900 | 25623 | 47755 | 00:17:6.887 | 19.2 | 162797372 |

Test Case 3 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readwhilewriting

| Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
|---------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|------------|--------|----------|
| 6.10.0 | 89285 | 28.0 | 0.1 | 4.2 | 199.6 | 47.5 | 37.9 | 358.4 | 281.1 | 427.9 | 2935 | 7587 | 19029 | 00:13:7.325 | 14.6 | 139287936 |
| 6.3.6 | 90189 | 28.6 | 0.1 | 4.1 | 213.1 | 51.9 | 40.6 | 354.8 | 288.1 | 430.2 | 2781 | 6357 | 15268 | 00:13:58.835 | 15.6 | 141082740 |
| 6.0.2 | 90140 | 28.3 | 0.1 | 4.1 | 209.8 | 51.1 | 39.9 | 355.0 | 290.1 | 445.1 | 2789 | 6354 | 15951 | 00:12:13.384 | 13.6 | 139700676 |

Test Case 4 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readrandom

| Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | du -s -k |
|---------|---------|--------|---------|-------|--------|-------|--------|---------|-----|-----|-----|-------|--------|----------|
| 6.10.0 | 101647 | 32.0 | 0.1 | 0.0 | 3.9 | 0 | 0.7 | 314.8 | 410.7 | 498.8 | 761 | 1247 | 3092 | 139119060 |
| 6.3.6 | 100168 | 31.8 | 0.1 | 0.0 | 0.9 | 0 | 0.1 | 319.5 | 411.3 | 499.2 | 769 | 1248 | 2787 | 140911608 |
| 6.0.2 | 101023 | 31.8 | 0.1 | 0.0 | 6.0 | 0 | 1.1 | 316.8 | 412.5 | 499.7 | 763 | 1239 | 3900 | 139423196 |

Appendix

fio test results

```
$ fio --randrepeat=1 --ioengine=sync --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread --numjobs=32 --group_reporting
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=64
...
fio-2.14
Starting 32 processes
Jobs: 3 (f=3): [_(3),r(1),_(1),E(1),_(10),r(1),_(13),r(1),E(1)] [100.0% done] [445.3MB/0KB/0KB /s] [114K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=32): err= 0: pid=28042: Fri Jul 24 01:36:19 2020
  read : io=131072MB, bw=469326KB/s, iops=117331, runt=285980msec
  cpu          : usr=1.29%, sys=3.26%, ctx=33585114, majf=0, minf=297
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=33554432/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=131072MB, aggrb=469325KB/s, minb=469325KB/s, maxb=469325KB/s, mint=285980msec, maxt=285980msec

Disk stats (read/write):
  nvme1n1: ios=33654742/61713, merge=0/40, ticks=8723764/89064, in_queue=8788592, util=100.00%

$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.14
Starting 1 process
Jobs: 1 (f=1): [r(1)] [100.0% done] [456.3MB/0KB/0KB /s] [117K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=28385: Fri Jul 24 01:36:56 2020
  read : io=4096.0MB, bw=547416KB/s, iops=136854, runt=  7662msec
  cpu          : usr=22.20%, sys=48.81%, ctx=144112, majf=0, minf=73
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=1048576/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=547416KB/s, minb=547416KB/s, maxb=547416KB/s, mint=7662msec, maxt=7662msec

Disk stats (read/write):
  nvme1n1: ios=1050868/1904, merge=0/1, ticks=374836/2900, in_queue=370532, util=98.70%
```
