test: Benchmark transaction iteration #289

AndrewSisley · 2022-03-09T16:33:56Z

Closes #231

Adds benchmarks for transaction iteration, also adds benchmarks for txn.Get (vs ds.Get) and changes the Get benchmarks to retrieve keys in order (to allow for fair comparison with Iterator).

Also found that the iterator code was no longer being called at all, so I fixed that along with some other tweaks to the prod code.

Benchmarks suggest using Posting Lists for unique secondary indexes is worth it (performance converges at a large number of keys - larger than a transaction permits on a single insert):

goos: linux
goarch: amd64
pkg: github.com/sourcenetwork/defradb/bench/storage
cpu: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0064-8              1980970           602.9 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0128-8              1895881           638.9 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0256-8              1707104           684.3 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0512-8              1490356           807.8 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:1024-8              1000000          1030 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0064-8               983287          1153 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0128-8               927733          1318 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0256-8               826567          1431 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0512-8               714462          1636 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:1024-8               545860          2110 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0064-8             193968          6317 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0128-8             182624          6545 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0256-8             160935          7182 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0512-8             139230          8406 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:1024-8             109748         10825 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0064-8            16831         69352 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0128-8            16522         71230 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0256-8            15152         78350 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0512-8            13263         92387 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:1024-8            10000        117344 ns/op
PASS

goos: linux
goarch: amd64
pkg: github.com/sourcenetwork/defradb/bench/storage
cpu: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0064-8             109674          9963 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0128-8             120912         10114 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0256-8             115107          9992 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0512-8             116306         10985 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:1024-8             111943         10735 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0064-8              69482         18309 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0128-8              68138         17787 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0256-8              67068         17513 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0512-8              65024         17996 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:1024-8              57338         18693 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0064-8            14469         82488 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0128-8            14305         84223 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0256-8            13910         84915 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0512-8            13947         87340 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:1024-8            13196         92424 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0064-8           1434        828690 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0128-8           1358        832671 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0256-8           1339        855471 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0512-8           1368        883331 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:1024-8           1261        945361 ns/op
PASS

codecov · 2022-03-09T16:35:00Z

Codecov Report

Merging #289 (9dc7e0a) into develop (c13c31a) will increase coverage by 0.93%.
The diff coverage is 78.94%.

@@             Coverage Diff             @@
##           develop     #289      +/-   ##
===========================================
+ Coverage    62.37%   63.31%   +0.93%     
===========================================
  Files           84       84              
  Lines         9245     9249       +4     
===========================================
+ Hits          5767     5856      +89     
+ Misses        2871     2776      -95     
- Partials       607      617      +10

Impacted Files	Coverage Δ
query/graphql/planner/planner.go	`71.86% <0.00%> (-0.94%)`	⬇️
store/txn.go	`65.38% <92.30%> (+3.95%)`	⬆️
datastores/badger/v3/iterator.go	`51.57% <100.00%> (+51.57%)`	⬆️

jsimnz

Had some questions here and there.

One thing to note. Not yet, but in the future (once the full benchmark comparison flow is in from shahzad) we will likely limit making changes to both code and benchmarks in the same PR, to avoid relative performance drift.

datastores/badger/v3/iterator.go

bench/storage/utils.go

jsimnz · 2022-03-09T22:54:48Z

bench/storage/utils.go

+	return keys, txn.Commit(ctx)
+}
+
+func getSampledIndex(populationSize int, sampleSize int, i int) int {


Whats the goal of this sample index func, instead of the original random selector? From what I can tell, its just splitting up the keyspace into sample size chunks, then randomly selecting from that chunk.

yes. Big difference is that this keeps the keys ordered (as well as uniformly distributed), which the iterator relies on. Is debatable whether the sort should be included in the iterator benches (and omitted from the Get) - I chose to leave it out and do it this way.

I forgot that the keys were generated in a random order - I have chosen to standardize this (and sort the keys), although the iterator actually handled it well (just a performance hit, I thought it would fail).

Code was incorrectly shimming stores that supported iteration

Is a misreading of the original ipfs query code

jsimnz

LGTM

datastores/badger/v3/iterator.go

AndrewSisley · 2022-03-10T20:07:42Z

One thing to note. Not yet, but in the future (once the full benchmark comparison flow is in from shahzad) we will likely limit making changes to both code and benchmarks in the same PR, to avoid relative performance drift.

Sorry - I forgot about this here, I should have split it.

* Use iterable txn when possible Code was incorrectly shimming stores that supported iteration * Remove prefix suffix Is a misreading of the original ipfs query code * Close plan when no results are found * Validate keys that match end prefix * Get keys in order * Add txn get and iterator benches

AndrewSisley added area/datastore Related to the datastore / storage engine system area/testing Related to any test or testing suite labels Mar 9, 2022

AndrewSisley requested review from orpheuslummis, jsimnz and shahzadlone March 9, 2022 16:33

AndrewSisley self-assigned this Mar 9, 2022

AndrewSisley force-pushed the sisley/test/I231-sec-key-bench branch 2 times, most recently from 347f81e to 92bfd26 Compare March 9, 2022 19:29

jsimnz reviewed Mar 9, 2022

View reviewed changes

AndrewSisley added 4 commits March 10, 2022 12:07

Use iterable txn when possible

20724c1

Code was incorrectly shimming stores that supported iteration

Remove prefix suffix

0376abc

Is a misreading of the original ipfs query code

Close plan when no results are found

18956af

Validate keys that match end prefix

e7f6932

AndrewSisley force-pushed the sisley/test/I231-sec-key-bench branch from 92bfd26 to 6db3a4d Compare March 10, 2022 17:07

AndrewSisley added 2 commits March 10, 2022 12:23

Get keys in order

d448ba6

Add txn get and iterator benches

9dc7e0a

AndrewSisley force-pushed the sisley/test/I231-sec-key-bench branch from 6db3a4d to 9dc7e0a Compare March 10, 2022 17:24

AndrewSisley requested a review from jsimnz March 10, 2022 17:24

jsimnz approved these changes Mar 10, 2022

View reviewed changes

datastores/badger/v3/iterator.go Show resolved Hide resolved

AndrewSisley merged commit 8fe9ab5 into develop Mar 10, 2022

AndrewSisley deleted the sisley/test/I231-sec-key-bench branch March 10, 2022 20:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: Benchmark transaction iteration #289

test: Benchmark transaction iteration #289

AndrewSisley commented Mar 9, 2022

codecov bot commented Mar 9, 2022 •

edited

Loading

jsimnz left a comment

jsimnz Mar 9, 2022

AndrewSisley Mar 10, 2022 •

edited

Loading

AndrewSisley Mar 10, 2022

jsimnz left a comment

AndrewSisley commented Mar 10, 2022

test: Benchmark transaction iteration #289

test: Benchmark transaction iteration #289

Conversation

AndrewSisley commented Mar 9, 2022

codecov bot commented Mar 9, 2022 • edited Loading

Codecov Report

jsimnz left a comment

Choose a reason for hiding this comment

jsimnz Mar 9, 2022

Choose a reason for hiding this comment

AndrewSisley Mar 10, 2022 • edited Loading

Choose a reason for hiding this comment

AndrewSisley Mar 10, 2022

Choose a reason for hiding this comment

jsimnz left a comment

Choose a reason for hiding this comment

AndrewSisley commented Mar 10, 2022

codecov bot commented Mar 9, 2022 •

edited

Loading

AndrewSisley Mar 10, 2022 •

edited

Loading