Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Benchmark transaction iteration #289

Merged
merged 6 commits into from
Mar 10, 2022

Conversation

AndrewSisley
Copy link
Contributor

Closes #231

Adds benchmarks for transaction iteration, also adds benchmarks for txn.Get (vs ds.Get) and changes the Get benchmarks to retrieve keys in order (to allow for fair comparison with Iterator).

Also found that the iterator code was no longer being called at all, so I fixed that along with some other tweaks to the prod code.

Benchmarks suggest using Posting Lists for unique secondary indexes is worth it (performance converges at a large number of keys - larger than a transaction permits on a single insert):

goos: linux
goarch: amd64
pkg: github.com/sourcenetwork/defradb/bench/storage
cpu: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0064-8              1980970           602.9 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0128-8              1895881           638.9 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0256-8              1707104           684.3 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:0512-8              1490356           807.8 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_1_1/ValueSize:1024-8              1000000          1030 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0064-8               983287          1153 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0128-8               927733          1318 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0256-8               826567          1431 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:0512-8               714462          1636 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_2_2/ValueSize:1024-8               545860          2110 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0064-8             193968          6317 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0128-8             182624          6545 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0256-8             160935          7182 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:0512-8             139230          8406 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_10_10/ValueSize:1024-8             109748         10825 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0064-8            16831         69352 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0128-8            16522         71230 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0256-8            15152         78350 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:0512-8            13263         92387 ns/op
Benchmark_Storage_Simple_Txn_Read_Sync_100_100/ValueSize:1024-8            10000        117344 ns/op
PASS
goos: linux
goarch: amd64
pkg: github.com/sourcenetwork/defradb/bench/storage
cpu: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0064-8             109674          9963 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0128-8             120912         10114 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0256-8             115107          9992 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:0512-8             116306         10985 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_1_1_1/ValueSize:1024-8             111943         10735 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0064-8              69482         18309 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0128-8              68138         17787 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0256-8              67068         17513 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:0512-8              65024         17996 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_2_1_2/ValueSize:1024-8              57338         18693 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0064-8            14469         82488 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0128-8            14305         84223 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0256-8            13910         84915 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:0512-8            13947         87340 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_10_1_10/ValueSize:1024-8            13196         92424 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0064-8           1434        828690 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0128-8           1358        832671 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0256-8           1339        855471 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:0512-8           1368        883331 ns/op
Benchmark_Storage_Simple_Txn_Iterator_Sync_100_1_100/ValueSize:1024-8           1261        945361 ns/op
PASS

@AndrewSisley AndrewSisley added area/datastore Related to the datastore / storage engine system area/testing Related to any test or testing suite labels Mar 9, 2022
@AndrewSisley AndrewSisley self-assigned this Mar 9, 2022
@codecov
Copy link

codecov bot commented Mar 9, 2022

Codecov Report

Merging #289 (9dc7e0a) into develop (c13c31a) will increase coverage by 0.93%.
The diff coverage is 78.94%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #289      +/-   ##
===========================================
+ Coverage    62.37%   63.31%   +0.93%     
===========================================
  Files           84       84              
  Lines         9245     9249       +4     
===========================================
+ Hits          5767     5856      +89     
+ Misses        2871     2776      -95     
- Partials       607      617      +10     
Impacted Files Coverage Δ
query/graphql/planner/planner.go 71.86% <0.00%> (-0.94%) ⬇️
store/txn.go 65.38% <92.30%> (+3.95%) ⬆️
datastores/badger/v3/iterator.go 51.57% <100.00%> (+51.57%) ⬆️

@AndrewSisley AndrewSisley force-pushed the sisley/test/I231-sec-key-bench branch 2 times, most recently from 347f81e to 92bfd26 Compare March 9, 2022 19:29
Copy link
Member

@jsimnz jsimnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had some questions here and there.

One thing to note. Not yet, but in the future (once the full benchmark comparison flow is in from shahzad) we will likely limit making changes to both code and benchmarks in the same PR, to avoid relative performance drift.

datastores/badger/v3/iterator.go Outdated Show resolved Hide resolved
datastores/badger/v3/iterator.go Show resolved Hide resolved
bench/storage/utils.go Show resolved Hide resolved
return keys, txn.Commit(ctx)
}

func getSampledIndex(populationSize int, sampleSize int, i int) int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats the goal of this sample index func, instead of the original random selector? From what I can tell, its just splitting up the keyspace into sample size chunks, then randomly selecting from that chunk.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. Big difference is that this keeps the keys ordered (as well as uniformly distributed), which the iterator relies on. Is debatable whether the sort should be included in the iterator benches (and omitted from the Get) - I chose to leave it out and do it this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot that the keys were generated in a random order - I have chosen to standardize this (and sort the keys), although the iterator actually handled it well (just a performance hit, I thought it would fail).

Code was incorrectly shimming stores that supported iteration
Is a misreading of the original ipfs query code
@AndrewSisley AndrewSisley force-pushed the sisley/test/I231-sec-key-bench branch from 92bfd26 to 6db3a4d Compare March 10, 2022 17:07
@AndrewSisley AndrewSisley force-pushed the sisley/test/I231-sec-key-bench branch from 6db3a4d to 9dc7e0a Compare March 10, 2022 17:24
@AndrewSisley AndrewSisley requested a review from jsimnz March 10, 2022 17:24
Copy link
Member

@jsimnz jsimnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

datastores/badger/v3/iterator.go Show resolved Hide resolved
@AndrewSisley
Copy link
Contributor Author

One thing to note. Not yet, but in the future (once the full benchmark comparison flow is in from shahzad) we will likely limit making changes to both code and benchmarks in the same PR, to avoid relative performance drift.

Sorry - I forgot about this here, I should have split it.

@AndrewSisley AndrewSisley merged commit 8fe9ab5 into develop Mar 10, 2022
@AndrewSisley AndrewSisley deleted the sisley/test/I231-sec-key-bench branch March 10, 2022 20:07
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
* Use iterable txn when possible

Code was incorrectly shimming stores that supported iteration

* Remove prefix suffix

Is a misreading of the original ipfs query code

* Close plan when no results are found

* Validate keys that match end prefix

* Get keys in order

* Add txn get and iterator benches
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datastore Related to the datastore / storage engine system area/testing Related to any test or testing suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Evaluate probable performance differences between secondary index key structure
2 participants