storage: optimize MVCCGarbageCollect for large numbers of versions #51184
Merged: craig merged 3 commits into cockroachdb:master from ajwerner:ajwerner/optimize-MVCCGarbageCollect on Jul 13, 2020
Commits on Jul 13, 2020
storage: unify BenchmarkGarbageCollect between engines
This commit unifies `BenchmarkGarbageCollect` in the style of `BenchmarkExportToSst` and moves both from engine-specific files to bench_test.go.

Release note: None
storage: augment BenchmarkGarbageCollect to different deletion ratios
Before this change, BenchmarkGarbageCollect always benchmarked collecting all but the last version of a key. This change expands the test to benchmark different numbers of deleted versions to highlight that the current implementation is linear in the number of versions.

Release note: None
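As a rough illustration of what varying the deletion ratio means, the sketch below enumerates a parameter grid whose names match the `numKeys=…/numVersions=…/deleteVersions=…` suffixes in the benchmark results reported later in this pull request. The `benchCase` type, the `grid` function, and the rule that skips cases where `deleteVersions >= numVersions` are assumptions inferred from those names, not the actual code in bench_test.go.

```go
package main

import "fmt"

// benchCase is a hypothetical parameter combination; the field names
// mirror the sub-benchmark names that appear in the reported results.
type benchCase struct {
	numKeys, numVersions, deleteVersions int
}

// name renders the case in the slash-separated style of Go sub-benchmarks.
func (c benchCase) name() string {
	return fmt.Sprintf("numKeys=%d/numVersions=%d/deleteVersions=%d",
		c.numKeys, c.numVersions, c.deleteVersions)
}

// grid enumerates every combination in which at least one version of
// each key survives (deleteVersions < numVersions). This filter is an
// assumption inferred from the benchmark names, not a confirmed rule.
func grid(numKeys, numVersions, deleteVersions []int) []benchCase {
	var out []benchCase
	for _, k := range numKeys {
		for _, v := range numVersions {
			for _, d := range deleteVersions {
				if d >= v {
					continue // would delete every version, so skip it
				}
				out = append(out, benchCase{k, v, d})
			}
		}
	}
	return out
}

func main() {
	cases := grid([]int{1, 1024}, []int{2, 1024}, []int{1, 16, 32, 512, 1015, 1023})
	for _, c := range cases {
		fmt.Println(c.name())
	}
}
```

With these inputs the grid yields 14 cases, which matches the number of rows reported per engine.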
storage: optimize MVCCGarbageCollect
Prior to this change, MVCCGarbageCollect performed a linear scan of all versions of a key, not just the versions being garbage collected. Because the caller paginates the deletion of versions, this linear behavior can make GC quadratic in the number of versions when that number vastly exceeds the page size.

The benchmark results demonstrate the change's effectiveness. It's worth noting that for a single key with a single version, the change has a negative performance impact. I suspect this is due to the allocation of a key in order to construct the iterator. In cases involving more keys, I theorize the improvement comes from sorting the keys, which means the iterator never has to seek backwards. It's also worth noting that since 20.1, the GC queue has been sending keys in the GC request in reverse order. I anticipate that this sorting is likely a good thing in that case too.

The stepping optimization seemed important in the microbenchmarks for cases where most of the data was garbage. Without it, the change had a small negative impact on performance.
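The quadratic argument above can be made concrete with a toy cost model. The sketch below is not the real `MVCCGarbageCollect`: versions are plain integers stored newest first, `sort.Search` stands in for an engine seek, and the step budget `maxNexts`, the `cost` type, and the oldest-first pagination in `paginate` are all assumptions made for illustration. It shows one plausible reading of "step a few times, then seek".

```go
package main

import (
	"fmt"
	"sort"
)

// cost tallies simulated iterator operations.
type cost struct{ nexts, seeks int }

// gcFunc removes every version with timestamp <= threshold from a
// newest-first slice and reports the survivors and the cost incurred.
type gcFunc func(versions []int, threshold int) ([]int, cost)

// scanAll models the old behavior: visit every version of the key.
func scanAll(versions []int, threshold int) ([]int, cost) {
	var c cost
	keep := 0
	for _, ts := range versions {
		c.nexts++
		if ts > threshold {
			keep++
		}
	}
	return versions[:keep], c
}

// seekFirst models the new behavior: step a few times, fall back to a
// seek if the garbage is not nearby, then visit only the garbage.
func seekFirst(versions []int, threshold int) ([]int, cost) {
	const maxNexts = 4 // hypothetical step budget, not taken from the source
	var c cost
	i := 0
	for i < len(versions) && i < maxNexts && versions[i] > threshold {
		i++
		c.nexts++
	}
	if i < len(versions) && versions[i] > threshold {
		// Binary search stands in for an engine-level seek.
		i = sort.Search(len(versions), func(j int) bool { return versions[j] <= threshold })
		c.seeks++
	}
	c.nexts += len(versions) - i // visit only the versions being deleted
	return versions[:i], c
}

// paginate collects all but the newest version, pageSize versions per
// call and oldest first. pageSize must be positive.
func paginate(numVersions, pageSize int, gc gcFunc) cost {
	versions := make([]int, numVersions)
	for i := range versions {
		versions[i] = numVersions - i // timestamps numVersions..1, newest first
	}
	var total cost
	for threshold := pageSize; len(versions) > 1; threshold += pageSize {
		if threshold >= numVersions {
			threshold = numVersions - 1 // never collect the newest version
		}
		var c cost
		versions, c = gc(versions, threshold)
		total.nexts += c.nexts
		total.seeks += c.seeks
	}
	return total
}

func main() {
	fmt.Printf("scan all:   %+v\n", paginate(1024, 16, scanAll))
	fmt.Printf("seek first: %+v\n", paginate(1024, 16, seekFirst))
}
```

In this model, 1024 versions with a page size of 16 cost 33,280 steps for the full scan versus 1,276 steps plus 63 seeks for the seek-first variant. The counts say nothing about real engine timings; they only illustrate why the work per page stops depending on the total number of versions.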
```
name  old time/op  new time/op  delta
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=2/deleteVersions=1-24  3.39µs ± 1%  3.96µs ± 0%  +16.99%  (p=0.004 n=6+5)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=1-24  319µs ± 3%  10µs ±12%  -96.88%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=16-24  319µs ± 2%  16µs ±10%  -94.95%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=32-24  319µs ± 3%  21µs ± 5%  -93.52%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=512-24  337µs ± 1%  182µs ± 3%  -46.00%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=1015-24  361µs ± 0%  353µs ± 2%  -2.32%  (p=0.010 n=4+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=1023-24  361µs ± 3%  350µs ± 2%  -3.14%  (p=0.009 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=2/deleteVersions=1-24  2.00ms ± 3%  2.25ms ± 2%  +12.53%  (p=0.004 n=6+5)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=1-24  388ms ± 3%  16ms ± 5%  -95.76%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=16-24  387ms ± 1%  27ms ± 3%  -93.14%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=32-24  393ms ± 5%  35ms ± 4%  -91.09%  (p=0.002 n=6+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=512-24  463ms ± 4%  276ms ± 3%  -40.43%  (p=0.004 n=5+6)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=1015-24  539ms ± 5%  514ms ± 3%  -4.64%  (p=0.016 n=5+5)
MVCCGarbageCollect/rocksdb/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=1023-24  533ms ± 4%  514ms ± 1%  ~  (p=0.093 n=6+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=2/deleteVersions=1-24  1.97µs ± 3%  2.29µs ± 2%  +16.58%  (p=0.002 n=6+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=1-24  139µs ± 1%  5µs ± 6%  -96.40%  (p=0.004 n=5+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=16-24  140µs ± 1%  8µs ± 1%  -94.13%  (p=0.004 n=6+5)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=32-24  143µs ± 4%  11µs ± 2%  -92.03%  (p=0.002 n=6+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=512-24  178µs ± 9%  109µs ± 1%  -38.75%  (p=0.004 n=6+5)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=1015-24  201µs ± 1%  213µs ± 1%  +5.80%  (p=0.008 n=5+5)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1/numVersions=1024/deleteVersions=1023-24  205µs ±11%  215µs ± 6%  ~  (p=0.126 n=5+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1024/numVersions=2/deleteVersions=1-24  1.43ms ± 1%  1.34ms ± 1%  -5.82%  (p=0.004 n=6+5)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=1-24  218ms ± 9%  9ms ± 2%  -96.00%  (p=0.002 n=6+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=16-24  216ms ± 3%  15ms ± 2%  -93.19%  (p=0.004 n=5+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=32-24  219ms ± 4%  20ms ± 5%  -90.77%  (p=0.004 n=5+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=512-24  303ms ± 4%  199ms ± 4%  -34.47%  (p=0.004 n=5+6)
MVCCGarbageCollect/pebble/keySize=128/valSize=128/numKeys=1024/numVersions=1024/deleteVersions=1015-24  382ms ±16%  363ms ± 8%  ~  (p=0.485 n=6+6)
ajwerner@gceworker-ajwerner:~/go/src/github.com/cockroachdb/cockroach$ %ns=1024/deleteVersions=1023-24  363ms ± 4%  354ms ± 4%  ~  (p=0.222 n=5+5)
```

Release note (performance improvement): Improved the efficiency of garbage collection when there are a large number of versions of a single key, commonly found when utilizing sequences.
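The commit message attributes part of the multi-key improvement to sorting the requested keys so that the iterator never moves backwards. A minimal sketch of that idea follows, assuming a hypothetical `gcKey` type and counting a "backward move" whenever the next requested key sorts before the previous one. It does not reflect the actual request types in CockroachDB.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// gcKey is a hypothetical stand-in for one entry of a GC request.
type gcKey struct {
	key       []byte
	threshold int64 // delete versions at or below this timestamp
}

// backwardMoves counts how often an iterator processing the keys in the
// given order would have to move backwards to reach the next key.
func backwardMoves(keys []gcKey) int {
	n := 0
	for i := 1; i < len(keys); i++ {
		if bytes.Compare(keys[i].key, keys[i-1].key) < 0 {
			n++
		}
	}
	return n
}

// sortedCopy returns the keys in ascending key order without mutating
// the caller's slice.
func sortedCopy(keys []gcKey) []gcKey {
	out := append([]gcKey(nil), keys...)
	sort.Slice(out, func(i, j int) bool {
		return bytes.Compare(out[i].key, out[j].key) < 0
	})
	return out
}

func main() {
	// Keys in reverse order, as the commit message says the GC queue
	// has sent them since 20.1.
	req := []gcKey{{[]byte("c"), 30}, {[]byte("b"), 20}, {[]byte("a"), 10}}
	fmt.Println(backwardMoves(req), backwardMoves(sortedCopy(req))) // prints "2 0"
}
```

Sorting a copy keeps the caller's request untouched; whether the real implementation sorts in place or copies is not stated in the source.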