-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: Sample some first iter reads based on size read #1107
Conversation
Doesn't seem to have affected the Cockroach benchmarks, which is mostly good as the previous change had a positive benefit here:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for rerunning all the experiments.
Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @itsbilal and @jbowens)
iterator.go, line 44 at r1 (raw file):
) // Approximate gap in bytes between samples of data read during iteration.
Could you add a similar comment here about the default being 1MB, and mention that the 1MB number originates in LevelDB and add a pointer to #29 (comment)
PR cockroachdb#1098 stopped the first iteration of the read sampling loop in a newly-created iterator from sampling. While this had the desired effect on Cockroach's benchmarks, this had a pretty significant regression on Pebble's own YCSB benchamrks, as a lot of them have very short-lived iterators. This change tries to find a middle ground by sampling some of the first reads, based on how many bytes were read. H/t to Sumeer for coming up with this idea. Another change is to move `1 << 4` off of the readBytesPeriod onto the option, allowing us to control read sampling in both directions by tweaking the option. Previously, read sampling could only be made less frequent or disabled, but not made more frequent. Pebble ycsb benchmarks (old = before this change): ``` name old ops/sec new ops/sec delta ycsb/A/values=1024 80.6k ± 1% 80.2k ± 2% ~ (p=0.421 n=5+5) ycsb/B/values=1024 393k ± 5% 383k ±12% ~ (p=0.548 n=5+5) ycsb/C/values=1024 675k ± 5% 918k ±12% +35.87% (p=0.008 n=5+5) ycsb/D/values=1024 217k ± 4% 208k ±13% ~ (p=0.151 n=5+5) ycsb/E/values=1024 65.5k ± 4% 62.9k ±14% ~ (p=0.310 n=5+5) ycsb/F/values=1024 26.9k ± 2% 26.8k ± 2% ~ (p=0.841 n=5+5) ycsb/A/values=64 816k ± 6% 840k ± 5% ~ (p=0.310 n=5+5) ycsb/B/values=64 962k ± 0% 1138k ± 7% +18.25% (p=0.016 n=4+5) ycsb/C/values=64 1.38M ± 6% 2.17M ±27% +57.57% (p=0.008 n=5+5) ycsb/D/values=64 835k ± 0% 936k ± 7% +12.08% (p=0.016 n=4+5) ycsb/E/values=64 205k ± 1% 216k ± 5% +5.31% (p=0.032 n=4+5) ycsb/F/values=64 239k ± 1% 239k ± 1% ~ (p=0.421 n=5+5) name old read new read delta ycsb/A/values=1024 170G ± 0% 170G ± 1% ~ (p=0.421 n=5+5) ycsb/B/values=1024 118G ± 3% 119G ± 6% ~ (p=0.548 n=5+5) ycsb/C/values=1024 46.1G ± 5% 57.9G ± 4% +25.68% (p=0.008 n=5+5) ycsb/D/values=1024 131G ± 4% 129G ±11% ~ (p=0.690 n=5+5) ycsb/E/values=1024 64.7G ± 3% 64.2G ± 5% ~ (p=1.000 n=5+5) ycsb/F/values=1024 169G ± 1% 169G ± 0% ~ (p=1.000 n=5+5) ycsb/A/values=64 69.6G ± 0% 71.2G ± 4% ~ (p=0.730 n=4+5) ycsb/B/values=64 12.5G ± 1% 39.5G ± 5% +216.43% (p=0.016 n=4+5) ycsb/C/values=64 1.53G ±18% 2.82G ± 5% +84.57% (p=0.008 n=5+5) ycsb/D/values=64 28.6G ± 1% 39.8G ± 9% +39.37% (p=0.016 n=4+5) ycsb/E/values=64 15.8G ±13% 15.8G ± 8% ~ (p=0.841 n=5+5) ycsb/F/values=64 162G ± 1% 163G ± 1% ~ (p=0.690 n=5+5) name old write new write delta ycsb/A/values=1024 199G ± 0% 198G ± 1% ~ (p=0.310 n=5+5) ycsb/B/values=1024 133G ± 3% 134G ± 7% ~ (p=0.548 n=5+5) ycsb/C/values=1024 46.1G ± 5% 57.9G ± 4% +25.68% (p=0.008 n=5+5) ycsb/D/values=1024 145G ± 4% 142G ±11% ~ (p=0.690 n=5+5) ycsb/E/values=1024 68.9G ± 3% 68.2G ± 6% ~ (p=1.000 n=5+5) ycsb/F/values=1024 203G ± 1% 203G ± 1% ~ (p=1.000 n=5+5) ycsb/A/values=64 100G ± 0% 103G ± 4% ~ (p=0.190 n=4+5) ycsb/B/values=64 16.2G ± 1% 43.7G ± 4% +169.23% (p=0.016 n=4+5) ycsb/C/values=64 1.48G ±18% 2.73G ± 5% +84.28% (p=0.008 n=5+5) ycsb/D/values=64 33.7G ± 0% 45.6G ± 8% +35.19% (p=0.016 n=4+5) ycsb/E/values=64 17.0G ±13% 17.1G ± 7% ~ (p=0.690 n=5+5) ycsb/F/values=64 192G ± 1% 193G ± 1% ~ (p=0.690 n=5+5) name old r-amp new r-amp delta ycsb/A/values=1024 14.4 ± 5% 15.1 ±11% ~ (p=0.421 n=5+5) ycsb/B/values=1024 7.13 ± 1% 7.07 ± 2% ~ (p=0.841 n=5+5) ycsb/C/values=1024 4.32 ±13% 2.77 ± 3% -35.80% (p=0.008 n=5+5) ycsb/D/values=1024 8.81 ± 2% 9.01 ± 4% ~ (p=0.310 n=5+5) ycsb/E/values=1024 13.1 ± 3% 13.5 ± 7% ~ (p=0.690 n=5+5) ycsb/F/values=1024 0.00 0.00 ~ (all equal) ycsb/A/values=64 6.27 ± 3% 6.43 ± 2% ~ (p=0.230 n=5+5) ycsb/B/values=64 4.20 ± 0% 3.73 ± 1% -11.14% (p=0.008 n=5+5) ycsb/C/values=64 3.09 ± 1% 2.13 ± 1% -31.06% (p=0.000 n=5+4) ycsb/D/values=64 4.44 ± 1% 4.17 ± 1% -6.08% (p=0.008 n=5+5) ycsb/E/values=64 3.26 ± 4% 3.21 ± 2% ~ (p=0.135 n=5+5) ycsb/F/values=64 0.00 0.00 ~ (all equal) name old w-amp new w-amp delta ycsb/A/values=1024 7.58 ± 1% 7.57 ± 2% ~ (p=0.413 n=5+5) ycsb/B/values=1024 10.4 ± 1% 10.7 ± 6% ~ (p=0.198 n=5+5) ycsb/C/values=1024 0.00 0.00 ~ (all equal) ycsb/D/values=1024 20.6 ± 1% 21.0 ± 4% ~ (p=0.151 n=5+5) ycsb/E/values=1024 32.3 ± 3% 33.5 ± 9% ~ (p=0.421 n=5+5) ycsb/F/values=1024 11.6 ± 2% 11.7 ± 2% ~ (p=0.651 n=5+5) ycsb/A/values=64 3.34 ± 1% 3.29 ± 0% -1.57% (p=0.016 n=5+4) ycsb/B/values=64 4.57 ± 1% 10.41 ±10% +127.98% (p=0.008 n=5+5) ycsb/C/values=64 0.00 0.00 ~ (all equal) ycsb/D/values=64 10.9 ± 1% 13.1 ± 3% +20.40% (p=0.008 n=5+5) ycsb/E/values=64 21.9 ± 8% 21.4 ± 5% ~ (p=0.690 n=5+5) ycsb/F/values=64 10.9 ± 0% 10.9 ± 1% ~ (p=0.524 n=5+5) ```
ff54f93
to
760370f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! updated PR description as the first line got unexpectedly chopped off in the previous push.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)
iterator.go, line 44 at r1 (raw file):
Previously, sumeerbhola wrote…
Could you add a similar comment here about the default being 1MB, and mention that the 1MB number originates in LevelDB and add a pointer to #29 (comment)
Done.
PR #1098 stopped the first iteration of the read sampling
loop in a newly-created iterator from sampling. While
this had the desired effect on Cockroach's benchmarks,
this had a pretty significant regression on Pebble's own
YCSB benchamrks, as a lot of them have very short-lived
iterators.
This change tries to find a middle ground by sampling some
of the first reads, based on how many bytes were read. H/t
to Sumeer for coming up with this idea.
Another change is to move
1 << 4
off of the readBytesPeriodonto the option, allowing us to control read sampling in
both directions by tweaking the option. Previously, read
sampling could only be made less frequent or disabled,
but not made more frequent.
Pebble ycsb benchmarks (old = before this change):