-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
interval: add btree implementation of interval.Tree #8867
Conversation
e94edbe
to
5547e8c
Compare
@yaojingguo I'm not going to be able to take a look at this any time soon. @paperstreet, @dt: Can you lend a hand in reviewing this and deciding if the performance improvement is worth pursuing? |
5547e8c
to
8f50c11
Compare
The CircleCi build fails with such a message:
@knz's suggestion is to rebuild. But after many retries, the build still fails. I have locally run |
I just started another run while disabling the CI cache. Let's see what that gives. Also with a different PR I eventually got success by rebasing to a newer develop tip. |
Thanks @yaojingguo, this looks really exciting! Based on the gains we've seen in the past from optimizing these caches, these micro benchmark numbers seem worth pursuing. Before going further on that though, at a high level, I think I'd be interested in keeping the current LLBR Also, given the current stability focus, we probably want split that change up into two PRs: one to master (where any changes will need careful stability review) which just adds the common interface and the minimum changes for that, then the second, with the addition of the btree implementation, can go to develop, where we're more comfortable with non-stability-critical changes right now. Back to the numbers: I ran some of the SQL level benchmarks for 100 and 1000 row operations (
Looks like the new implementation is allocating somewhat more than the old one (which @nvanbenschoten had repeated combed over for any possible extra allocations) and I'm guessing the additional allocs explain the slightly lower performance. What's really promising though is that some of the alloc increases are much larger than the corresponding net time increase, making me optimistic that there are some real perf gains that are just hiding behind those allocs -- if we can track down the later, this could potentially speed those benchmarks up quite a bit. |
5547e8c
to
69032ef
Compare
c3e3f4c
to
efcdbb0
Compare
The new implementation does not use a Tree interface shared by LLRB implementation and B-tree implementation because benchmarks show that such an indirection will cause more memory allocations. Instead, it choose LLRB implementation or B-tree implementation based on tags given at build time. The implementation defaults to LLRB. When I have run the following benchmarks on a Micro BenchmarksThe benchmarks are defined in https://github.com/yaojingguo/benchmark-interval-tree. The following reports are produced by running The report shows that there is no clear winner for tree size 4. But for all bigger tree sizes, B-tree is better than LLRB. The optimizations that I have mentioned in the previous comment are not worth of a try since they both prefer large tree sizes instead of small tree sizes.
|
I'm going to need a few more passes over this, as well as some time looking at cpu and memory profiles to see if we can get any easy perf wins, but so far it's looking good. Thanks for the help @yaojingguo! Reviewed 1 of 9 files at r1, 8 of 11 files at r2. Makefile, line 103 at r2 (raw file):
I think @tamird would like this to be pulled into a separate commit. util/interval/btree_based_interval.go, line 31 at r2 (raw file):
util/interval/btree_based_interval.go, line 47 at r2 (raw file):
util/interval/btree_based_interval.go, line 54 at r2 (raw file):
s util/interval/btree_based_interval.go, line 58 at r2 (raw file):
I've usually seen this referred to as the "minimum degree" of the b-tree. util/interval/btree_based_interval.go, line 71 at r2 (raw file):
huh, this is slightly different than https://github.com/google/btree/blob/9cda4e30bb3bdd4f7e8ae79f795c0aeaf2b2efc3/btree.go#L127. Is theirs just incorrect? util/interval/btree_based_interval.go, line 102 at r2 (raw file):
I fear that this allocates (at least once) on each call... util/interval/btree_based_interval.go, line 117 at r2 (raw file):
Should this be the same as util/interval/btree_based_interval.go, line 151 at r2 (raw file):
long line. util/interval/btree_based_interval.go, line 152 at r2 (raw file):
CRLS? util/interval/btree_based_interval.go, line 155 at r2 (raw file):
I don't think I like calling the field or the type util/interval/btree_based_interval.go, line 179 at r2 (raw file):
The fast parameter should be mentioned in the comment. util/interval/btree_based_interval.go, line 182 at r2 (raw file):
It seems like we always have some variation of setting util/interval/btree_based_interval.go, line 232 at r2 (raw file):
It would be a lot easier to review this if the util/interval/btree_based_interval.go, line 392 at r2 (raw file):
lets pull this method above the two methods it calls. util/interval/btree_based_interval.go, line 437 at r2 (raw file):
Why was util/interval/btree_based_interval.go, line 440 at r2 (raw file):
long line, see style guidelines for function signatures. util/interval/btree_based_interval.go, line 558 at r2 (raw file):
long line util/interval/btree_based_interval_test.go, line 31 at r2 (raw file):
Lets improve this flag name. Maybe util/interval/btree_based_interval_test.go, line 50 at r2 (raw file):
Why not move these sorting methods next to the types themselves? util/interval/btree_based_interval_test.go, line 349 at r2 (raw file):
util/interval/btree_based_interval_test.go, line 491 at r2 (raw file):
Do we need other tests from https://github.com/google/btree/blob/9cda4e30bb3bdd4f7e8ae79f795c0aeaf2b2efc3/btree_test.go? util/interval/llrb_based_interval.go, line 5 at r2 (raw file):
This is all directly copied from Comments from Reviewable |
2b3ba21
to
17673b0
Compare
Makefile, line 103 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 31 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 47 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 54 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 58 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 71 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 102 at r2 (raw file):
|
util/interval/btree_based_interval.go, line 117 at r2 (raw file):
|
Addressed in my below comments.
I will open another PR for the benchmarks after this PR is merged.
Yes, it is not called in any performance critical sections.
I will make the rebase after this round review in order to make it easier for you to review my comments. Review status: 4 of 10 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. util/interval/btree_based_interval.go, line 102 at r2 (raw file):
|
@nvanbenschoten any chance to have a look at my comments? |
@nvanbenschoten still no time to review my comments? |
@nvanbenschoten is on vacation; reassigning to @dt |
looks like nathan's points were addressed? I'm happy to help with the rebase here -- I rebased this over my There's might be some interaction with 839fa7e to look at more closely, but as is, it at least seems to compile. I think we'll want to move away from build tags quickly, so both impls are compiled and tested regularly and thus can't rot, even if we stick with the LLBR as the default for now, until this gets stability testing in production. But that change can be a follow up -- getting the code landed in master is a first step to preventing any refactoring and further rot. Reviewed 3 of 9 files at r1, 2 of 11 files at r2, 1 of 5 files at r4. Comments from Reviewable |
@dt, Thanks for your review.
Yes. I have addressed all nathan's points. And I have checked The reason that the compilation succeeds is that I agree that we should go away with the build tags. If you think that this is the right way to go, I can work out a new commit addressing the following points:
Review status: 5 of 10 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. Comments from Reviewable |
@yaojingguo for the file relocations, if you want you to, you can grab the ref I rebased, b7f45b3, from the for the build tags, I poked a little around and I think we can pretty easily pull a common |
4cbeb43
to
b7f45b3
Compare
@dt I have proceeded based on your commit 7eebf50 from your
How can I fix this Review status: 0 of 11 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. Comments from Reviewable |
@yaojingguo please rebase your branch on a more recent master and push again - the failure above is not your fault and was caused by a regression that was fixed after your branch point. |
Removing the build tags means we're always building both tree impls instead of hiding one via tags. Currently NewTree is hardcoded to the current LLBR impl, but could be be made to switch on an env var
4e2d2ab
to
7afaa91
Compare
@jordanlewis. I have rebased my commits and Team City is happy now. Review status: 0 of 11 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. Comments from Reviewable |
1. Implement Iterator method for btree and add tests. 2. Restore Range.Equal method and fix the wrong logic of Range.Equal. The existing Equal method considers Range A and Range B to be equal if A's Start is equal to B's End and A's End is equal to B's Start. The correct logic is to consider Range A and Range B to be equal if A's Start is equal to B's Start and A's End is equal to B's End. See cockroachdb#16079. 3. Add complile time checks to ensure that btree and llrbTree implement the Tree interface. 4. s/llbrNode/llrbNode/g and s/llbrTree/llrbTree/g. 5. Move comment from llrbTree to Tree.
7afaa91
to
bd687eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it probably makes sense to go ahead and merge this: it passes tests, and, since it leaves NewTree
as returning the current LLRB tree implementation, it doesn't seem too scary.
With it merged, it'll be easier to run experiments with the implementation swapped, profile it, etc.
I haven't read this in detail, but I've skimmed it and it LGTM. |
This is really cool, @yaojingguo. Thanks for sticking with it! |
Nice! |
Fixes #6465
All the benchmarking is done on a Thinkpad 440s laptop with a dual-core 2.10GHz Intel Core i7-4600U processor and 8 GB DDR3 memory. The laptop is running Ubuntu 14.04. The benchmark code is on https://github.com/yaojingguo/benchmark-interval-tree.
Augmentation Approaches
I have tried two approaches to augment a B-tree. One is issue-6465-opt branch. The other is rightmosts-guided-bst branch. The difference between these two approaches is that the latter uses a rightmost interval end array to guide the binary search over
node.interfaces
. In the following benchstat report, the first column is forissue-6465-opt
branch. The second column is forrightmosts-guided-bst
branch.The maintenance of the rightmost interval end array makes
Insert
andDelete
methods slow. Its use also complicates the traversal of the interval tree. As a result, even the traversal is slower. If the array is updated incrementally instead of being rebuilt every time,Insert
andDelete
methods might run faster. But I can't figure out how to speed up the traversal. So I think that the latter is not a viable approach.Optimization
issue-6465 branch is a basic implementation of the former approach. It has a slower
Get
method compared to LLRB based interval tree. Profiling shows that the function call ofToLeft
slows it down. The removal ofToLeft
function call resultsissue-6465-opt
branch. Here is the benchstat report. The first column is forissue-6465
branch. The second column is forissue-6465-opt
branch:B-Tree Degree Benchmark
The following report is produced by benchmarking B-tree degrees:
The following diagram puts the above report into a chart.
Average
means the average of the above four methods. When degree is 32,Average
is minimum. If we assume that these four methods are used uniformly, degree 32 is a reasonable choice.Comparison with LLRB based Interval Tree
For all the reports in this section, the first column is for LLRB based interval tree. The second column is B-tree based interval tree.
The conclusion is that LLRB based interval tree is faster than B-tree based interval tree for small trees. But for large trees, B-tree based interval tree is faster, especially for
Delete
method.The following reports are produced with the use of random byte slice with a random length. Report for random byte slice with a length between 1 and 16:
Report for random byte slice with a length between 1 and 32:
Report for random byte slice with a length between 1 and 64:
Report for random byte slice with a length between 1 and 128:
Report for random byte slice with a length between 1 and 256:
Report for random byte slice with a length between 1 and 512:
Report for random byte slice with a length between 1 and 1024:
All the reports for random byte slice also indicate that B-tree based interval tree is faster than LLRB based interval tree.
Possible Optimizations
256
way or256^2
branching based on the first byte or the first two bytes.common prefix length
for node range start and end. Thensort.Search
only needs to do byte slice comparison starting from thecommon prefix length
.@petermattis, could you please help me review this PR? If you think that
issue-6465-opt branch
is the right way to go. I will go proceed to dig into these optimizations.This change is