WIP safe and sane B-Tree rewrite #16906

Gankra · 2014-08-31T20:01:08Z

Total minimal rewrite of BTree to not use copy/clone at all, and to generally avoid unnecessary mutations. I have not yet implemented the "miscellaneous" collection things like iterators, Ord, Eq, Hash, etc as I want to nail the core stuff down first.

There is one substantial drawback to my implementation: it doesn't allow configuration of B. The benefit of this is that the nodes are made of [T,..n]'s instead of Vecs, putting more data contiguously and reducing the amount of indirection. I have done this under @aturon's recommendation that it would be better to write a BTree implementation that would benefit from the future possibility of generic integer arguments being available (I am not saying they will be added).

One consequence of this is that I currently store the Keys/Values in Options, because it's not clear to me if there's a safe way to work with partially intialized arrays, or to tell Rust to "forget" some or all of the array and not try to run destructors on the elements. I would love if someone could show me how to do that.

In addition, right now I do a lot of indexing which could probably be replaced with iterators to avoid bounds checks. I'm also not leveraging copy instrinsics to bulk shift the elements around. This first draft is emphasizing correctness and safety over performance, but in a way that's easily upgradable. I hope to have the time to start tuning the implementation in the coming days,

The actual code is literally 25% just comments describing the fairly complex logic of a BTree. I've tried to chunk things out into "grokable" functions where possible so that everyone understands what's going on.

Finally, as a quick sanity check, I ran treemap's bench suite on btree (make check-collections -j8 PLEASE_BENCH=1):

test treemap::bench::find_rand_100                         ... bench:        74 ns/iter (+/- 5)
test treemap::bench::find_rand_10_000                      ... bench:       164 ns/iter (+/- 3)
test treemap::bench::find_seq_100                          ... bench:        76 ns/iter (+/- 2)
test treemap::bench::find_seq_10_000                       ... bench:       115 ns/iter (+/- 2)
test treemap::bench::insert_rand_100                       ... bench:       126 ns/iter (+/- 2)
test treemap::bench::insert_rand_10_000                    ... bench:      1025 ns/iter (+/- 16)
test treemap::bench::insert_seq_100                        ... bench:       507 ns/iter (+/- 19)
test treemap::bench::insert_seq_10_000                     ... bench:       834 ns/iter (+/- 15)

test btree::bench::find_rand_100                           ... bench:        86 ns/iter (+/- 4)
test btree::bench::find_rand_10_000                        ... bench:       165 ns/iter (+/- 2)
test btree::bench::find_seq_100                            ... bench:        86 ns/iter (+/- 2)
test btree::bench::find_seq_10_000                         ... bench:       117 ns/iter (+/- 1)
test btree::bench::insert_rand_100                         ... bench:       270 ns/iter (+/- 10)
test btree::bench::insert_rand_10_000                      ... bench:       598 ns/iter (+/- 18)
test btree::bench::insert_seq_100                          ... bench:       386 ns/iter (+/- 8)
test btree::bench::insert_seq_10_000                       ... bench:       459 ns/iter (+/- 4)

As you can see, my implementation is generally competitive with treemap, and is substantially faster during large insertion workloads. And as discussed above, I haven't even begun to tune it; my choice of B is largely arbitrary!

If anyone is a low-level optimizing expert, or just knows good BTree design/tuning, I'd love to get any feedback on this work. If anyone has any other ideas or questions, I'm more than happy to discuss this with you!

Gankra · 2014-08-31T20:06:05Z

If you are wiling to review this, I suggest viewing the whole file here, rather than the diff. This is a true clean-room rewrite.

huonw · 2014-09-01T02:02:34Z

src/libcollections/btree.rs

+// This implementation is largely based on the one described in *Open Data Structures*, which
+// can be freely downloaded at http://opendatastructures.org/, and whose contents are as of this
+// writing (August 2014) freely licensed under the following Creative Commons Attribution
+// License: [CC BY 2.5 CA](http://creativecommons.org/licenses/by/2.5/ca/).


Is this license compatible?

I didn't take any source code from it, I just used the high level analysis and design. I note the license here simply for completeness, and to indicate to readers that the text is free/libre/open if they want to refer to it for more details.

If it's a problem though, I work with the primary author and can easily talk to them about it.

huonw · 2014-09-01T12:11:55Z

There is so much unwinding in this code; at a guess, the landing pads etc. generated by the unwraps and [] indexing would be a significant proportion of the final code (and generally a compile-time and runtime performance drain); I would be very interested in seeing this reduced as much as possible.

I noted several places where this could be remove easily (anything with if is_leaf { ... basically).

Another possibility would be a method on Node like

fn view_parts(&self) -> (&[Option<K>], &[&Node<K,V>], &[Option<V>])

where each slice is cut to exactly the right length. It would be implemented unsafely, to coerce from Option<Box<Node<K,V>>> to &Node<K,V> (with a debug_assert! that the appropriate number of Options are actually Some).

Gankra · 2014-09-01T12:56:39Z

@huonw I think mostly we just need a keys iterator that yields the keys. I imagine something like this:

Keys{ it: self.keys.iter() }
...
fn next() {
    self.it.next().and_then(|x| x.as_ref())
}

(and something similar for mut_keys)

I could also probably pass around keys/values in their Options more, since I think there's some places where I unwrap them for a function boundary, and then just rewrap them at the caller.

I could also probably replaces some foo[i]'s with foo.unsafe_get's?

Gankra · 2014-09-03T17:15:44Z

Okay, sorry I went radio silent on this for a bit. I just moved, so everything was a bit hectic. I can now begin working at cleaning this up and optimizing.

Gankra · 2014-09-05T00:11:22Z

So yeah, I think I'm going to go forward with a RawNode design that would basically be mini RawTables as seen in #16628. The option dance, while being really nice for a correctness and safety check is a serious bloat on the whole thing. Although I haven't reflected too hard on detailed RawNode design, so it might all be rubbish to do.

As such this is going to get pretty aggressively refactored, and so I'm considering just closing this now, pending that work.

That said, the main search/insert/remove logic shouldn't change too much, as RawNodes should just be providing a safe interface to the basic iteration/indexing/shifting/splicing stuff. Further, this is still a decent base to work, as demonstrated by my benchmarks.

Gankra · 2014-09-05T00:43:57Z

Regardless of how I proceed, I've applied many of the suggested fixes. I've also factored out search into a method, which significantly reduced code duplication, unwraps, and indexing while produced much more rustic code! Nice!

This also provides a hint of what RawNode might look like. I started toying with Found and Bound being structs in their own right with relevant interfaces (Bound only supports going to the next node and getting the index; Found gives full access). However I stopped once I settled on a larger refactor.

huonw · 2014-09-05T00:48:15Z

src/libcollections/btree.rs

+        Node {
+            length: 0,
+            // FIXME(Gankro): this is gross, I guess you need a macro? [None, ..capacity] uses copy
+            keys:   [None, None, None, None, None, None, None, None, None, None, None],


huonw · 2014-09-05T01:05:35Z

I opened #16998 which would remove all the unwraps (at least, we could use that infrastructure, somewhat); could you add a fixme?

Gankra · 2014-09-07T19:29:51Z

Updated perf results (on a different machine than the first batch):

test treemap::bench::find_rand_100                         ... bench:        75 ns/iter (+/- 7)
test treemap::bench::find_rand_10_000                      ... bench:       194 ns/iter (+/- 8)
test treemap::bench::find_seq_100                          ... bench:        71 ns/iter (+/- 7)
test treemap::bench::find_seq_10_000                       ... bench:        98 ns/iter (+/- 2)

test treemap::bench::insert_rand_100                       ... bench:       174 ns/iter (+/- 16)
test treemap::bench::insert_rand_10_000                    ... bench:      1171 ns/iter (+/- 35)
test treemap::bench::insert_seq_100                        ... bench:       551 ns/iter (+/- 8)
test treemap::bench::insert_seq_10_000                     ... bench:       904 ns/iter (+/- 12)

test btree::bench::find_rand_100                           ... bench:        95 ns/iter (+/- 9)
test btree::bench::find_rand_10_000                        ... bench:       220 ns/iter (+/- 5)
test btree::bench::find_seq_100                            ... bench:        92 ns/iter (+/- 4)
test btree::bench::find_seq_10_000                         ... bench:       155 ns/iter (+/- 15)

test btree::bench::insert_rand_100                         ... bench:       376 ns/iter (+/- 41)
test btree::bench::insert_rand_10_000                      ... bench:       671 ns/iter (+/- 15)
test btree::bench::insert_seq_100                          ... bench:       440 ns/iter (+/- 5)
test btree::bench::insert_seq_10_000                       ... bench:       543 ns/iter (+/- 12)

Interesting that search seems to have regressed, maybe some #[inline]'s are in order.

Gankra · 2014-09-07T19:31:44Z

@huonw Is the current state sufficient to include as a "okay start" (assuming I impl all the iterator/cmp/hash stuff too), or would you like to see some more refactors and unsafe_get's?

Gankra · 2014-09-08T19:32:05Z

@huonw nvm, I'm going forward with RawNode because I had a flash of design insight that should make the code super nice.

huonw · 2014-09-08T22:54:06Z

I'm very happy to go with the as-safe-as-possible version with Option<T> etc, fwiw.

Gankra · 2014-09-09T00:04:50Z

The current design is frustrating to me for a few reasons:

The Options are hacking around the fact that there we can't have partially initialized arrays
Leaves "have" children, when they shouldn't (and in fact, a leaf never "becomes" internal, so having them there is truly worthless).
We have to choose to either duplicate code for internal and leaf nodes, or be inefficient by uselessly shifting around None's in leaves.
Leaves and Internal nodes are not represented in the type system at all, you just have to know when to do what.

My proposed redesign has the following primary structure:

/// A Raw node handles the common code between Internal and Leaf nodes. Generally, it does all the
/// work assuming the Node is Internal, and for Leaf nodes the edge manipulation should just get
/// optimized away because of how empty types are treated.
struct RawNode<K, V, E> {
    length: uint,
    elems: *mut K,
    marker:   marker::CovariantType<(K, V, E)>,
}

pub struct InternalNode<K, V> {
    raw: RawNode<K, V, Node<K,V>>,
}

pub struct LeafNode<K, V> {
    raw: RawNode<K, V, ()>,
}

pub enum Node<K, V> {
    Internal(InternalNode<K, V>),
    Leaf(LeafNode<K, V>),
}

Where RawNode's "elems" is in fact a buffer allocated to fit all the maximum amount of keys, nodes, and edges a B-Tree node can hold by-value, mirroring the design of RawTable from the current HashMap design.

With this design, we get rid of all the Option dancing, and leaves don't allocate space for edges. Further, by implementing all the operations on RawNode, we get impls for Leaves and Internal nodes at once, while optimizing away the edge manipulation, and letting the distinction appear in the type system.

However, this is all of course very unsafe. So we expose a "safe" interface in the same way that RawTable does: by exposing a cursor-based interface for walking through the node's elements.

Gankra · 2014-09-09T01:58:13Z

Hrm... I'm running into issues with being able to express that all of a RawNode's children should be the same type of Node, which makes handle_underflow super awkward.

Gankra · 2014-09-09T20:38:40Z

@huonw I've experimentally refactored the array-based design to use Vec's and inline nodes rather than arrays and boxed nodes. The performance seems comparable and the implementation is much safer and saner. I gotta go atm, but I'll push up that design as a candidate replacement later today hopefully. In the mean time interested parties can check it out in isolation at: https://github.com/Gankro/rust-datastructures/tree/btree-vec/src/btree

Also: as an added bonus, Vec's permit us to have configurable B again! Whoo!

Gankra · 2014-09-13T19:29:37Z

Closing while I finish this up out-of-tree.

Gankra · 2014-09-15T03:02:42Z

Update: impl is done, just need to write (read: steal from TreeMap) more tests. Need to patch Vec's MoveItems first, though.

…y, r=Veykril feat: limit struct hover display nums follow up rust-lang/rust-analyzer#15847, rust-lang/rust-analyzer#15938

WIP btree rewrite

3938af8

Gankra force-pushed the btree branch from af70d2b to 3938af8 Compare August 31, 2014 20:01

huonw reviewed Sep 1, 2014
View reviewed changes

fix constants

9d27059

tons of doc fixes, refactor search into a method

d229980

huonw reviewed Sep 5, 2014
View reviewed changes

Gankra added 2 commits September 7, 2014 13:17

major btree refactor

c68f356

fix import

810e06c

Gankra force-pushed the btree branch from 331bc47 to 810e06c Compare September 7, 2014 18:15

Gankra closed this Sep 13, 2014

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2024

Auto merge of rust-lang#16906 - Young-Flash:limit_struct_hover_displa…

e52bb8c

…y, r=Veykril feat: limit struct hover display nums follow up rust-lang/rust-analyzer#15847, rust-lang/rust-analyzer#15938

WIP safe and sane B-Tree rewrite #16906

WIP safe and sane B-Tree rewrite #16906

Uh oh!

Conversation

Gankra commented Aug 31, 2014

Uh oh!

Gankra commented Aug 31, 2014

Uh oh!

huonw Sep 1, 2014

Choose a reason for hiding this comment

Uh oh!

Gankra Sep 1, 2014

Choose a reason for hiding this comment

Uh oh!

huonw commented Sep 1, 2014

Uh oh!

Gankra commented Sep 1, 2014

Uh oh!

Gankra commented Sep 3, 2014

Uh oh!

Gankra commented Sep 5, 2014

Uh oh!

Gankra commented Sep 5, 2014

Uh oh!

huonw Sep 5, 2014

Choose a reason for hiding this comment

Uh oh!

huonw commented Sep 5, 2014

Uh oh!

Gankra commented Sep 7, 2014

Uh oh!

Gankra commented Sep 7, 2014

Uh oh!

Gankra commented Sep 8, 2014

Uh oh!

huonw commented Sep 8, 2014

Uh oh!

Gankra commented Sep 9, 2014

Uh oh!

Gankra commented Sep 9, 2014

Uh oh!

Gankra commented Sep 9, 2014

Uh oh!

Gankra commented Sep 13, 2014

Uh oh!

Gankra commented Sep 15, 2014

Uh oh!

Uh oh!