Slimmify BTree by replacing the three Vecs in Node with a manually alloc... #18028

gereeter · 2014-10-14T10:02:23Z

...ated buffer.

Before:

test btree::map::bench::find_rand_100                      ... bench:        29 ns/iter (+/- 2)
test btree::map::bench::find_rand_10_000                   ... bench:        83 ns/iter (+/- 6)
test btree::map::bench::find_seq_100                       ... bench:        30 ns/iter (+/- 1)
test btree::map::bench::find_seq_10_000                    ... bench:        50 ns/iter (+/- 3)
test btree::map::bench::insert_rand_100                    ... bench:       186 ns/iter (+/- 30)
test btree::map::bench::insert_rand_10_000                 ... bench:       377 ns/iter (+/- 8)
test btree::map::bench::insert_seq_100                     ... bench:       299 ns/iter (+/- 10)
test btree::map::bench::insert_seq_10_000                  ... bench:       368 ns/iter (+/- 12)
test btree::map::bench::iter_1000                          ... bench:     20956 ns/iter (+/- 479)
test btree::map::bench::iter_100000                        ... bench:   2060899 ns/iter (+/- 44325)
test btree::map::bench::iter_20                            ... bench:       560 ns/iter (+/- 63)

After:

test btree::map::bench::find_rand_100                      ... bench:        28 ns/iter (+/- 2)
test btree::map::bench::find_rand_10_000                   ... bench:        74 ns/iter (+/- 3)
test btree::map::bench::find_seq_100                       ... bench:        31 ns/iter (+/- 0)
test btree::map::bench::find_seq_10_000                    ... bench:        46 ns/iter (+/- 0)
test btree::map::bench::insert_rand_100                    ... bench:       141 ns/iter (+/- 1)
test btree::map::bench::insert_rand_10_000                 ... bench:       273 ns/iter (+/- 12)
test btree::map::bench::insert_seq_100                     ... bench:       255 ns/iter (+/- 17)
test btree::map::bench::insert_seq_10_000                  ... bench:       340 ns/iter (+/- 3)
test btree::map::bench::iter_1000                          ... bench:     21193 ns/iter (+/- 1958)
test btree::map::bench::iter_100000                        ... bench:   2203599 ns/iter (+/- 100491)
test btree::map::bench::iter_20                            ... bench:       614 ns/iter (+/- 110)

This code could probably be a fair bit cleaner, but it works.

Part of #18009.

rust-highfive · 2014-10-14T10:02:29Z

Warning

These commits modify unsafe code. Please review it carefully!

gereeter · 2014-10-14T10:05:04Z

cc @gankro

huonw · 2014-10-14T10:35:17Z

src/libcollections/btree/node.rs

+
+    // At any given time, there will be `_len` keys, `_len` values, and (in an internal node)
+    // `_len + 1` edges. In a leaf node, there will never be any edges.
+    _len: uint,


I wanted to make sure that I used the len() function as much as possible, since I knew that the field was likely to change over time. It basically forced me to write code that is easier to change.

huonw · 2014-10-14T10:42:27Z

FWIW, there's another possible trick along these lines: we could store the length and capacity inside the allocation, cutting off two inline words. That is, allocate {uint, uint, keys..., values..., subnodes...} and so store 3 words instead of 5 (possibly reordered to make the alignments work out more nicely).
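
For illustration, here is a minimal sketch of how the offsets for such a single-allocation layout could be computed in present-day Rust (`usize` in place of `uint`). This is not the code from this PR: `node_layout`, `NodeOffsets`, and the `Edge` type parameter are hypothetical names, and the field order is just the one listed above, without any reordering for alignment.

```rust
use std::alloc::{Layout, LayoutError};

/// Byte offsets of each chunk inside one hypothetical node allocation.
#[derive(Debug)]
struct NodeOffsets {
    len: usize,
    cap: usize,
    keys: usize,
    vals: usize,
    edges: usize,
}

/// Computes a layout for `{len, cap, keys.., vals.., edges..}` in one block.
/// `Edge` stands in for whatever type links a node to its children.
fn node_layout<K, V, Edge>(capacity: usize) -> Result<(Layout, NodeOffsets), LayoutError> {
    // The length header sits at offset 0.
    let layout = Layout::new::<usize>();
    // Each `extend` appends a field, inserting padding so it is aligned.
    let (layout, cap) = layout.extend(Layout::new::<usize>())?;
    let (layout, keys) = layout.extend(Layout::array::<K>(capacity)?)?;
    let (layout, vals) = layout.extend(Layout::array::<V>(capacity)?)?;
    // An internal node with `capacity` keys has `capacity + 1` edges.
    let (layout, edges) = layout.extend(Layout::array::<Edge>(capacity + 1)?)?;
    let offsets = NodeOffsets { len: 0, cap, keys, vals, edges };
    Ok((layout.pad_to_align(), offsets))
}
```

With this shape, the node handle itself would only need the base pointer plus whatever offsets it chooses to cache, which is the space saving being proposed.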

Gankra · 2014-10-14T11:24:34Z

src/libcollections/btree/node.rs

+        1
+    } else {
+        mem::min_align_of::<Node<K, V>>()
+    };


minor style (maybe perf?) nit, these two branches could be collapsed into let (edges_size, edges_align) = ...

I presume that if I collapse the edges, I should also collapse the keys and vals, right?

I don't think that's as big of a deal, since they're set unconditionally. You certainly could for symmetry, though.

Fixed. Interestingly, GitHub moved this comment to a different function from the one it was originally in.
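
As a rough sketch of the collapsed form suggested in this thread (not the actual diff), both values can come out of a single `if` expression as a tuple. The function name, the `is_leaf` flag, and the `Edge` parameter are assumptions made for the example.

```rust
use std::mem;

/// Size and alignment of the edges chunk, computed in one branch.
/// A leaf has no edges, so it contributes zero bytes with alignment 1.
fn edges_size_align<Edge>(is_leaf: bool, capacity: usize) -> (usize, usize) {
    if is_leaf {
        (0, 1)
    } else {
        // An internal node with `capacity` keys has `capacity + 1` edges.
        ((capacity + 1) * mem::size_of::<Edge>(), mem::align_of::<Edge>())
    }
}
```

A caller would then write `let (edges_size, edges_align) = edges_size_align::<E>(is_leaf, capacity);` instead of two separate conditionals.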

gereeter · 2014-10-14T13:12:52Z

@huonw I'm a bit worried about putting the length and capacity behind the allocation because it seems like it would be less efficient to access/mutate them behind an extra layer of indirection. I can try it, but this seems like the same issue that made Vec faster than ~[].

Gankra · 2014-10-14T13:52:21Z

A bigger win, percentage-wise, would be just to use u8's or u16's for the len/capacity. I really don't think nodes bigger than 256 elems are that reasonable to expect. 16000 or whatever is definitely silly. Of course this will require a fair amount of (totally safe) upcasting, and may require some care to avoid overflows.
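
A small sketch of what the narrower fields and the upcasting might look like in present-day Rust. `Header` and its methods are hypothetical names, not part of this PR; the point is only that the conversions are safe and that the capacity check guards against overflow.

```rust
use std::convert::TryFrom;

/// Hypothetical compact header: two `u16`s instead of two word-sized fields.
struct Header {
    len: u16,
    capacity: u16,
}

impl Header {
    /// Returns `None` if the requested capacity does not fit in a `u16`.
    fn new(capacity: usize) -> Option<Header> {
        let capacity = u16::try_from(capacity).ok()?;
        Some(Header { len: 0, capacity })
    }

    /// Lossless upcast, so callers keep working with `usize`.
    fn len(&self) -> usize {
        usize::from(self.len)
    }

    fn capacity(&self) -> usize {
        usize::from(self.capacity)
    }

    /// Increments the length only while below capacity, so `len` cannot overflow.
    fn grow_by_one(&mut self) -> bool {
        if self.len < self.capacity {
            self.len += 1;
            true
        } else {
            false
        }
    }
}
```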

huonw · 2014-10-14T14:07:37Z

@gereeter this differs from ~[] vs. Vec somewhat because these allocations are never reallocated. (This can definitely be a later experiment, it does not have to be part of this PR.)

Gankra · 2014-10-14T16:44:58Z

src/libcollections/btree/node.rs

-        // Pop it
-        let key = self.keys.pop().unwrap();
-        let val = self.vals.pop().unwrap();
+        unsafe {


I think this code needs a guard for self.len() == 0, at the very least.
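
To make the concern concrete, here is a simplified sketch of a guarded pop over manually managed storage. It is not the node code from this PR: `Leaf`, its fixed capacity of 4, and the use of `MaybeUninit` arrays are assumptions chosen to keep the example short.

```rust
use std::mem::MaybeUninit;

/// Hypothetical leaf with room for 4 entries; only the first `len` are initialized.
struct Leaf<K, V> {
    keys: [MaybeUninit<K>; 4],
    vals: [MaybeUninit<V>; 4],
    len: usize,
}

impl<K, V> Leaf<K, V> {
    fn new() -> Self {
        Leaf {
            keys: std::array::from_fn(|_| MaybeUninit::uninit()),
            vals: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    /// Returns `false` when the leaf is already full.
    fn push(&mut self, key: K, val: V) -> bool {
        if self.len == 4 {
            return false;
        }
        self.keys[self.len].write(key);
        self.vals[self.len].write(val);
        self.len += 1;
        true
    }

    fn pop(&mut self) -> Option<(K, V)> {
        // The guard: without it, `len - 1` would underflow and the reads
        // below would touch uninitialized slots.
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // Safety: slot `len` was initialized by `push` and is now outside
        // the live range, so it will not be read again.
        unsafe {
            Some((
                self.keys[self.len].assume_init_read(),
                self.vals[self.len].assume_init_read(),
            ))
        }
    }
}

impl<K, V> Drop for Leaf<K, V> {
    fn drop(&mut self) {
        // Drop any entries that are still live.
        while self.pop().is_some() {}
    }
}
```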

Gankra · 2014-10-14T19:48:25Z

High level, this seems pretty good. Some of the implementation details are pretty dicey, though. This would benefit from some more documentation.

@huonw What's the general policy on when to use assert vs debug_assert?

Gankra · 2014-10-25T00:26:51Z

@gereeter What's the status on this?

Gankra · 2014-12-10T03:09:17Z

src/libcollections/btree/map.rs

@@ -506,130 +562,78 @@ mod stack {
            }
        }

+        pub fn with<T, F: for<'id> FnOnce(Pusher<'id, 'a, K, V>,


docs... etc

Gankra · 2014-12-10T03:23:50Z

Basically at this point my only complaint is all the new rad tech needs to really be doc'd. Impl seems pretty solid, although BTree's gotten to the point that it's hard to keep it all in my head.

gereeter · 2014-12-10T12:32:00Z

Docs!

Also, sorry about the rebase - I keep forgetting how annoying force pushing can be.

Gankra · 2014-12-10T13:54:06Z

Awesome! Beyond minor doc nits this LGTM. I definitely want a second opinion though. There's some crazy stuff in here. r=me with doc fixes and if @huonw @cgaebel or maybe even @nikomatsakis (dat InvariantLifetime usage) also signs off on it.

gereeter · 2014-12-10T18:11:44Z

Doc fixes are in.

cgaebel · 2014-12-10T18:18:40Z

src/libcollections/btree/map.rs

+                let map = self.map;
+                map.length += 1;
+
+                let mut stack = self.stack;


Why move stack (and map) out of self? A move like this implies generating code for a memcpy and memset.

I did it because I was following the original code as closely as possible. However, it does seem to be unnecessary, so I can simplify it if you wish.

cgaebel · 2014-12-10T23:14:04Z

This looks good to me. Definitely an improvement over what was there before.

gereeter · 2014-12-10T23:17:54Z

Okay, fixed the doc fix typo and did @cgaebel's recommended simplification of insert.

Gankra · 2014-12-11T03:38:26Z

Okay I think this is good to go! r=me with squashing, unless @huonw wants to take another crack at it.

Edit: huon does. Moar review!

huonw · 2014-12-11T03:44:52Z

src/libcollections/btree/node.rs

+    // Despite this, we store three separate pointers to the three "chunks" of the buffer because
+    // the performance drops significantly if the locations of the vals and edges need to be
+    // recalculated upon access.
+    keys: *mut K,


These two pointers will always be valid except when they have been zeroed to cancel the destructor, right?

(I would like to see a comment to this effect, basically justifying the unsafe_no_drop_flag.)

Yes, keys and vals will only be null if the structure has already been dropped.

Gankra · 2014-12-11T03:58:47Z

Oh, since this is still on-going, one thing I'd really like to see is a more robust fuzzing of BTreeMap in the tests now that it's built on all these raw components. I know for a fact there's at least one case in the old logic that test_basic didn't cover (and I forgot to update the tests when I fixed a bug related to it during development).

huonw · 2014-12-11T04:03:12Z

src/libcollections/btree/node.rs

+    }
+}
+
+// Vector functions (all unchecked)


What does 'vector' mean in this case?

It means that these functions are acting on Nodes as if they were meaningless bunches of parallel vectors. These functions are direct translations of ones implemented on Vec, like insert, push, and pop. The comment is fairly meaningless, so I might just remove it.
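
As an illustration of what "a direct translation of a Vec function, unchecked" can look like, here is a sketch of an insert over a raw buffer in present-day Rust. `insert_unchecked` is a hypothetical helper written for this example, not one of the functions in the diff.

```rust
use std::ptr;

/// Inserts `value` at `index` in a buffer whose first `len` slots are initialized.
///
/// # Safety
/// `buf` must be valid for reads and writes of `len + 1` elements, the first
/// `len` elements must be initialized, and `index <= len`. Nothing is checked.
unsafe fn insert_unchecked<T>(buf: *mut T, len: usize, index: usize, value: T) {
    unsafe {
        let slot = buf.add(index);
        // Shift the tail one slot to the right; `ptr::copy` allows overlap.
        ptr::copy(slot, slot.add(1), len - index);
        // Write without dropping whatever bytes were in the slot before.
        ptr::write(slot, value);
    }
}
```

In a node with parallel key and value chunks, the same operation would presumably be applied once per chunk, with the caller responsible for updating the shared length afterwards.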

huonw · 2014-12-11T04:13:18Z

Looks pretty reasonable to me, but the code is quite complicated and there's no way I can hold it all in my head especially in my current travel-addled state.

gereeter · 2014-12-11T04:33:05Z

@huonw I think I've fixed all the issues you brought up.
@gankro If you want fuzz testing, I think the ideal would be just doing a truly random sequence of operations on all the Map types at once, checking that they all return identical results.
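
A minimal sketch of that idea in present-day Rust, comparing only `BTreeMap` against `HashMap` from the standard library. The `XorShift` generator, the key range of 64, and the choice of three operations are assumptions made to keep the example self-contained; a real harness would likely cover more map types and more of the API.

```rust
use std::collections::{BTreeMap, HashMap};

/// Tiny xorshift generator so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Applies the same pseudo-random operations to both maps and panics on the
/// first disagreement. Returns the final number of entries.
fn differential_fuzz(seed: u64, steps: usize) -> usize {
    let mut rng = XorShift(seed | 1); // xorshift state must be nonzero
    let mut btree: BTreeMap<u32, u64> = BTreeMap::new();
    let mut hash: HashMap<u32, u64> = HashMap::new();

    for _ in 0..steps {
        // A small key space forces plenty of overwrites and removals.
        let key = (rng.next_u64() % 64) as u32;
        let val = rng.next_u64();
        match rng.next_u64() % 3 {
            0 => assert_eq!(btree.insert(key, val), hash.insert(key, val)),
            1 => assert_eq!(btree.remove(&key), hash.remove(&key)),
            _ => assert_eq!(btree.get(&key), hash.get(&key)),
        }
        assert_eq!(btree.len(), hash.len());
    }
    btree.len()
}
```

Because the seed is fixed by the caller, any failing sequence can be replayed exactly.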

Gankra · 2014-12-11T04:35:34Z

@gereeter Yes, I'd like more uniform collection testing in the future. This would require either a pile of macros or collection traits, though. Anyway I don't want to block this on fuzzing. This has sat in the queue for too long!

gereeter · 2014-12-11T13:09:52Z

Should I squash, or is this waiting for some more review?

Gankra · 2014-12-12T03:40:07Z

YES!

Squash iiiiiit

🎊

cgaebel · 2014-12-12T03:42:22Z

…held three separately allocated `Vec`s, with a manually allocated buffer. Additionally, restructure the node and stack interfaces to be safer and require fewer bounds checks.

Before:

test btree::map::bench::find_rand_100                      ... bench:        35 ns/iter (+/- 2)
test btree::map::bench::find_rand_10_000                   ... bench:        88 ns/iter (+/- 3)
test btree::map::bench::find_seq_100                       ... bench:        36 ns/iter (+/- 1)
test btree::map::bench::find_seq_10_000                    ... bench:        62 ns/iter (+/- 0)
test btree::map::bench::insert_rand_100                    ... bench:       157 ns/iter (+/- 8)
test btree::map::bench::insert_rand_10_000                 ... bench:       413 ns/iter (+/- 8)
test btree::map::bench::insert_seq_100                     ... bench:       272 ns/iter (+/- 10)
test btree::map::bench::insert_seq_10_000                  ... bench:       369 ns/iter (+/- 19)
test btree::map::bench::iter_1000                          ... bench:     19049 ns/iter (+/- 740)
test btree::map::bench::iter_100000                        ... bench:   1916737 ns/iter (+/- 102250)
test btree::map::bench::iter_20                            ... bench:       424 ns/iter (+/- 40)

After:

test btree::map::bench::find_rand_100                      ... bench:         9 ns/iter (+/- 1)
test btree::map::bench::find_rand_10_000                   ... bench:         8 ns/iter (+/- 0)
test btree::map::bench::find_seq_100                       ... bench:         7 ns/iter (+/- 0)
test btree::map::bench::find_seq_10_000                    ... bench:         8 ns/iter (+/- 0)
test btree::map::bench::insert_rand_100                    ... bench:       136 ns/iter (+/- 5)
test btree::map::bench::insert_rand_10_000                 ... bench:       380 ns/iter (+/- 34)
test btree::map::bench::insert_seq_100                     ... bench:       255 ns/iter (+/- 8)
test btree::map::bench::insert_seq_10_000                  ... bench:       364 ns/iter (+/- 10)
test btree::map::bench::iter_1000                          ... bench:     19112 ns/iter (+/- 837)
test btree::map::bench::iter_100000                        ... bench:   1911961 ns/iter (+/- 33069)
test btree::map::bench::iter_20                            ... bench:       453 ns/iter (+/- 37)

gereeter · 2014-12-12T13:03:39Z

Squashed!

gereeter force-pushed the slimmer-btree-node branch from 6ef8c32 to ab92ca1 Compare October 25, 2014 15:11

gereeter force-pushed the slimmer-btree-node branch from bbd737b to 23e77ac Compare December 12, 2014 12:47

gereeter force-pushed the slimmer-btree-node branch from 23e77ac to 130fb08 Compare December 12, 2014 13:02

bors merged commit 130fb08 into rust-lang:master Dec 12, 2014

gereeter deleted the slimmer-btree-node branch December 17, 2015 01:30
