Slightly speed up the `resize_inner` function + documentation of other functions. #451

JustForFun88 · 2023-08-05T19:03:51Z

This speeds up the resize_inner function a bit, since now reading the data from the heap is done not by byte, but by a group. In addition, we may not iterate over all control bytes if we have yielded all indexes of full buckets. For example, on my computer:

Before (with cargo bench):

test grow_insert_ahash_highbits  ... bench:      32,481 ns/iter (+/- 5,315)
test grow_insert_ahash_random    ... bench:      30,122 ns/iter (+/- 5,568)
test grow_insert_ahash_serial    ... bench:      34,267 ns/iter (+/- 8,364)
test grow_insert_std_highbits    ... bench:      54,410 ns/iter (+/- 5,030)
test grow_insert_std_random      ... bench:      52,566 ns/iter (+/- 6,455)
test grow_insert_std_serial      ... bench:      51,681 ns/iter (+/- 9,595)

After (with cargo bench):

test grow_insert_ahash_highbits  ... bench:      27,597 ns/iter (+/- 1,436)
test grow_insert_ahash_random    ... bench:      27,096 ns/iter (+/- 203)
test grow_insert_ahash_serial    ... bench:      27,369 ns/iter (+/- 719)
test grow_insert_std_highbits    ... bench:      50,399 ns/iter (+/- 449)
test grow_insert_std_random      ... bench:      44,495 ns/iter (+/- 7,180)
test grow_insert_std_serial      ... bench:      47,735 ns/iter (+/- 7,598)

As for the documentation, I started with resize_inner and since this function depends on others, I had to document them as well, etc., so there a lot of documentation in total. Also fixed a couple of inaccuracies with marking the function as unsafe.

Fix #453.

src/raw/mod.rs

Zoxc · 2023-08-18T12:58:37Z

src/raw/mod.rs

+///   created will be yielded by that iterator.
+/// - The order in which the iterator yields indices of the buckets is unspecified
+///   and may change in the future.
+pub(crate) struct FullBucketsIndices {


What's the advantage of this over RawIter?

You can't use RawIter since it's a generic struct over T. The changes are made in the method of the RawTableInner struct, which has no information about T type.
At the same time, the use of BitMaskIter inside FullBucketsIndices gives the same acceleration of the iteration of elements as for RawIter.

Could RawIter be built on top of this to avoid code duplication? Assuming that doesn't impact compilation times too much due to the additional layer of inlining needed.

Actually I think this would improve compile times since the code for iteration would only need to be instantiated once.

I was thinking that FullBucketsIndices could be changed to take a size field and thus maintain a pointer to the bucket. There could be some performance impact due to the use of indices that RawIter could be sensitive too.

I'd try these changes on top of this PR though.

I will try 😄. It will be necessary to slightly change the structure of FullBucketsIndices

Oh, that reflect_toggle_full method of RawIter. It causes the FullBucketsIndices size to bloat. Okay, I'll go to bed, maybe tomorrow I'll come up with something adequate 🤔

Zoxc · 2023-08-18T13:11:27Z

src/raw/mod.rs

            self.alloc.clone(),
            table_layout,
            capacity,
            fallibility,
        )?;
-        new_table.growth_left -= self.items;
-        new_table.items = self.items;


Why are these moved?

They are not needed here because:

This allows you to make the function safe.

Adds more consistency in functions, allowing less overthinking and remembering that you changed the number of elements before actually adding them, which is not a good idea in my opinion and only confuses. For example, in the clone_from_impl method, we first add items, and only then change self.table.items and self.table.growth_left fields.

Zoxc · 2023-08-18T13:13:16Z

I tested this PR out with rustc and it does seem to be a performance improvement there:

Benchmark	Before	After
Benchmark	Time	Time	%
🟣 clap:check	1.6201s	1.6077s	-0.77%
🟣 hyper:check	0.2400s	0.2389s	-0.47%
🟣 regex:check	0.9139s	0.9061s	-0.85%
🟣 syn:check	1.5023s	1.4907s	-0.77%
🟣 syntex_syntax:check	5.6779s	5.6401s	-0.67%
Total	9.9542s	9.8835s	-0.71%
Summary	1.0000s	0.9930s	-0.70%

Amanieu · 2023-08-19T17:23:30Z

src/raw/mod.rs

    #[allow(clippy::mut_mut)]
    #[inline]
-    unsafe fn prepare_resize(
+    fn prepare_resize(


It occurs to me that a lot of these methods that mutate data on the heap should probably take &mut self, otherwise there is an additional safety requirement that no other thread is concurrently accessing the data.

I can try to go through all the functions (after the PR), although it is probably won’t be able to do this with all of them.

Specifically, this function can be left as it is (with &self), since it does not change the original table (neither before nor after this pull request).

Amanieu · 2023-08-19T17:30:30Z

src/raw/mod.rs

+///   created will be yielded by that iterator.
+/// - The order in which the iterator yields indices of the buckets is unspecified
+///   and may change in the future.
+pub(crate) struct FullBucketsIndices {


Could RawIter be built on top of this to avoid code duplication? Assuming that doesn't impact compilation times too much due to the additional layer of inlining needed.

JustForFun88 · 2023-08-20T09:58:23Z

@Amanieu I implemented RawIter on top of FullBucketsIndices, could you please review the PR?
Of course, we can go further and try to built RawIterRange on top of FullBucketsIndices too 😄 (although I haven't thought about it yet).

@Zoxc Could you please test this PR with rustc again?

Upd: Squashed all changes into two commits.

Zoxc · 2023-08-20T13:58:23Z

Testing just the RawIter change:

Benchmark	Before	After
Benchmark	Time	Time	%
🟣 clap:check	1.6238s	1.6322s	0.51%
🟣 hyper:check	0.2345s	0.2354s	0.37%
🟣 regex:check	0.9129s	0.9183s	0.59%
🟣 syn:check	1.4699s	1.4747s	0.32%
🟣 syntex_syntax:check	5.6903s	5.7217s	0.55%
Total	9.9315s	9.9821s	0.51%
Summary	1.0000s	1.0047s	0.47%

JustForFun88 · 2023-08-20T15:45:18Z

Testing just the RawIter change:

I think I found the cause of the regression. Can you please repeat?

Zoxc · 2023-08-20T16:33:00Z

Still a regression:

Benchmark	Before	After
Benchmark	Time	Time	%
🟣 clap:check	1.6099s	1.6159s	0.37%
🟣 hyper:check	0.2346s	0.2356s	0.40%
🟣 regex:check	0.8979s	0.9019s	0.45%
🟣 syn:check	1.4505s	1.4569s	0.44%
🟣 syntex_syntax:check	5.6328s	5.6628s	0.53%
Total	9.8257s	9.8730s	0.48%
Summary	1.0000s	1.0044s	0.44%

JustForFun88 · 2023-08-20T19:06:02Z

@Amanieu I give up 😄. As @Zoxc predicted, and then confirmed by benchmarking, the implementation of RawIter through FullBucketsIndices leads to regression. I don't think this can be changed, because in fact one operation bucket.next_n(index) is replaced by:

let data_index = group_first_index + index;
bucket.next_n(data_index)

That is, each time there is an additional add operation, which leads to a small but measurable regression.

To reduce the code itself, we can, of course, perform some sort of dispatching through traits and generics, like the example below, but this will likely slow down compilation and, in essence, will be equivalent to two different structures:

trait GroupBase {
    fn next_n(&self, offset: usize) -> Self;
}

impl GroupBase for usize {
    fn next_n(&self, offset: usize) -> Self {
        self + offset
    }
}

impl<T> GroupBase for Bucket<T> {
    fn next_n(&self, offset: usize) -> Self {
        unsafe { self.next_n(offset) }
    }
}

pub(crate) struct FullBucketsIndices<B: GroupBase> {
    current_group: BitMaskIter,
    group_base: B,
    // Pointer to the current group of control bytes,
    ctrl: NonNull<u8>,
    // Buckets number in given subset of a table.
    buckets: usize,
    // Number of elements in given subset of a table.
    items: usize,
}

Amanieu · 2023-08-22T19:31:59Z

That's fine, let's just switch back to the previous implementation of RawIter then.

JustForFun88 · 2023-08-26T05:58:52Z

That's fine, let's just switch back to the previous implementation of RawIter then.

Returned the previous implementation. However, the tests are failing, it looks like an upstrem problem rust-lang/rust#115239

Amanieu · 2023-08-26T21:19:10Z

LGTM. Let's wait until CI is resolved before merging.

Amanieu · 2023-08-27T10:04:47Z

@bors r+

bors · 2023-08-27T10:04:48Z

📌 Commit 7e45a78 has been approved by Amanieu

It is now in the queue for this repository.

bors · 2023-08-27T10:04:55Z

⌛ Testing commit 7e45a78 with merge 2a10eac...

bors · 2023-08-27T10:18:54Z

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing 2a10eac to master...

Change `&` to `&mut` where applicable This addresses #451 (comment). All remaining functions either return raw pointers or do nothing with the data on the heap.

Zoxc reviewed Aug 18, 2023

View reviewed changes

src/raw/mod.rs Outdated Show resolved Hide resolved

Zoxc reviewed Aug 18, 2023

View reviewed changes

src/raw/mod.rs Outdated Show resolved Hide resolved

Zoxc reviewed Aug 18, 2023

View reviewed changes

Amanieu reviewed Aug 19, 2023

View reviewed changes

Improve resize_inner and doc some functions

39d5838

JustForFun88 force-pushed the speed_up_resize_inner branch from 5d4bbee to 1d83af3 Compare August 20, 2023 12:59

JustForFun88 force-pushed the speed_up_resize_inner branch from 1d83af3 to 3e7e253 Compare August 20, 2023 14:06

Fix mistakes

7e45a78

JustForFun88 force-pushed the speed_up_resize_inner branch from 3e7e253 to 7e45a78 Compare August 26, 2023 05:22

bors mentioned this pull request Aug 27, 2023

Relax MSRV to 1.63.0 #457

Merged

bors merged commit 2a10eac into rust-lang:master Aug 27, 2023
4 of 24 checks passed

bors mentioned this pull request Aug 27, 2023

Update RawTable with fallible APIs #459

Closed

JustForFun88 deleted the speed_up_resize_inner branch August 27, 2023 10:46

JustForFun88 mentioned this pull request Aug 27, 2023

Change & to &mut where applicable #464

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slightly speed up the `resize_inner` function + documentation of other functions. #451

Slightly speed up the `resize_inner` function + documentation of other functions. #451

JustForFun88 commented Aug 5, 2023 •

edited

Loading

Zoxc Aug 18, 2023

JustForFun88 Aug 18, 2023

Amanieu Aug 19, 2023

Amanieu Aug 19, 2023

Zoxc Aug 19, 2023

JustForFun88 Aug 19, 2023 •

edited

Loading

JustForFun88 Aug 19, 2023

Zoxc Aug 18, 2023

JustForFun88 Aug 18, 2023

Zoxc commented Aug 18, 2023

Amanieu Aug 19, 2023

JustForFun88 Aug 20, 2023

Amanieu Aug 19, 2023

JustForFun88 commented Aug 20, 2023 •

edited

Loading

Zoxc commented Aug 20, 2023

JustForFun88 commented Aug 20, 2023

Zoxc commented Aug 20, 2023

JustForFun88 commented Aug 20, 2023

Amanieu commented Aug 22, 2023

JustForFun88 commented Aug 26, 2023

Amanieu commented Aug 26, 2023

Amanieu commented Aug 27, 2023

bors commented Aug 27, 2023

bors commented Aug 27, 2023

bors commented Aug 27, 2023

Slightly speed up the resize_inner function + documentation of other functions. #451

Slightly speed up the resize_inner function + documentation of other functions. #451

Conversation

JustForFun88 commented Aug 5, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JustForFun88 Aug 19, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zoxc commented Aug 18, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JustForFun88 commented Aug 20, 2023 • edited Loading

Zoxc commented Aug 20, 2023

JustForFun88 commented Aug 20, 2023

Zoxc commented Aug 20, 2023

JustForFun88 commented Aug 20, 2023

Amanieu commented Aug 22, 2023

JustForFun88 commented Aug 26, 2023

Amanieu commented Aug 26, 2023

Amanieu commented Aug 27, 2023

bors commented Aug 27, 2023

bors commented Aug 27, 2023

bors commented Aug 27, 2023

Slightly speed up the `resize_inner` function + documentation of other functions. #451

Slightly speed up the `resize_inner` function + documentation of other functions. #451

JustForFun88 commented Aug 5, 2023 •

edited

Loading

JustForFun88 Aug 19, 2023 •

edited

Loading

JustForFun88 commented Aug 20, 2023 •

edited

Loading