@Urgau
Member

@Urgau Urgau commented Jun 29, 2024

This PR changes SipHasher128 and StableHasher so that their finish implementations no longer panic and instead return a "small hash".

Implements #3 (comment)
r? @michaelwoerister

@Urgau Urgau requested a review from michaelwoerister July 3, 2024 07:16
}

#[inline]
pub fn finish128(mut self) -> [u64; 2] {
Member

Do we have an idea if this affects performance?

Member Author

@Urgau Urgau Jul 3, 2024

I extracted the finish128 method on Godbolt and looked at the diff against the previous version. There was a significant increase in "predicted cycles" and instructions, because the ptr::write_bytes call was generating a memcpy instead of a few small instructions now that the size was no longer a constant.

I therefore changed that part (Godbolt) to copy the "last" and "last + 1" elements into an array so that the size is a constant, as before. The diff is now very small: llvm-mca reports 15100 -> 15400 instructions for 100 iterations.
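To illustrate why a compile-time-constant size matters here, the following is a minimal sketch and not the code from this PR. It reads a zero-padded partial element in two ways: once with a copy whose length is only known at run time, and once with a fixed 8-byte copy followed by a mask. The function names, the byte-array buffer, and the masking approach are all assumptions made for illustration; the PR itself keeps ptr::write_bytes on a small array. A fixed-size copy is something the compiler can typically lower to a plain load, whereas a run-time length may become a library call.

```rust
const ELEM_SIZE: usize = 8;

/// Hypothetical helper: copies only the valid bytes of the partial element.
/// The copy length (`nbuf - start`) is a run-time value.
fn partial_elem_runtime_len(buf: &[u8], nbuf: usize) -> u64 {
    let start = nbuf / ELEM_SIZE * ELEM_SIZE;
    let mut elem = [0u8; ELEM_SIZE];
    elem[..nbuf - start].copy_from_slice(&buf[start..nbuf]);
    u64::from_le_bytes(elem)
}

/// Hypothetical helper: always copies exactly ELEM_SIZE bytes, then masks
/// off the bytes past `nbuf`. Requires `start + ELEM_SIZE <= buf.len()`,
/// i.e. the buffer must have spare ("spill") room after the partial element.
fn partial_elem_const_len(buf: &[u8], nbuf: usize) -> u64 {
    let start = nbuf / ELEM_SIZE * ELEM_SIZE;
    let mut elem = [0u8; ELEM_SIZE];
    elem.copy_from_slice(&buf[start..start + ELEM_SIZE]);
    let valid = nbuf - start; // number of meaningful bytes, 0..=7
    if valid == 0 {
        0
    } else {
        // Keep the low `valid` bytes (little-endian), zero the rest.
        u64::from_le_bytes(elem) & (u64::MAX >> (8 * (ELEM_SIZE - valid)))
    }
}
```

Both helpers return the same value for any nbuf that leaves a full element of room in the buffer; only the shape of the copy differs.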

Member

Thanks for investigating!

Since this code is super hot in the compiler and the u64 version of the finish method probably isn't going to be used much, I'd prefer if we didn't regress things at all.

How about extracting a finish128_inner(nbuf: usize, buf: &mut [MaybeUninit<u64>; BUFFER_WITH_SPILL_CAPACITY], state: State, processed: usize) function? Then Hasher::finish would only need to copy state, and finish128 could keep taking self by value (?)
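The shape of that suggestion can be sketched with a toy hasher. This is only an illustration under stated assumptions, not the crate's implementation: ToyHasher128, the mix function, the initial constants, and the one-word byte buffer are all made up, and returning the first half of the 128-bit result as the 64-bit "small hash" is an arbitrary choice for the example. What it does show is the split: the finalization logic lives in a free function that takes the pieces it needs, finish128 consumes self and passes its own buffer, and Hasher::finish (which takes &self) works on a copy.

```rust
use std::hash::Hasher;

#[derive(Clone, Copy)]
struct State { v0: u64, v1: u64 }

/// Toy stand-in for a 128-bit hasher; NOT SipHash.
struct ToyHasher128 { state: State, buf: [u8; 8], nbuf: usize, processed: usize }

/// Made-up mixing step, for illustration only.
fn mix(state: &mut State, word: u64) {
    state.v0 = (state.v0 ^ word).rotate_left(13).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    state.v1 = state.v1.wrapping_add(state.v0).rotate_left(17) ^ word;
}

/// Shared finalization: pads the partial word in `buf`, mixes it and the length.
fn finish128_inner(nbuf: usize, buf: &mut [u8; 8], mut state: State, processed: usize) -> [u64; 2] {
    buf[nbuf..].fill(0);
    mix(&mut state, u64::from_le_bytes(*buf));
    mix(&mut state, (processed + nbuf) as u64);
    [state.v0, state.v1]
}

impl ToyHasher128 {
    fn new() -> Self {
        // Arbitrary starting constants.
        ToyHasher128 { state: State { v0: 1, v1: 2 }, buf: [0; 8], nbuf: 0, processed: 0 }
    }

    /// Consumes the hasher, so it can finalize in its own buffer without a copy.
    fn finish128(mut self) -> [u64; 2] {
        finish128_inner(self.nbuf, &mut self.buf, self.state, self.processed)
    }
}

impl Hasher for ToyHasher128 {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.buf[self.nbuf] = b;
            self.nbuf += 1;
            if self.nbuf == 8 {
                mix(&mut self.state, u64::from_le_bytes(self.buf));
                self.processed += 8;
                self.nbuf = 0;
            }
        }
    }

    /// `finish` only has `&self`, so it finalizes on a copy of the buffer.
    fn finish(&self) -> u64 {
        let mut buf = self.buf;
        let [a, _b] = finish128_inner(self.nbuf, &mut buf, self.state, self.processed);
        a // assumption: first half used as the "small hash"
    }
}
```

Because finish works on a copy, it can be called repeatedly and leaves the hasher usable, while finish128 stays a by-value call that reuses the hasher's own storage.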

Member Author

I looked at the generated assembly on Godbolt and it's very similar: mainly some register naming changes, some instructions moving around, and one more stack variable.

Compared to the other version (with the last and last+1 copy), it's roughly similar, in terms of instructions per llvm-mca.

One interesting thing: with the inner variant, the number of cycles reported by llvm-mca drops significantly, from 4410 to 3810, maybe because of less memory pressure. That seems like a win.

Pushed the inner variant.

Member

@michaelwoerister michaelwoerister left a comment


This looks good to me. I'm wondering if we should have some kind of performance testing as part of this repo. But I'm not sure how to do that in a stable, useful way.

@Urgau Urgau force-pushed the finish-non-fatal branch from 3a72324 to ddced90 Compare July 3, 2024 11:06
@Urgau Urgau force-pushed the finish-non-fatal branch from ddced90 to 3c1f1fc Compare July 3, 2024 12:43

  fn finish(&self) -> u64 {
-     panic!("SipHasher128 cannot provide valid 64 bit hashes")
+     let mut buf = self.buf.clone();
Member

Seems like we still need to clone the buffer 🤷
Hasher::finish is just defined in an unfortunate way.

@michaelwoerister michaelwoerister merged commit c2b3deb into rust-lang:main Jul 3, 2024
@michaelwoerister
Member

Thank you, @Urgau!
