-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document seemingly unneeded self.clone in write #34
Conversation
@@ -116,20 +116,19 @@ impl Hasher for FxHasher { | |||
const _: () = assert!(size_of::<usize>() <= size_of::<u64>()); | |||
// Ensure no bytes are discarded by casting to usize | |||
const _: () = assert!(size_of::<u32>() <= size_of::<usize>()); | |||
let mut state = self.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@calebsander: Why did you add this clone in #12?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case they don't answer, my two guesses are:
- It is a leftover of the refactoring that is indeed superfluous
- It is supposed to prevent committing "partially hashed" slices of bytes to the hasher in case of a panic in
add_to_hash
. But a) I don't see currently a way for this to panic and b) I don't think it would make much sense to try to recover from a panic anyway, especially as this invariant is not documented anywhere.
Thanks for reviewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the generated assembly. If I remember correctly, cloning (which is a single 64-bit load, and later a 64-bit store to save the state) allowed the hash accumulation to happen in registers. Accumulating directly on self
meant the bitwise operations and multiplication were applied to a memory location. I figured avoiding memory accesses in hot loops (even if they are very cache-friendly) was a good idea. But it's also likely that the Hasher
methods will be inlined anyways, with the caller storing the hash state in a local variable, so the state may well be in a register rather than memory in the first place.
It's entirely possible that the compiler no longer behaves this way, or that it doesn't really matter for performance. Adding a benchmark to give some concrete data would be much better than my speculating.
@calebsander Thanks for the explanation. It seems that the codegen is indeed suboptimal without the clone. I've added a comment explaining why the clone is here and opened rust-lang/rust#123129 about the suboptimal codegen. |
I don't think the clone is needed.