Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for BuildHasher::hash_one #86161

Closed
2 of 4 tasks
scottmcm opened this issue Jun 9, 2021 · 10 comments
Closed
2 of 4 tasks

Tracking Issue for BuildHasher::hash_one #86161

scottmcm opened this issue Jun 9, 2021 · 10 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Milestone

Comments

@scottmcm
Copy link
Member

scottmcm commented Jun 9, 2021

Feature gate: #![feature(build_hasher_simple_hash_one)]

This is a tracking issue for the hash_one method on the BuildHasher trait. This is a convenience method to make it easier to get the hash of a value, such as when implementing a hash table or when writing unit tests.

Public API

pub trait BuildHasher { // existing trait
    fn hash_one<T: Hash>(&self, x: T) -> u64
    where
        Self: Sized; // new method, with provided implementation
}

Steps / History

Unresolved Questions

  • None yet.
@scottmcm scottmcm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Jun 9, 2021
@dtolnay
Copy link
Member

dtolnay commented Aug 18, 2021

Added where Self: Sized in #88031 in order to preserve object safety of BuildHasher.

@thomcc
Copy link
Member

thomcc commented Sep 10, 2021

This API is unexpectedly useful for performance.

There are certain hash algorithms which can provide a (substantially) more efficient implementation if they know they aren't being used in a streaming scenario, and that they just need to hash a single input (such as a range of bytes).

xxHash is one of these — the streaming API is very much set up for the "hash a big file" use case, and has a large flat overhead to initialize the state for computing the streaming hash. That is, in a direct port, BuildHasher::build_hasher is very slow — much slower than the actual hashing for small keys.

However, it has a separate API for hashing a chunk of memory (with an optional seed), which you're supposed to use for hashtable keys and such. This API produces the same results as streaming (it wouldn't be much of a hash if this weren't the case), but is quite fast... far faster if this is all you need and you have small strings and such, as are common for hash-table keys.

I had some plans on how to subvert the issues here (by doing a less-direct port)... But I don't need to if this API exists (and is used in HashMap), so long as I provide a bunch of specializations on T (a pain, but worth it here).

I dunno if this was completely unintentional, or just not mentioned, but I had considered this basically unsolvable the way Rust's hashing APIs were designed[1], so it's nice to be wrong.

I'll try to get benchmark numbers for this soonish.


[1]: I figured that we had traded perf for flexibility here, and that it was... a bit unfortunate, but probably worth it given that Rust's hashing APIs are far more convenient (and far less error-prone) than the ones in other langs which are hashcode-based.

bors added a commit to rust-lang/hashbrown that referenced this issue Sep 10, 2021
Use `BuildHasher::hash_one` when `feature = "nightly"` is enabled.

I describe why this is good for more than just convenience here: rust-lang/rust#86161 (comment)
@tkaitchuck
Copy link
Contributor

Yes. This could give a significant performance gain particularly if the implementation was not held to the requirement that the resulting output be identical to what would have resulted by passing the value into the hasher and then calling finish(). This means that hashers can take advantage of knowing that all the data is present and more is not coming to reduce the amount of work done.

@thomcc
Copy link
Member

thomcc commented Nov 7, 2021

I think it needs to return the same value, otherwise things like Borrow won't work. Users currently can implement Hash for their own types in terms of hashing a nested type, and have their Borrow implementation be correct. I suspect there's other things like this too.

I also think it would be very surprising if it returned different values, and suspect that it still can be done as efficiently in most cases, just more work to derive an efficient "closed-form" version of the computation.

@Amanieu
Copy link
Member

Amanieu commented Apr 11, 2023

I think this is a generally useful helper function that makes it easier for code to work with hashers. In particular, it replaces this rather cumbersome code for hashing a value:

fn compute_hash<K: Hash + ?Sized, S: BuildHasher>(hash_builder: &S, key: &K) -> u64 {
    let mut state = hash_builder.build_hasher();
    key.hash(&mut state);
    state.finish()
}

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Apr 11, 2023

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Apr 11, 2023
@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels May 14, 2023
@rfcbot
Copy link

rfcbot commented May 14, 2023

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels May 24, 2023
@rfcbot
Copy link

rfcbot commented May 24, 2023

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label May 25, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue May 27, 2023
saethlin pushed a commit to saethlin/miri that referenced this issue May 29, 2023
@JohnTitor
Copy link
Member

Triage: Closing as it was stabilized already.

@oulaoba
Copy link

oulaoba commented Jan 10, 2024

@lucas7788

Error reason

The latest release of ahash uses build_hasher_simple_hash_one which was stabilized in 1.71 but the latest rustc version from the Solana tools is 1.68.0.

You can fix it by :

[dependencies]
ahash = "=0.8.4"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

9 participants