Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Stabilize custom hasher support in HashMap #31081

Merged
merged 1 commit into from
Jan 26, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
72 changes: 72 additions & 0 deletions src/libcore/hash/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,7 @@

use prelude::v1::*;

use marker;
use mem;

#[stable(feature = "rust1", since = "1.0.0")]
Expand Down Expand Up @@ -190,6 +191,77 @@ pub trait Hasher {
}
}

/// A `BuildHasher` is typically used as a factory for instances of `Hasher`
/// which a `HashMap` can then use to hash keys independently.
///
/// Note that for each instance of `BuildHasher` the create hashers should be
/// identical. That is if the same stream of bytes is fed into each hasher the
/// same output will also be generated.
#[stable(since = "1.7.0", feature = "build_hasher")]
pub trait BuildHasher {
/// Type of the hasher that will be created.
#[stable(since = "1.7.0", feature = "build_hasher")]
type Hasher: Hasher;

/// Creates a new hasher.
#[stable(since = "1.7.0", feature = "build_hasher")]
fn build_hasher(&self) -> Self::Hasher;
}

/// A structure which implements `BuildHasher` for all `Hasher` types which also
/// implement `Default`.
///
/// This struct is 0-sized and does not need construction.
#[stable(since = "1.7.0", feature = "build_hasher")]
pub struct BuildHasherDefault<H>(marker::PhantomData<H>);

#[stable(since = "1.7.0", feature = "build_hasher")]
impl<H: Default + Hasher> BuildHasher for BuildHasherDefault<H> {
type Hasher = H;

fn build_hasher(&self) -> H {
H::default()
}
}

#[stable(since = "1.7.0", feature = "build_hasher")]
impl<H> Clone for BuildHasherDefault<H> {
fn clone(&self) -> BuildHasherDefault<H> {
BuildHasherDefault(marker::PhantomData)
}
}

#[stable(since = "1.7.0", feature = "build_hasher")]
impl<H> Default for BuildHasherDefault<H> {
fn default() -> BuildHasherDefault<H> {
BuildHasherDefault(marker::PhantomData)
}
}

// The HashState trait is super deprecated, but it's here to have the blanket
// impl that goes from HashState -> BuildHasher

/// Deprecated, renamed to `BuildHasher`
#[unstable(feature = "hashmap_hasher", reason = "hasher stuff is unclear",
issue = "27713")]
#[rustc_deprecated(since = "1.7.0", reason = "support moved to std::hash and \
renamed to BuildHasher")]
pub trait HashState {
/// Type of the hasher that will be created.
type Hasher: Hasher;

/// Creates a new hasher based on the given state of this object.
fn hasher(&self) -> Self::Hasher;
}

#[unstable(feature = "hashmap_hasher", reason = "hasher stuff is unclear",
issue = "27713")]
#[allow(deprecated)]
impl<T: HashState> BuildHasher for T {
type Hasher = T::Hasher;
fn build_hasher(&self) -> T::Hasher { self.hasher() }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me why we take &self here, given that you're describing the trait in terms of being state that's used to generate new hashers? We seem to force you to interior mutability here, presumably for good reason?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each T::Hasher has to have the same initial state - otherwise hashing the same thing twice wouldn't have the same hash. When this function is called, it shouldn't be doing a whole lot.

For example a factory constructed with a random seed. You don't see the non-random seed in my code because it's piggybacking on the implementation provided by DefaultState

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think the problem is just that the docs here are very misleading:

  • "A trait representing stateful hashes ..."
  • "Creates a new hasher based on the given state of this object"

Both of those strongly suggested to me that mutation is expected here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh actually the HashState trait is basically dead/to be ignored, the relevant documentation on the BuildHasher trait (the real one now) is:

/// A trait representing an object which maintains state to create new instances
/// of the `Hasher` trait.                                                      
///                                                                             
/// A `BuildHasher` is typically used as a factory for instances of `Hasher`    
/// which a `HashMap` can then use to hash keys independently.                  

Does that sound ok?

I'll clear out the old documentation in favor of just saying it's deprecated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are the docs I was referencing :) Look above -- the BuildHasher trait method docs say "Creates a new hasher based on the given state of this object", which is what I was quoting. My point is just that these docs strongly suggest mutable state to me. Not a huge deal, but I found it pretty confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok I think I see what you're getting at now. Do you have a rewording in mind? I could try and wordsmith a bit as well and see what I can come up with

}

//////////////////////////////////////////////////////////////////////////////

mod impls {
Expand Down
1 change: 0 additions & 1 deletion src/librustc/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@
#![feature(collections)]
#![feature(const_fn)]
#![feature(enumset)]
#![feature(hashmap_hasher)]
#![feature(iter_arith)]
#![feature(libc)]
#![feature(nonzero)]
Expand Down
5 changes: 2 additions & 3 deletions src/librustc/util/common.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,9 @@

use std::cell::{RefCell, Cell};
use std::collections::HashMap;
use std::collections::hash_state::HashState;
use std::ffi::CString;
use std::fmt::Debug;
use std::hash::Hash;
use std::hash::{Hash, BuildHasher};
use std::iter::repeat;
use std::path::Path;
use std::time::Instant;
Expand Down Expand Up @@ -217,7 +216,7 @@ pub trait MemoizationMap {
}

impl<K, V, S> MemoizationMap for RefCell<HashMap<K,V,S>>
where K: Hash+Eq+Clone, V: Clone, S: HashState
where K: Hash+Eq+Clone, V: Clone, S: BuildHasher
{
type Key = K;
type Value = V;
Expand Down
11 changes: 5 additions & 6 deletions src/librustc_data_structures/fnv.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,21 +9,20 @@
// except according to those terms.

use std::collections::{HashMap, HashSet};
use std::collections::hash_state::DefaultState;
use std::default::Default;
use std::hash::{Hasher, Hash};
use std::hash::{Hasher, Hash, BuildHasherDefault};

pub type FnvHashMap<K, V> = HashMap<K, V, DefaultState<FnvHasher>>;
pub type FnvHashSet<V> = HashSet<V, DefaultState<FnvHasher>>;
pub type FnvHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FnvHasher>>;
pub type FnvHashSet<V> = HashSet<V, BuildHasherDefault<FnvHasher>>;

#[allow(non_snake_case)]
pub fn FnvHashMap<K: Hash + Eq, V>() -> FnvHashMap<K, V> {
Default::default()
HashMap::default()
}

#[allow(non_snake_case)]
pub fn FnvHashSet<V: Hash + Eq>() -> FnvHashSet<V> {
Default::default()
HashSet::default()
}

/// A speedy hash algorithm for node ids and def ids. The hashmap in
Expand Down
1 change: 0 additions & 1 deletion src/librustc_data_structures/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,6 @@
html_favicon_url = "https://www.rust-lang.org/favicon.ico",
html_root_url = "https://doc.rust-lang.org/nightly/")]

#![feature(hashmap_hasher)]
#![feature(nonzero)]
#![feature(rustc_private)]
#![feature(staged_api)]
Expand Down
15 changes: 7 additions & 8 deletions src/libserialize/collection_impls.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,7 @@

//! Implementations of serialization for structures found in libcollections

use std::hash::Hash;
use std::collections::hash_state::HashState;
use std::hash::{Hash, BuildHasher};
use std::mem;

use {Decodable, Encodable, Decoder, Encoder};
Expand Down Expand Up @@ -159,7 +158,7 @@ impl<
impl<K, V, S> Encodable for HashMap<K, V, S>
where K: Encodable + Hash + Eq,
V: Encodable,
S: HashState,
S: BuildHasher,
{
fn encode<E: Encoder>(&self, e: &mut E) -> Result<(), E::Error> {
e.emit_map(self.len(), |e| {
Expand All @@ -177,12 +176,12 @@ impl<K, V, S> Encodable for HashMap<K, V, S>
impl<K, V, S> Decodable for HashMap<K, V, S>
where K: Decodable + Hash + Eq,
V: Decodable,
S: HashState + Default,
S: BuildHasher + Default,
{
fn decode<D: Decoder>(d: &mut D) -> Result<HashMap<K, V, S>, D::Error> {
d.read_map(|d, len| {
let state = Default::default();
let mut map = HashMap::with_capacity_and_hash_state(len, state);
let mut map = HashMap::with_capacity_and_hasher(len, state);
for i in 0..len {
let key = try!(d.read_map_elt_key(i, |d| Decodable::decode(d)));
let val = try!(d.read_map_elt_val(i, |d| Decodable::decode(d)));
Expand All @@ -195,7 +194,7 @@ impl<K, V, S> Decodable for HashMap<K, V, S>

impl<T, S> Encodable for HashSet<T, S>
where T: Encodable + Hash + Eq,
S: HashState,
S: BuildHasher,
{
fn encode<E: Encoder>(&self, s: &mut E) -> Result<(), E::Error> {
s.emit_seq(self.len(), |s| {
Expand All @@ -211,12 +210,12 @@ impl<T, S> Encodable for HashSet<T, S>

impl<T, S> Decodable for HashSet<T, S>
where T: Decodable + Hash + Eq,
S: HashState + Default,
S: BuildHasher + Default,
{
fn decode<D: Decoder>(d: &mut D) -> Result<HashSet<T, S>, D::Error> {
d.read_seq(|d, len| {
let state = Default::default();
let mut set = HashSet::with_capacity_and_hash_state(len, state);
let mut set = HashSet::with_capacity_and_hasher(len, state);
for i in 0..len {
set.insert(try!(d.read_seq_elt(i, |d| Decodable::decode(d))));
}
Expand Down
1 change: 0 additions & 1 deletion src/libserialize/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@ Core encoding and decoding interfaces.
#![feature(box_syntax)]
#![feature(collections)]
#![feature(enumset)]
#![feature(hashmap_hasher)]
#![feature(rustc_private)]
#![feature(staged_api)]
#![feature(str_char)]
Expand Down
Loading