Tracking issue for HashMap::raw_entry #56167
Comments
What is the motivation for having separate Could we consider merging these methods into a single one? Or is there some use case where the difference in behavior is useful? |
I am also extremely confused by this distinction, as my original designs didn't include them (I think?) and the documentation that was written is very unclear. |
cc @fintelia |
The reason I added
let key = map.iter().nth(rand() % map.len()).0.clone();
map.remove(&key);
I wanted to just be able to pick a random "bucket" and then get an entry/raw entry to the first element in it if any:
loop {
    if let Occupied(o) = map.raw_entry_mut().search_bucket(rand(), || true) {
        o.remove();
        break;
    }
}
(the probabilities aren't uniform in the second version, but close enough for my purposes) |
I continue to not want to support the "random deletion" usecase in std's HashMap. You really, really, really, should be using a linked hashmap or otherwise ordered map for that. |
I have removed this method in the hashbrown PR (#56241). Your code snippet for random deletion won't work with hashbrown anyways since it always checks the hash as part of the search process. |
It doesn't work in hashbrown anyways (see rust-lang#56167)
I can avoid unnecessary clones inherent to the original
#![feature(hash_raw_entry)]
use std::collections::HashMap;
let mut map = HashMap::new();
map.raw_entry_mut()
    .from_key("poneyland")
    .or_insert("poneyland", 3);
Currently I use the following function to hash once and automatically provide an owned key if necessary (somewhat similar to what was discussed in rust-lang/rfcs#1769):
use std::borrow::Borrow;
use std::collections::hash_map::RawEntryMut;
use std::hash::{BuildHasher, Hash, Hasher};
fn get_mut_or_insert_with<'a, K, V, Q, F>(
    map: &'a mut HashMap<K, V>,
    key: &Q,
    default: F,
) -> &'a mut V
where
    K: Eq + Hash + Borrow<Q>,
    Q: Eq + Hash + ToOwned<Owned = K>,
    F: FnOnce() -> V,
{
    let mut hasher = map.hasher().build_hasher();
    key.hash(&mut hasher);
    let hash = hasher.finish();
    match map.raw_entry_mut().from_key_hashed_nocheck(hash, key) {
        RawEntryMut::Occupied(entry) => entry.into_mut(),
        RawEntryMut::Vacant(entry) => {
            entry
                .insert_hashed_nocheck(hash, key.to_owned(), default())
                .1
        }
    }
}
Given If there isn't, why not saving the hash in |
I'm not yet very familiar with this API, but what @gdouezangrard suggested seems like a great idea to me. What even happens currently if the two hashes don't match, is the key-value pair then inserted into the wrong bucket? It's not clear to me from (quickly) reading the source code. |
I submitted rust-lang/hashbrown#54 to support using a If so, I'd be happy to submit a PR! |
This is a really great API, it's also what keeps crates ( What could be next steps here towards stabilization? |
Just gonna add another ping here -- what's blocking this right now? |
I see a few things that need to be resolved:
I would recommend prototyping in the hashbrown crate first, which can then be ported back into the std HashMap. |
I find I also would like to point out that
#![feature(hash_raw_entry)]
use std::collections::HashMap;
fn main() {
    let mut map = HashMap::new();
    map.raw_entry_mut().from_key(&42).or_insert(1, 2);
    println!("{}", map[&1]);
}
This is a bit like calling
#![feature(hash_raw_entry)]
use std::collections::hash_map::{HashMap, RawEntryMut};
fn main() {
    let mut map = HashMap::new();
    if let RawEntryMut::Vacant(_) = map.raw_entry_mut().from_key(&42) {
        map.insert(1, 2);
    }
    println!("{}", map[&1]);
}
I think the raw entry API is useful, but I don't think its API should be conflated with the entry API. |
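For contrast, here is a minimal sketch of the same "insert if absent" step written against the stable entry API, where the key used for the lookup is necessarily the key that gets inserted. The helper name `insert_if_absent` is made up for this illustration and is not part of std.

```rust
use std::collections::HashMap;

/// Hypothetical helper: inserts `value` under `key` unless an entry already
/// exists, and returns whatever value is stored afterwards.
fn insert_if_absent(map: &mut HashMap<i32, i32>, key: i32, value: i32) -> i32 {
    // `entry` consumes the key once; a vacant entry can only insert under
    // that same key, so lookup key and inserted key cannot diverge.
    *map.entry(key).or_insert(value)
}

fn main() {
    let mut map = HashMap::new();
    insert_if_absent(&mut map, 42, 2);
    // The value lives under 42; nothing was stored under an unrelated key.
    assert!(map.contains_key(&42));
    assert!(!map.contains_key(&1));
}
```

The trade-off is the one the raw entry API was meant to address: `entry` needs an owned key up front, even when the lookup hits.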
As discussed here: rust-lang/hashbrown#232
If the feature of a user specified hash is needed, it may be useful to instead provide a method on the raw entry to hash a key. That way the hashmap can implement this however it sees fit and the application code is less error prone because there is an unambiguous way to obtain the hash value if it is not known in advance. |
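As a rough illustration of what an unambiguous way to obtain the hash can look like on stable Rust, the sketch below hashes a key with the map's own hasher state through `BuildHasher::hash_one`. The free function `hash_for` is a hypothetical helper for this example, not a method that exists on `HashMap` or on any raw entry type.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash};

/// Hypothetical helper: hashes `key` with the hasher state owned by `map`,
/// so the caller never has to build a hasher by hand.
fn hash_for<K, V, S, Q>(map: &HashMap<K, V, S>, key: &Q) -> u64
where
    S: BuildHasher,
    Q: Hash + ?Sized,
{
    // `hash_one` creates a fresh hasher from the map's state, feeds it the
    // key, and finishes it in one call.
    map.hasher().hash_one(key)
}

fn main() {
    let map: HashMap<String, u32> = HashMap::new();
    // A borrowed `str` and an owned `String` hash identically, which is what
    // lets a borrowed key be used for lookups.
    let borrowed = hash_for(&map, "poneyland");
    let owned = hash_for(&map, &String::from("poneyland"));
    assert_eq!(borrowed, owned);
}
```

Because the hash comes from the map's own state, a value computed this way is only meaningful for that map (or for one sharing the same hasher state).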
The details of As for binary bloat, the concern you cited is why Rust allows function merging. If the implementations of two codepaths happen to be the same, their functions will simply point to the same code. |
Function merging happens across crates with different versions? Is it specific to functions with the same name from the same crate name or is there some kind of search algorithm to find all functions that are the same (eg if I copy paste a function to my crate, does it get deduped as well?). |
LLVM's |
I think that's the part I was trying to highlight - the odds of two hash table implementations being in the same compilation unit are low, and you're only going to see this maybe get resolved if you use full LTO, whereas most people at most use ThinLTO. I think code bloat is a valid concern and not one that's easily dismissed by "the optimizer will handle it" |
Much of it will be duplicated across CUs anyway by monomorphization and/or |
Rollup merge of rust-lang#137741 - cuviper:const_str-raw_entry, r=Mark-Simulacrum Stop using `hash_raw_entry` in `CodegenCx::const_str` That unstable feature (rust-lang#56167) completed fcp-close, so the compiler needs to be migrated away to allow its removal. In this case, `cg_llvm` and `cg_gcc` were using raw entries to optimize their `const_str_cache` lookup and insertion. We can change that to separate `get` and (on miss) `insert` calls, so we still have the fast path avoiding string allocation when the cache hits.
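The `get` then (on miss) `insert` pattern described in that commit message can be sketched with a small string cache. This is an illustrative stand-in, not the actual `const_str_cache` code from `cg_llvm` or `cg_gcc`; the `Interner` type and its integer ids are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical cache mapping a string to a small integer id.
struct Interner {
    ids: HashMap<String, usize>,
}

impl Interner {
    fn new() -> Self {
        Interner { ids: HashMap::new() }
    }

    fn intern(&mut self, s: &str) -> usize {
        // Fast path: on a hit the lookup borrows `s` and allocates nothing.
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        // Slow path: the owned key is allocated only on a miss, at the cost
        // of hashing `s` a second time inside `insert`.
        let id = self.ids.len();
        self.ids.insert(s.to_owned(), id);
        id
    }
}

fn main() {
    let mut interner = Interner::new();
    let first = interner.intern("hello");
    let again = interner.intern("hello");
    // The second call hits the cache and returns the same id.
    assert_eq!(first, again);
}
```

Compared with a raw entry, the miss path pays for one extra hash and probe, while the hit path stays allocation-free.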
Convert `ShardedHashMap` to use `hashbrown::HashTable` The `hash_raw_entry` feature (rust-lang#56167) has finished fcp-close, so the compiler should stop using it to allow its removal. Several `Sharded` maps were using raw entries to avoid re-hashing between shard and map lookup, and we can do that with `hashbrown::HashTable` instead.
Rollup merge of rust-lang#137701 - cuviper:sharded-hashtable, r=fmease Convert `ShardedHashMap` to use `hashbrown::HashTable` The `hash_raw_entry` feature (rust-lang#56167) has finished fcp-close, so the compiler should stop using it to allow its removal. Several `Sharded` maps were using raw entries to avoid re-hashing between shard and map lookup, and we can do that with `hashbrown::HashTable` instead.
Rollup merge of rust-lang#138425 - cuviper:remove-hash_raw_entry, r=jhpratt Remove `feature = "hash_raw_entry"` The `hash_raw_entry` feature finished [fcp-close](rust-lang#56167 (comment)) back in August, and its remaining uses in the compiler have now been removed, so we should be all clear to remove it from `std`. Closes rust-lang#56167
Remove `feature = "hash_raw_entry"` The `hash_raw_entry` feature finished [fcp-close](rust-lang/rust#56167 (comment)) back in August, and its remaining uses in the compiler have now been removed, so we should be all clear to remove it from `std`. Closes #56167
Added in #54043.
As of 6ecad33 / 2019-01-09, this feature covers:
… as well as Debug impls for each of the 5 new types, and their inherent methods.