Add complexity estimation of iterating over HashSet and HashMap
It is not obvious (at least to me) that the complexity of iterating over hash tables depends on capacity rather than length, especially compared with other containers such as Vec or String. I think this behaviour is worth mentioning.

I ran a benchmark that measures iteration time for maps of length 50 with different capacities and got these results:
```
capacity - time
64       - 203.87 ns
256      - 351.78 ns
1024     - 607.87 ns
4096     - 965.82 ns
16384    - 3.1188 us
```
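A measurement of this kind could be reproduced roughly as follows. This is only a minimal sketch, not the benchmark used for the numbers above: the helper names `sparse_map` and `time_iteration` are made up for illustration, timing uses a plain `Instant` rather than a benchmark harness, and absolute numbers will differ between machines.

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Builds a map with `len` entries and room for at least `capacity` entries.
/// (Hypothetical helper, not part of the commit.)
fn sparse_map(len: u64, capacity: usize) -> HashMap<u64, u64> {
    let mut map = HashMap::with_capacity(capacity);
    for i in 0..len {
        map.insert(i, i * 2);
    }
    map
}

/// Sums all values `rounds` times; returns the accumulated sum and elapsed time.
/// (Hypothetical helper, not part of the commit.)
fn time_iteration(map: &HashMap<u64, u64>, rounds: u32) -> (u64, Duration) {
    let start = Instant::now();
    let mut sum = 0u64;
    for _ in 0..rounds {
        // black_box discourages the optimizer from hoisting the loop body.
        sum = sum.wrapping_add(black_box(map).values().sum::<u64>());
    }
    (sum, start.elapsed())
}

fn main() {
    for capacity in [64, 256, 1024, 4096, 16384] {
        let map = sparse_map(50, capacity);
        let (sum, elapsed) = time_iteration(&map, 10_000);
        println!(
            "requested {capacity:>5}, actual {:>5}: {elapsed:?} (checksum {sum})",
            map.capacity()
        );
    }
}
```

The length stays at 50 in every run, so any growth in elapsed time comes from the extra capacity alone. Calling `shrink_to_fit` on a sparse map before a hot iteration loop is one way to bring the cost back in line with `len`.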

If you want to dig into why it behaves this way, you can look at the current implementation in the [hashbrown code](https://github.com/rust-lang/hashbrown/blob/f3a9f211d06f78c5beb81ac22ea08fdc269e068f/src/raw/mod.rs#L1933).
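The effect can be illustrated with a toy open-addressing table. This is a deliberately simplified model and not hashbrown's actual code (which scans control bytes in groups rather than one slot at a time); `ToyTable` and its methods are hypothetical names introduced only for this sketch.

```rust
/// Toy table: one slot per bucket, `None` marks an empty bucket.
/// (Illustrative model only, not how hashbrown stores entries.)
struct ToyTable {
    buckets: Vec<Option<(u64, u64)>>,
}

impl ToyTable {
    fn with_buckets(n: usize) -> Self {
        ToyTable { buckets: vec![None; n] }
    }

    /// Linear-probing insert; assumes the table never becomes full.
    fn insert(&mut self, key: u64, value: u64) {
        let n = self.buckets.len();
        let mut i = (key as usize) % n;
        loop {
            if let Some((k, _)) = self.buckets[i] {
                if k != key {
                    // Slot taken by another key: probe the next bucket.
                    i = (i + 1) % n;
                    continue;
                }
            }
            self.buckets[i] = Some((key, value));
            return;
        }
    }

    /// Walks the table the way an iterator must: every bucket is inspected,
    /// occupied or not. Returns (entries yielded, buckets visited).
    fn iterate(&self) -> (usize, usize) {
        let mut yielded = 0;
        let mut visited = 0;
        for slot in &self.buckets {
            visited += 1;
            if slot.is_some() {
                yielded += 1;
            }
        }
        (yielded, visited)
    }
}

fn main() {
    let mut table = ToyTable::with_buckets(1024);
    for key in 0..50 {
        table.insert(key, key * 2);
    }
    let (yielded, visited) = table.iterate();
    println!("yielded {yielded} entries after visiting {visited} buckets");
}
```

Because occupied slots are scattered across the bucket array, the walk cannot skip the empty ones without inspecting them, so the work is proportional to the number of buckets rather than the number of entries.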

The benchmark code will be presented in the PR related to this commit.
AngelicosPhosphoros committed May 20, 2022
1 parent cd73afa commit de97d73
Showing 2 changed files with 50 additions and 0 deletions.
40 changes: 40 additions & 0 deletions library/std/src/collections/hash/map.rs
@@ -344,6 +344,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// println!("{key}");
/// }
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over keys takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[stable(feature = "rust1", since = "1.0.0")]
pub fn keys(&self) -> Keys<'_, K, V> {
Keys { inner: self.iter() }
@@ -370,6 +375,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// vec.sort_unstable();
/// assert_eq!(vec, ["a", "b", "c"]);
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over keys takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[inline]
#[rustc_lint_query_instability]
#[stable(feature = "map_into_keys_values", since = "1.54.0")]
@@ -395,6 +405,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// println!("{val}");
/// }
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over values takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[stable(feature = "rust1", since = "1.0.0")]
pub fn values(&self) -> Values<'_, K, V> {
Values { inner: self.iter() }
@@ -422,6 +437,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// println!("{val}");
/// }
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over values takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[stable(feature = "map_values_mut", since = "1.10.0")]
pub fn values_mut(&mut self) -> ValuesMut<'_, K, V> {
ValuesMut { inner: self.iter_mut() }
@@ -448,6 +468,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// vec.sort_unstable();
/// assert_eq!(vec, [1, 2, 3]);
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over values takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[inline]
#[rustc_lint_query_instability]
#[stable(feature = "map_into_keys_values", since = "1.54.0")]
@@ -473,6 +498,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// println!("key: {key} val: {val}");
/// }
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over map takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[rustc_lint_query_instability]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn iter(&self) -> Iter<'_, K, V> {
@@ -503,6 +533,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// println!("key: {key} val: {val}");
/// }
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over map takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[rustc_lint_query_instability]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn iter_mut(&mut self) -> IterMut<'_, K, V> {
@@ -633,6 +668,11 @@ impl<K, V, S> HashMap<K, V, S> {
/// map.retain(|&k, _| k % 2 == 0);
/// assert_eq!(map.len(), 4);
/// ```
///
/// # Performance
///
/// In the current implementation, this operation takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[inline]
#[rustc_lint_query_instability]
#[stable(feature = "retain_hash_collection", since = "1.18.0")]
10 changes: 10 additions & 0 deletions library/std/src/collections/hash/set.rs
@@ -184,6 +184,11 @@ impl<T, S> HashSet<T, S> {
/// println!("{x}");
/// }
/// ```
///
/// # Performance
///
/// In the current implementation, iterating over set takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[inline]
#[rustc_lint_query_instability]
#[stable(feature = "rust1", since = "1.0.0")]
@@ -312,6 +317,11 @@ impl<T, S> HashSet<T, S> {
/// set.retain(|&k| k % 2 == 0);
/// assert_eq!(set.len(), 3);
/// ```
///
/// # Performance
///
/// In the current implementation, this operation takes O(capacity) time
/// instead of O(len) because it internally visits empty buckets too.
#[rustc_lint_query_instability]
#[stable(feature = "retain_hash_collection", since = "1.18.0")]
pub fn retain<F>(&mut self, f: F)