Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer ordered collections (BTreeMap / BTreeSet) over hash based collections #33

Closed
wants to merge 0 commits into from

Conversation

rklaehn
Copy link

@rklaehn rklaehn commented Jul 24, 2020

This is a matter of preference/taste. So take it or leave it :-)

  • Order based collections in general are not slower than hash based collections for the cases we are looking at here
    see e.g. https://users.rust-lang.org/t/hashmap-vs-btreemap/13804
    for peer ids, the Ord instance is very fast since they are essentially random and therefore will differ in some way in the first few bytes
    to get best performance out of rust hash based collections, you need to use a quick, deterministic hasher like https://docs.rs/fnv/1.0.7/fnv/struct.FnvHasher.html . But then you are open to hash collision attacks again. Using ordered collections completely avoids the problem and is IMHO much simpler.
  • Hash-based collections with the default hasher intentionally introduce non-determinism (see https://doc.rust-lang.org/beta/std/collections/hash_map/struct.RandomState.html ) to prevent hash collision attacks. However, I think in distributed systems you already have enough nondeterminism, you don't need to add some for completely deterministic things
  • Hash-based collections cause things like debug output to be non-deterministic, complicating debugging

@rklaehn rklaehn force-pushed the rkl/ordered-collections branch 2 times, most recently from f2e7dd0 to 1834c5b Compare July 24, 2020 09:58
for (peer, controls) in self.control_pool.drain().collect::<Vec<_>>() {
let mut old = BTreeMap::new();
std::mem::swap(&mut self.control_pool, &mut old);
for (peer, controls) in old.into_iter() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary because BTreeMap does not have drain(), so basically manual drain.

fanout.remove(&topic_hash);
return false;
let now = Instant::now();
let expired_topics = self.fanout_last_pub.iter().filter_map(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This two stage approach is necessary because BTreeMap does not yet have retain. It is in the works, but for now we first have to figure out what to remove, and then actually remove it.

For the probably quite common case where nothing is removed, this does not allocate anything.

@rklaehn
Copy link
Author

rklaehn commented Jul 24, 2020

Not sure why ci is failing. Ran tests and doc locally without issue. Might be a fluke.

@rklaehn rklaehn force-pushed the rkl/ordered-collections branch from fe76e95 to f9cfc8d Compare July 24, 2020 13:40
@rklaehn rklaehn closed this Jul 24, 2020
@rklaehn rklaehn force-pushed the rkl/ordered-collections branch from 0de44d7 to 7b76c8f Compare July 24, 2020 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant