Implement TreeBins #72
Conversation
Hm, is something up with CI? I wanted to start this as a draft to check ASAN, but CI seemed unhappy to start so I marked this as ready. Still doesn't seem to get going :/
Hmm, could you try rebasing on master and force-pushing?
Force-pushed from 624de8c to 959634a.
Aha! There seem to be some remaining issues, but at least I can address them now!
Force-pushed from 6056a07 to 6f3f51c.
Managed to figure it out in my remaining time this morning before I have to leave. It was the re-used bins that leaked memory; I just had to account for the fact that, additionally, values are always re-used in the new table and cannot be dropped with both the old and the new bin. miri seems to be unavailable again, but the other checks all pass now, so this should actually be ready for review 😊
This looks great to me! I've left a number of inline comments that should be quick to resolve. Only one is critical (a missed case in the red-black tree implementation). I have no idea whether it'll be easier to follow this review on GitHub (I think reviewable just posts a giant comment) or on Reviewable directly. Hope it's not too painful!
Reviewed 3 of 6 files at r1, 6 of 6 files at r2.
Reviewable status: all files reviewed, 85 unresolved discussions (waiting on @domenicquirl)
src/lib.rs, line 59 at r2 (raw file):
//! exactly the same [`Hash`](std::hash::Hash) value is a sure way to slow down performance of any
//! hash table. To ameliorate impact, keys are required to be [`Ord`](std::cmp::Ord), which may
//! be used by the map to help break ties.
This seems somewhat misleading to me. We don't really use it to break ties, we use it to more efficiently handle large numbers of keys that hash to the same bucket. I know this is a relatively straightforward paraphrasing of the Java docs, but I think we can do better here.
src/lib.rs, line 64 at r2 (raw file):
//! TODO: dynamic load factor
//! */
//! # Set projections

`HashSet` is not quite the same as what the Java code refers to as a "set projection". What they are referring to is a set view of the map (which essentially gives you just the keys), as demonstrated here:

flurry/jsr166/src/ConcurrentHashMap.java, line 1207 in 13db2fb:
public KeySetView<K,V> keySet() {
src/lib.rs, line 145 at r2 (raw file):
Nit: I prefer the Java comment's present-tense form here:
So we use a secondary strategy that applies when the number of nodes in a bin exceeds a threshold.
src/lib.rs, line 151 at r2 (raw file):
//! given that N cannot exceed `(1<<64)` (before running out of adresses) this bounds search steps,
//! lock hold times, etc, to reasonable constants (roughly 100 nodes inspected per operation worst
//! case) as keys are required to be comparable ([`Ord`](std::cmp::Ord)). `BinEntry::Tree` nodes
Nit: I think the note about keys being "comparable" here is unnecessary. It is only there in the Java case because they cannot guarantee a bounded number of search steps if the key is not comparable.
src/lib.rs, line 210 at r2 (raw file):
* Java's `Comparable` interface. */
//! `BinEntry::Tree` bins use a special form of comparison for search and related operations (which
//! is the main reason they do not use existing collections such as tree maps). The contained tree
Nit: Here too I prefer the more casual tone used in the Java docs of "we cannot use".
src/lib.rs, line 218 at r2 (raw file):
//! always possible by readers even during updates, tree traversal is not, mainly because of
//! tree-rotations that may change the root node and/or its linkages. Tree bins include a simple
//! reaad-write lock mechanism parasitic on the main bin-synchronization strategy: Structural
Nit: "reaad" -> "read"
src/map.rs, line 935 at r2 (raw file):
tree_node.node.key.clone(),
tree_node.node.value.clone(),
hash,

Nit: very minor preference, but can we keep the constructor argument order the same as the Java code (`hash` comes first)?
src/map.rs, line 995 at r2 (raw file):
// bin here, since it will be swapped for a Moved entry
// while we are still behind the bin lock.
if high_count != 0 {

Nit: clippy will probably complain about this not using `else if`.
src/map.rs, line 1003 at r2 (raw file):
// safety: we have just created `low` and its `next`
// nodes and have never shared them
unsafe { TreeBin::drop_tree_nodes(low, false, guard) };
I wish there was a way for us to not even allocate nodes if we determine we can re-use the whole tree. Not sure if that'd work out in practice though. Especially because it would also save us from cloning all the keys and values.
src/map.rs, line 1034 at r2 (raw file):
// if we did not re-use the old bin, it is now garbage,
// since all of its nodes have been reallocated. However,
// we always re-use the stored values, so we can't drop those.

I think it should be fine to just use `defer_destroy` here like in the linear case. Since the values are already behind `Atomic`, they won't be automatically dropped just because the containing node is dropped.
src/map.rs, line 1227 at r2 (raw file):
where
    K: Borrow<Q>,
    Q: ?Sized + Hash + Ord,

I think you'll want to change this whole impl block to be `K: Ord` rather than `K: Eq`. Note also that we now require that the borrowed form of `K` implements `Ord` the same way that `Q` does.
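The bound change being suggested can be sketched in isolation. This toy lookup (hypothetical names, not flurry's API) shows why the borrowed form `Q` must implement `Ord` consistently with `K` once the map relies on ordering:

```rust
use std::borrow::Borrow;

// Hypothetical sketch: a lookup generic over a borrowed key form, mirroring
// the suggested `K: Ord` plus consistent `Q: Ord` bounds. `binary_search_by`
// only returns correct results if `Q`'s ordering agrees with `K`'s.
fn contains_sorted<K, Q>(keys: &[K], needle: &Q) -> bool
where
    K: Borrow<Q> + Ord,
    Q: ?Sized + Ord,
{
    keys.binary_search_by(|k| k.borrow().cmp(needle)).is_ok()
}
```

For example, `contains_sorted(&[String::from("a")], "a")` works because `String: Borrow<str>` and the two orderings agree.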
src/map.rs, line 1373 at r2 (raw file):
}

impl<K, V, S> HashMap<K, V, S>
See the note above that should make splitting this impl block unnecessary.
src/map.rs, line 1414 at r2 (raw file):
}

// slow path -- bin is non-empty
Not sure why this comment went away?
src/map.rs, line 1493 at r2 (raw file):
// unlink it from the map to prevent others from entering it
tab.store_bin(idx, Shared::null());
// next, walk the nodes of the bin and free the nodes and their values as we go

I'm not sure I see why we have to walk the nodes of the tree while still holding the lock? Once we store `NULL`, we must be the owners.
src/map.rs, line 1494 at r2 (raw file):
tab.store_bin(idx, Shared::null());
// next, walk the nodes of the bin and free the nodes and their values as we go
// note that we do not free the head node yet, since we're holding the lock it contains
This comment is now a little misleading, since we're not freeing anything in this loop, only counting!
src/map.rs, line 1616 at r2 (raw file):
// fast path -- bin is empty so stick us at the front
let node = Owned::new(BinEntry::Node(Node {
    key: key.clone(),

I wonder if we can be a little smarter here. Move the `key`, and then move the key back out of `changed.new` if the `cas_bin` fails.
src/map.rs, line 1780 at r2 (raw file):
bin_count = 2;
let p = tree_bin.find_or_put_tree_val(hash, key, value, guard);
if !p.is_null() {

Can we perhaps invert this condition and `break` to avoid indenting the whole body?
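The suggested refactor is the usual guard-clause pattern. A minimal toy (unrelated to the map's actual loop) of inverting a condition so the body stays at the outer indentation level:

```rust
// Toy illustration of turning `if !p.is_null() { ...body... }` into a guard
// that exits the iteration early, keeping the main body unindented.
fn first_negative(values: &[i32]) -> Option<i32> {
    for &v in values {
        if v >= 0 {
            continue; // guard clause: skip the uninteresting case early
        }
        return Some(v); // main body stays at the outer level
    }
    None
}
```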
src/map.rs, line 1837 at r2 (raw file):
// reach this point if the bin changed while obtaining the lock.
// However, our code doesn't (it uses continue) and `bin_count`
// _cannot_ be 0 at this point.
Nit: should we assert that?
src/map.rs, line 1841 at r2 (raw file):
self.treeify_bin(t, bini, guard);
}
guard.flush();

In the old code, the `flush` here is placed after the call to `add_count`. Not sure if that makes a meaningful difference?
src/map.rs, line 1847 at r2 (raw file):
break;
}
// increment count, since we only get here if we did not return an old (updated) value

Nit: should we assert that `old_val.is_none()`?
src/map.rs, line 2084 at r2 (raw file):
if !root.is_null() {
    new_val = {
        let p = TreeNode::find_tree_node(root, hash, key, guard);

Any particular reason why this isn't a method on `TreeBin`?
src/map.rs, line 2135 at r2 (raw file):
// the beginning of the search, the value cannot be dropped until the
// next epoch, which won't arrive until after we drop our guard.
Some(unsafe { n.value.load(Ordering::SeqCst, guard).deref() })

Nit: We could probably avoid the load here if we just used `Owned::into_shared` above. Alternatively, this can at least be downgraded to a `Relaxed` load.
src/map.rs, line 2139 at r2 (raw file):
removed_node = true;
// remove the BinEntry::TreeNode containing the removed key value pair from the bucket
// safety: `p` is marked for garbage collection below if the bin gets untreeified
Not sure I follow this safety argument.
src/map.rs, line 2161 at r2 (raw file):
guard.defer_destroy(bin);
guard.defer_destroy(p);
}

BTW I wonder if we should start doing all `defer_destroy`s after dropping the head lock (this doesn't just apply to tree bins). May not make much of a difference, I'm not sure, but something worth thinking about. In particular, it would shorten the critical sections for inserts, which should in turn allow for more concurrency.
src/map.rs, line 2178 at r2 (raw file):
// However, our code doesn't (it uses continue) and making this
// break conditional confuses the compiler about how many times we
// use the `remapping_function`.

Nit: should we assert that `bin_count != 0`?
src/map.rs, line 2251 at r2 (raw file):
}

/// Replaces node value with `new_value`, conditional upon match of `observed_value`.
"conditional upon match" is really hard to read compared to "if it is equal to". We should fix the variable names, but retain the sentence structure.
src/map.rs, line 2304 at r2 (raw file):
let is_remove = new_value.is_none();
let mut old_val: Option<(&'g K, Shared<'g, V>)> = None;
Nit: these can both go outside the loop I believe.
src/map.rs, line 2402 at r2 (raw file):
let root = tree_bin.root.load(Ordering::SeqCst, guard);
if !root.is_null() {

Can we perhaps invert this condition and `break` to avoid indenting the whole body?
src/map.rs, line 2404 at r2 (raw file):
if !root.is_null() {
    let p = TreeNode::find_tree_node(root, hash, key, guard);
    if !p.is_null() {
Same here.
src/map.rs, line 2415 at r2 (raw file):
let pv = n.value.load(Ordering::SeqCst, guard);
// just remove the node if the value is the one we expected at method call

This comment is a little misleading since we only remove if `new_node.is_none()`.
src/map.rs, line 2571 at r2 (raw file):
} else {
    let bin = tab.bin(index, guard);
    if !bin.is_null() {
Can we invert this and return early to avoid indenting everything?
src/map.rs, line 2578 at r2 (raw file):
let lock = node.lock.lock();
// check if `bin` is still the head
if tab.bin(index, guard) == bin {

Same here — let's invert and early return (that also matches the pattern we've used in `replace_node` and `put` of using `continue` if `current_head != bin`).
src/map.rs, line 2581 at r2 (raw file):
let mut e = bin;
let mut head = Shared::null();
let mut last_tree_node = Shared::null();

Nit: any particular reason not to use `tail` here to match the Java code and for duality with `head`?
src/map.rs, line 2598 at r2 (raw file):
let new_tree_node = TreeNode::new(
    e_deref.key.clone(),
    e_deref.value.clone(),
    // this uses a load with Ordering::Relaxed, but write access is synchronized through the bin lock

Nit: this seems important enough that I'd place it on its own line above the call to `TreeNode::new` with a `// NOTE:`.
src/map.rs, line 2624 at r2 (raw file):
}
tab.store_bin(index, Owned::new(BinEntry::Tree(TreeBin::new(head, guard))));
// make sure the old bin entries get dropped
We can do this after dropping the head lock.
src/map.rs, line 2653 at r2 (raw file):
}
drop(lock);
}

I'd like an `else` here with `unreachable!` to ensure we never try to `treeify` a tree, for example.
src/map.rs, line 2661 at r2 (raw file):
/// _not_ clean up old TreeNodes, as they may still be reachable.
fn untreeify<'g>(
    node: Shared<'g, BinEntry<K, V>>,

Should this argument be called `bin` instead of `node`?
src/map.rs, line 2665 at r2 (raw file):
) -> Shared<'g, BinEntry<K, V>> {
    let mut head = Shared::null();
    let mut last_node: Shared<'_, BinEntry<K, V>> = Shared::null();

Nit: any particular reason not to use `tail` like the Java code? Also, why is it necessary to give the type explicitly?
src/map.rs, line 2670 at r2 (raw file):
// safety: we only untreeify sequences of TreeNodes which either
//  - were just created (e.g. in transfer) and are thus valid, or
//
Nit: extraneous whitespace
src/map.rs, line 2677 at r2 (raw file):
// remains a valid pointer at least until our guard is dropped.
let q_deref = unsafe { q.deref() }.as_tree_node().unwrap();
let new_node = Owned::new(BinEntry::Node(Node {

BTW This constructor is used a lot throughout the code. May be time we introduce a `Node::new`.
src/map.rs, line 2677 at r2 (raw file):
// remains a valid pointer at least until our guard is dropped.
let q_deref = unsafe { q.deref() }.as_tree_node().unwrap();
let new_node = Owned::new(BinEntry::Node(Node {

Doesn't `TreeNode` already contain a `Node`? Can't we just clone that directly?
src/node.rs, line 6 at r2 (raw file):
use parking_lot::Mutex;
use std::borrow::Borrow;
use std::thread::{current, park, Thread};

We may want to use `parking_lot`'s thread parking infrastructure here rather than `std::thread`.
src/node.rs, line 141 at r2 (raw file):
let p_deref = unsafe { Self::get_tree_node(p) };
let p_left = p_deref.left.load(Ordering::SeqCst, guard);
let p_right = p_deref.right.load(Ordering::SeqCst, guard);

We only need either `p_left` or `p_right` if `p_hash != hash`. I think we should just read them in the specific branch arm where they're needed below to speed up the common case.
src/node.rs, line 149 at r2 (raw file):
continue;
}
if p_hash < hash {

Nit: should probably be an `else if`.
src/node.rs, line 167 at r2 (raw file):
continue;
}
if p_right.is_null() {

Nit: should probably be an `else if`. Alternatively could write this as a `match` on `(p_left.is_null(), p_right.is_null())`.
src/node.rs, line 173 at r2 (raw file):
// Otherwise, we compare keys to find the next child to look at.
let dir = match p_key.borrow().cmp(&key) {

I don't think there is any reason to cast the result of `cmp` to an integer and then `if` on the integer below. Just use the `Ordering` directly to decide `p_left` vs `p_right`.
src/node.rs, line 186 at r2 (raw file):
// _compare_ equal (k.compareTo(p_key) == 0). In this case, both
// children are searched. Since `Eq` and `Ord` must match, these
// cases cannot occur here.

BTW It is really weird to me that they also have an `else` case here. What's that about?
src/node.rs, line 215 at r2 (raw file):
K: Ord,
{
    pub(crate) fn new<'g>(bin: Shared<'g, BinEntry<K, V>>, guard: &'g Guard) -> Self {

It would be good to document somewhere that it is the `TreeBin` constructor that actually constructs the red-black tree. Elsewhere in the code when we construct `TreeNode`s (like in `transfer` and `treeify`), we just set `next`, and rely on the actual treeification happening when the `TreeBin` wrapper is added.
src/node.rs, line 222 at r2 (raw file):
// safety: when creating a new TreeBin, the nodes used are created just
// for this bin and not shared with another thread, so they cannot get
// invalidated.

Can we have this reflected in the method signature somehow? Like take an `Owned`?
src/node.rs, line 249 at r2 (raw file):
let p_key = &p_deref.node.key;
let p_hash = p_deref.node.hash;
let dir = match p_hash.cmp(&hash) {

Here too I think we shouldn't go via numbers. For the `Equal` case, use `Ordering::then`.
src/node.rs, line 263 at r2 (raw file):
let xp = p;
p = if dir <= 0 {
    p_deref.left.load(Ordering::SeqCst, guard)

Since we are constructing the tree, I think we are the ones who set all the `left` and `right` fields, so a `Relaxed` load should be sufficient.
src/node.rs, line 287 at r2 (raw file):
}
if cfg!(debug) {

Nit: this should probably be `cfg!(debug_assertions)`.
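The distinction matters because `cfg!(debug)` tests a custom `--cfg debug` flag that Cargo never sets, so it is always false in practice, while `cfg!(debug_assertions)` tracks the build profile:

```rust
// `cfg!(debug_assertions)` is true in debug builds and false with
// `--release` (unless overridden in the profile). `cfg!(debug)` would only
// be true if someone explicitly passed `--cfg debug` to rustc.
fn invariant_checks_enabled() -> bool {
    cfg!(debug_assertions)
}
```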
src/node.rs, line 350 at r2 (raw file):
} else if waiting {
    park();
}

This should probably have a call to `spin_loop_hint`.
src/node.rs, line 419 at r2 (raw file):
};
if bin_deref.lock_state.fetch_add(-READER, Ordering::SeqCst) == (READER | WRITER) {
    // check if another thread is waiting and, if so, unpark it
Nit: I'd add "we were the last reader holding up a waiting writer".
src/node.rs, line 422 at r2 (raw file):
let waiter = &bin_deref.waiter.load(Ordering::SeqCst, guard);
if !waiter.is_null() {
    // safety: we do not destroy waiting threads

Which is probably a problem. We should make sure to destroy the old `Thread` handle whenever there is one when we update `waiter`.
src/node.rs, line 433 at r2 (raw file):
}

/// Unlinks the given node, which must be present before this call. This is
Nit: include a double newline before "This is" so we get nice docs.
src/node.rs, line 464 at r2 (raw file):
let next = p_deref.node.next.load(Ordering::SeqCst, guard);
let prev = p_deref.prev.load(Ordering::SeqCst, guard);

// unlink traversal pointers
Nit: let's add an empty line above this comment for clarity.
src/node.rs, line 491 at r2 (raw file):
// if we are now too small to be a `TreeBin`, we don't need to worry
// about restructuring the tree since the bin will be untreeified
// anyway, so we check that

A comment on what the heuristic here does would be good.
src/node.rs, line 493 at r2 (raw file):
// anyway, so we check that
let mut root = self.root.load(Ordering::SeqCst, guard);
if root.is_null()

I think we can assert that `!root.is_null()`, no?
src/node.rs, line 508 at r2 (raw file):
.left
.load(Ordering::SeqCst, guard)
.is_null()
What a bizarre heuristic, but I guess it is what the Java code does...
src/node.rs, line 535 at r2 (raw file):
successor_deref
    .red
    .store(p_deref.red.load(Ordering::Relaxed), Ordering::Relaxed);

Is the reason why this `store` is okay to be `Relaxed` in the context of lock-free reads the `compare_and_swap` that `find` attempts? Same with the various other `Relaxed` stores below.
src/node.rs, line 540 at r2 (raw file):
let successor_right = successor_deref.right.load(Ordering::Relaxed, guard);
let p_parent = p_deref.parent.load(Ordering::Relaxed, guard);
if successor == p_right {

Nit: a comment here saying that we're going to swap `p` and `successor` would be good.
src/node.rs, line 569 at r2 (raw file):
}
}
p_deref.left.store(Shared::null(), Ordering::Relaxed);

Nit: maybe assert `successor_left.is_null()`?
src/node.rs, line 610 at r2 (raw file):
replacement = p_right;
} else {
    replacement = p;

I think this case is unreachable, since it would imply that `p` was the only node, which is already handled in the code before we take the root lock.
src/node.rs, line 613 at r2 (raw file):
}

if replacement != p {

Even though it's basic binary tree stuff, I think it'd be good to remind the reader what's going on here. We have swapped `p` with `successor`, which immediately follows it in `(hash, key)` order. We now want to get rid of `p` completely. If `p` now has no children (i.e., if `successor` had no right child (it can never have a left child -- it wouldn't be the successor of `p`)), then the job is trivial since `p` is a leaf: we just unlink it from its parent (further below). If it has one child (which must be a right child), then we can just replace `p` with that right child directly.
src/node.rs, line 648 at r2 (raw file):
);

if p == replacement {

I wonder why this happens after the rebalancing rather than next to the `if replacement == p`.
src/node.rs, line 651 at r2 (raw file):
let p_parent = p_deref.parent.load(Ordering::Relaxed, guard);
if !p_parent.is_null() {
    if p == TreeNode::get_tree_node(p_parent)

Nit: probably worth storing `TreeNode::get_tree_node(p_parent)` into a temporary variable here for clarity.
src/node.rs, line 668 at r2 (raw file):
}
p_deref.parent.store(Shared::null(), Ordering::Relaxed);

Nit: should we assert here that `left` and `right` are both `is_null`?
src/node.rs, line 678 at r2 (raw file):
// that pins an epoch <= our epoch, and thus have to be released before
// `p` is actually dropped.
#[allow(unused_unsafe)]
Why is this needed?
src/node.rs, line 684 at r2 (raw file):
}
self.unlock_root();

We should hoist this unlock to above the `defer_destroy`.
src/node.rs, line 685 at r2 (raw file):
self.unlock_root();
if cfg!(debug) {

Should be `cfg!(debug_assertions)`.
src/node.rs, line 697 at r2 (raw file):
impl<K, V> TreeBin<K, V>
where
    K: Ord,

This probably requires at least `Send + Sync`.
src/node.rs, line 733 at r2 (raw file):
let p_deref = unsafe { TreeNode::get_tree_node(p) };
let p_hash = p_deref.node.hash;
let dir = match p_hash.cmp(&hash) {
Same here as earlier: let's avoid the use of integers for comparisons.
src/node.rs, line 818 at r2 (raw file):
}
if cfg!(debug) {

`cfg!(debug_assertions)`
src/node.rs, line 899 at r2 (raw file):
/// TreeNode.
#[inline]
pub(crate) unsafe fn get_tree_node<'g>(bin: Shared<'g, BinEntry<K, V>>) -> &'g TreeNode<K, V> {

Nit: this particular method probably doesn't belong in this `impl` block given its comment.
src/node.rs, line 903 at r2 (raw file):
}

fn rotate_left<'g>(

I think this (and all the code below) is executed while holding the root lock, which means the loads can at least be `Relaxed`?
src/node.rs, line 918 at r2 (raw file):
let p_deref = unsafe { Self::get_tree_node(p) };
let right = p_deref.right.load(Ordering::SeqCst, guard);
if !right.is_null() {
Nit: I'd invert this condition.
src/node.rs, line 964 at r2 (raw file):
let p_deref = unsafe { Self::get_tree_node(p) };
let left = p_deref.left.load(Ordering::SeqCst, guard);
if !left.is_null() {
Nit: I'd invert this condition.
src/node.rs, line 1012 at r2 (raw file):
let mut x_parent_parent: Shared<'_, BinEntry<K, V>>;
let mut x_parent_parent_left: Shared<'_, BinEntry<K, V>>;
let mut x_parent_parent_right: Shared<'_, BinEntry<K, V>>;

All of the `unsafe { Self::get_tree_node(...) }` noise makes the code below really hard to read :'( Maybe it'd be worthwhile to have a local single-letter macro for it?
src/node.rs, line 1157 at r2 (raw file):
.store(false, Ordering::Relaxed);
return x;
}
I believe this is missing this code segment from the Java code.
src/node.rs, line 1176 at r2 (raw file):
.red
.store(true, Ordering::Relaxed);
root = Self::rotate_left(root, x_parent, guard);

Minor: this pattern of rotating, updating `x_parent`, and then setting some other var to some property of `x_parent` if `!x_parent.is_null()` comes up a bunch. May be worth providing a simple macro for it.
src/node.rs, line 1189 at r2 (raw file):
}
if x_parent_right.is_null() {
    x = x_parent;

`continue` would probably be clearer here.
src/node.rs, line 1210 at r2 (raw file):
.red
.store(true, Ordering::Relaxed);
x = x_parent;

`continue`
src/node.rs, line 1292 at r2 (raw file):
}
if x_parent_left.is_null() {
    x = x_parent;

`continue`
src/node.rs, line 1301 at r2 (raw file):
.load(Ordering::SeqCst, guard);
if (s_right.is_null()

Nit: the Java code technically compares `s_left` first and then `s_right` here, though the outcome should be the same.
src/node.rs, line 1313 at r2 (raw file):
.red
.store(true, Ordering::Relaxed);
x = x_parent;

`continue`
src/node.rs, line 1373 at r2 (raw file):
}

/// Checks invariants recursively for the tree of Nodes rootet at t.
fn check_invariants<'g>(t: Shared<'g, BinEntry<K, V>>, guard: &'g Guard) -> bool {

Might be better to just stick the `assert`s directly in here rather than return a `bool`. That way we can actually include information about what specific assertion failed.
src/raw/mod.rs, line 164 at r2 (raw file):
}
BinEntry::TreeNode(_) => break table.find(bin, hash, key, guard),
BinEntry::Tree(_) => break table.find(bin, hash, key, guard),

Let's instead just make the first case `BinEntry::Node(_) | BinEntry::TreeNode(_) | BinEntry::Tree(_) =>`.
src/raw/mod.rs, line 168 at r2 (raw file):
}
}
BinEntry::TreeNode(_) => {

Hmm, I don't think this case will ever be hit directly, as `TreeBin::find` doesn't recurse to this method (at least I don't think).
src/raw/mod.rs, line 227 at r2 (raw file):
unreachable!();
};
drop(bin);

Nit: maybe add a comment here saying that the `Drop` impl for `Tree` takes care of freeing the tree nodes and values.
#73 should fix miri not running
I'm working my way through this; I think I'll stick to Reviewable to be able to see where your comments are in the code. It will be a few batches: while most of the changes are not big, each still takes a bit of time and I can't do them all in one go. Also, maybe some aren't as trivial as they seem, but see for yourself ^^
Reviewable status: all files reviewed, 79 unresolved discussions (waiting on @domenicquirl and @jonhoo)
src/lib.rs, line 64 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
`HashSet` is not quite the same as what the Java code refers to as "set projection". What they are referring to is a set view of the map (which essentially gives you just the keys), as demonstrated in flurry/jsr166/src/ConcurrentHashMap.java, line 1207 in 13db2fb: public KeySetView<K,V> keySet() {
The Java code has the `KeySetView` wrapper, which is backed by a hash map. This is used for two things: You can statically create such a view, which will then be backed by a `ConcurrentHashMap<K, Boolean>` that always contains `true` for keys that are present. Or you can turn a concrete instance of a hash map into such a view if you provide an object of the correct type to use as the "dummy" value for newly added keys.

If you use one of the static methods, what you get is essentially equivalent to a hash set. While you can get the backing map via `CollectionView`, what do you do with a key -> `true` mapping? If you just use it as-is, it is functionally the same as `HashMap<K, ()>`, only Java doesn't have `()`.

There is of course a difference with the non-static one, where you actually obtain a view for an existing map. We don't have a direct equivalent for this; the closest thing in my mind was iterating over keys. Though I concede that `keys()` is an iterator and the result returned by `keySet()` is such a `KeySetView`.

Do you have a concrete suggestion for what you would change here? Are you only against the nomenclature of a set projection, or is your comment arguing in favor of removing this paragraph again/moving it somewhere else? I think we should definitely mention the possibility of using `HashSet` somewhere in the global doc.
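The `HashMap<K, ()>` point is easy to demonstrate with std's (non-concurrent) map; a map to the unit type supports exactly the set membership operations, which is roughly what a `Map<K, Boolean>`-backed `KeySetView` amounts to in Java:

```rust
use std::collections::HashMap;

// A map whose value type is `()` carries no per-key data: insert, lookup
// and removal behave like set membership operations.
fn set_like_membership() -> (bool, bool) {
    let mut set: HashMap<&str, ()> = HashMap::new();
    set.insert("a", ());
    set.insert("b", ());
    (set.contains_key("a"), set.contains_key("z"))
}
```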
src/map.rs, line 995 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Nit: clippy will probably complain about this not using `else if`.
Interestingly, it only complained about the `high_bin` case, which is why they look different now. Maybe the comment is holding it off? Will change though!
src/map.rs, line 1003 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I wish there was a way for us to not even allocate nodes if we determine we can re-use the whole tree. Not sure if that'd work out in practice though. Especially because it would also save us from cloning all the keys and values.
This would apply if the nodes in the tree either all hash to the low or all to the high bin, right? 'Cause if the tree is re-used because it's empty we have no nodes to create.
I was also annoyed that I had to create nodes and stick them behind shared pointers only to then clean them up again. But even if we could avoid allocations by performing some check first, I'm not sure that would be worth it. I think in most non-pathological cases there will be both elements that hash to the low bin and elements that hash to the high bin in the new table. It would be faster for the worst case of identical hashes, though...
src/map.rs, line 1034 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I think it should be fine to just use `defer_destroy` here like in the linear case. Since the values are already behind `Atomic`, they won't be automatically dropped just because the containing node is dropped.
Dropping a `TreeBin` will by default drop all of its nodes and their values. This is because if we drop the map, then we also drop the values. However, here we want to re-use the values, so we cannot rely on this `Drop` impl.

Now that I'm looking at this again, I think it's the other way round. Other places in the map that remove tree bins should use logic like this one, where values are not dropped, such as in `replace_node` if a node gets removed. I wonder why this did not produce an error in the tests; maybe this case is not hit. Having this logic everywhere would of course raise the question whether the default for dropping a `TreeBin` should be not to drop the node values, and have a separate, "manual" implementation in `Table::drop_bins` for the case where we need to drop everything. However, `clear` for example also needs the value drop. What do you think?
src/map.rs, line 1227 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I think you'll want to change this whole impl block to be `K: Ord` rather than `K: Eq`. Note also that we now require that the borrowed form of `K` implements `Ord` the same way that `Q` does.
Fair enough. Regarding the matching impls of `Ord`, would you want to mention this somewhere? If so, what would you consider to be a good place to do so?
src/map.rs, line 1373 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
See the note above that should make splitting this impl block unnecessary.
Done.
src/map.rs, line 1414 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Not sure why this comment went away?
Done.
src/map.rs, line 1493 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I'm not sure I see why we have to walk the nodes of the tree while still holding the lock? Once we store `NULL`, we must be the owners.
I think you're right. Other threads that wait on the lock would re-check and see a different bin when the lock is released after that point. The tree bin case mirrors the regular case, which I would guess just has the lock matching the Java code's `synchronized`. For some reason, they store the `null` bin at the end of the work section instead of at the beginning. Want me to change that for both types of nodes?
src/map.rs, line 1494 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
This comment is now a little misleading, since we're not freeing anything in this loop, only counting!
Done.
Yup, no rush! And thanks for looking into this :D
Reviewable status: all files reviewed, 79 unresolved discussions (waiting on @domenicquirl and @jonhoo)
src/lib.rs, line 64 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
The Java code has the `KeySetView` wrapper, which is backed by a hash map. This is used for two things: You can statically create such a view, which will then be backed by a `ConcurrentHashMap<K, Boolean>` that always contains `true` for keys that are present. Or you can turn a concrete instance of a hash map into such a view if you provide an object of the correct type to use as the "dummy" value for newly added keys.

If you use one of the static methods, what you get is essentially equivalent to a hash set. While you can get the backing map via `CollectionView`, what do you do with a key -> `true` mapping? If you just use it as-is, it is functionally the same as `HashMap<K, ()>`, only Java doesn't have `()`.

There is of course a difference with the non-static one, where you actually obtain a view for an existing map. We don't have a direct equivalent for this; the closest thing in my mind was iterating over keys. Though I concede that `keys()` is an iterator and the result returned by `keySet()` is such a `KeySetView`.

Do you have a concrete suggestion for what you would change here? Are you only against the nomenclature of a set projection, or is your comment arguing in favor of removing this paragraph again/moving it somewhere else? I think we should definitely mention the possibility of using `HashSet` somewhere in the global doc.
I was thinking mostly in terms of nomenclature. I agree that mentioning that we also have a `HashSet` is a good idea, but I think "projection" implies that you can take a map and "get" a set, which isn't something we currently support, neither by conversion nor as a view.
src/map.rs, line 995 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Interestingly, it only complained about the `high_bin` case, which is why they look different now. Maybe the comment is holding it off? Will change though!
Yeah, it's probably the comment!
src/map.rs, line 1003 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
This would apply if the nodes in the tree either all hash to the low or all to the high bin, right? 'Cause if the tree is re-used because it's empty we have no nodes to create.
I was also annoyed that I had to create nodes and stick them behind shared pointers only to then clean them up again. But even if we could avoid allocations by performing some check first, I'm not sure if that would be worth it. I think in most non-pathological cases there will be both elements that hash to the low and such that hash to the high bin in the new table. It would be faster for the worst case of identical hashes though...
I was thinking specifically of the case where they all go to the high or low bin, yes. I'm having a hard time guessing whether that will come up in practice or not. By pure chance, you'd think that when you have 6+ elements in a bin, at least one will go high and at least one will go low. But in cases where hashes are non-uniform (as you observe), that might not be the case, and this is one of the primary things the tree optimization should help against! The other case I think is during resizes, where you may continue inserting into a bin for a while after a resize has already been started, but that is probably a rarer case.
Ultimately, I'll leave this one up to you — I think we could go either way. Maybe it's worth erring on the side of less complexity.
src/map.rs, line 1034 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Dropping a `TreeBin` will by default drop all of its nodes and their values. This is because if we drop the map, then we also drop the values. However, here we want to re-use the values, so we cannot rely on this `Drop` impl.

Now that I'm looking at this again, I think it's the other way round. Other places in the map that remove tree bins should use logic like this one, where values are not dropped, such as in `replace_node` if a node gets removed. I wonder why this did not produce an error in the tests; maybe this case is not hit. Having this logic everywhere would of course raise the question whether the default for dropping a `TreeBin` should be not to drop the node values, and have a separate, "manual" implementation in `Table::drop_bins` for the case where we need to drop everything. However, `clear` for example also needs the value drop. What do you think?
Ah, yes, I hadn't seen the `impl Drop` yet when I wrote that. I think we could go either way with this, so I trust your judgement as to which approach is more common (and thus better to be the "default"). If both types of dropping appear in multiple places, let's factor them both into functions, and then just call the appropriate function as needed.
As for the existing tests not catching this, let's try to fix that by introducing a test where it should trigger?
src/map.rs, line 1227 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Fair enough. Regarding the matching impls of `Ord`, would you want to mention this somewhere? If so, what would you consider to be a good place to do so?
The existing requirement about matching `Eq` implementations already appears in the docs for `get`, and other methods that take a `K: Borrow<Q>` (like `get_key_value`), so I think this is just a matter of updating the text there.
src/map.rs, line 1493 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
I think you're right. Other threads that wait on the lock would re-check and see a different bin when the lock is released after that point. The tree bin case mirrors the regular case, which I would guess just has the lock matching the Java code's `synchronized`. For some reason, they store the `null` bin at the end of the work section instead of at the beginning. Want me to change that for both types of nodes?
Yeah, I don't understand why they do that. My instinct is to trust that they knew what they were doing, though this seems like it really should be fine for exactly the reason you outline. My inclination is to change it for both types of nodes, and then add a comment with `NOTE:` that we differ from the Java impl here.
Reviewable status: all files reviewed, 78 unresolved discussions (waiting on @domenicquirl and @jonhoo)
src/lib.rs, line 64 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I was thinking mostly in terms of nomenclature. I agree that mentioning that we also have a `HashSet` is a good idea, but I think "projection" implies that you can take a map and "get" a set, which isn't something we currently support, neither by conversion nor as a view.
Done.
src/map.rs, line 1003 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I was thinking specifically of the case where they all go to the high or low bin, yes. I'm having a hard time guessing whether that will come up in practice or not. By pure chance, you'd think that when you have 6+ elements in a bin, at least one will go high and at least one will go low. But in cases where hashes are non-uniform (as you observe), that might not be the case, and this is one of the primary things the tree optimization should help against! The other case I think is during resizes, where you may continue inserting into a bin for a while after a resize has already been started, but that is probably a rarer case.
Ultimately, I'll leave this one up to you — I think we could go either way. Maybe it's worth erring on the side of less complexity.
The main thing I would consider is exactly that tree bins are partly a protective measure in the case of highly skewed hash distributions. Ultimately though, we do not guarantee any protection, and this doesn't feel like it warrants hitting the regular case with appropriate hash functions and the increased complexity you already mentioned. Then again, I'm by no means an expert on this, but I'll leave it as-is for now.
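For readers following the low/high discussion: the split is decided by a single bit of the hash. A minimal sketch of that arithmetic (assuming the old capacity `n` is a power of two, as in both the Java and Rust tables; the helper name `split_index` is made up for illustration):

```rust
// During a resize from capacity n to 2*n, an entry in bin `hash & (n - 1)`
// stays at the same index (the "low" bin) if the bit `hash & n` is zero,
// and moves to `index + n` (the "high" bin) if it is one. If every node in
// a tree bin lands on the same side, the bin could in principle be re-used.
fn split_index(hash: u64, n: u64, index: u64) -> u64 {
    debug_assert!(n.is_power_of_two());
    debug_assert_eq!(hash & (n - 1), index);
    if hash & n == 0 {
        index // low bin: same index in the new table
    } else {
        index + n // high bin
    }
}

fn main() {
    let n = 16; // old capacity
    // hashes 5 and 21 share bin 5 in the old table...
    assert_eq!(5u64 & (n - 1), 21u64 & (n - 1));
    // ...but split into the low and high bins after the resize
    assert_eq!(split_index(5, n, 5), 5);
    assert_eq!(split_index(21, n, 5), 21);
    println!("ok");
}
```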
src/map.rs, line 1034 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Ah, yes, I hadn't seen the `impl Drop` yet when I wrote that. I think we could go either way with this, so I trust your judgement as to which approach is more common (and thus better to be the "default"). If both types of dropping appear in multiple places, let's factor them both into functions, and then just call the appropriate function as needed.

As for the existing tests not catching this, let's try to fix that by introducing a test where it should trigger?
Adding this test took me... ages. I involuntarily triggered a different bug where if nodes are removed from a tree which does not get untreeified, their values are also deleted, but `replace_node` deletes old values across cases, so the value got dropped twice. Missed that when moving code around between inside and outside the `match` cases.
As for the different `drop` cases, I don't know which case will be more common across all map methods (we are still missing a lot compared to Java); for now it's fairly even. But having `Drop` clean up everything associated with the bin makes more sense to me intuitively than the other way around, so I'd do it that way.
src/map.rs, line 1227 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
The existing requirement about matching `Eq` implementations already appears in the docs for `get`, and other methods that take a `K: Borrow<Q>` (like `get_key_value`), so I think this is just a matter of updating the text there.
Done.
src/map.rs, line 1493 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Yeah, I don't understand why they do that. My instinct is to trust that they knew what they were doing, though this seems like it really should be fine for exactly the reason you outline. My inclination is to change it for both types of nodes, and then add a comment with `NOTE:` that we differ from the Java impl here.
Done.
Wow, I left a lot of comments. Sorry about that 😅
Reviewable status: all files reviewed, 77 unresolved discussions (waiting on @domenicquirl and @jonhoo)
src/lib.rs, line 64 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Done.
Maybe this is something weird about reviewable, but I can't see any of your changes? Have you pushed them?
src/map.rs, line 1034 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Adding this test took me... ages. I involuntarily triggered a different bug where if nodes are removed from a tree which does not get untreeified, their values are also deleted, but `replace_node` deletes old values across cases, so the value got dropped twice. Missed that when moving code around between inside and outside the `match` cases.

As for the different `drop` cases, I don't know which case will be more common across all map methods (we are still missing a lot compared to Java); for now it's fairly even. But having `Drop` clean up everything associated with the bin makes more sense to me intuitively than the other way around, so I'd do it that way.
Haha, it sounds like it was worth writing the test then!
Given that both appear in multiple places, I think we should then factor out this code (and the "real drop" code) into their own functions, and then call them from the appropriate places. This would also make it relatively simple to change the "default" behavior in `Drop` later.
Yeah, that's why I said I'll do a few batches ^^ I didn't push any changes yet, I still need to rebase in the fix for miri. Will push after the next set of updates so you can see what has changed. But that test thing took so long that I didn't want to get drawn into fixing more stuff in case ASAN complains or something 😅 The code for the different kinds of drop is already in methods, this should be good now 👍
Let's hope that reviewable deals better with rebases than GitHub does. That's the primary reason why I'm trying it out 😅
Got through all of `map.rs` now. Will push an update after posting this and doing the rebase.
Reviewable status: all files reviewed, 70 unresolved discussions (waiting on @domenicquirl and @jonhoo)
src/map.rs, line 1616 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I wonder if we can be a little smarter here. Move the `key`, and then move the key back out of `changed.new` if the `cas_bin` fails.
Done.
src/map.rs, line 1780 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Can we perhaps invert this condition and `break` to avoid indenting the whole body?
Done.
src/map.rs, line 1841 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
In the old code, the `flush` here is placed after the call to `add_count`. Not sure if that makes a meaningful difference?
I placed it at this point to make sure it gets hit even if we updated an existing value, in which case the old value is returned immediately below. To be honest I don't know whether this matters, but if you prefer we could flush here only if we return, and then flush after `add_count` if we don't?
src/map.rs, line 1847 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Nit: should we assert that `old_val.is_none()`?
This would require moving `old_val` out a scope. Does that matter?
src/map.rs, line 2084 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Any particular reason why this isn't a method on `TreeBin`?
There are a few cases where methods that are instance methods in the Java code ended up as static methods, mostly due to handling shared references and lifetimes. In this case, `findTreeNode` is originally an instance method of `TreeNode`. But since the method has an equivalent on the Java side, I think it's good to keep it where you'd expect (or support the correct expectation of where to find it in the Java code).
src/map.rs, line 2135 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Nit: We could probably avoid the load here if we just used `Owned::into_shared` above. Alternatively, this can at least be downgraded to a `Relaxed` load.
Good idea! Again a case where this behaviour is shared with the regular case, so I'll update both.
src/map.rs, line 2139 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Not sure I follow this safety argument.
The problem with `remove_tree_node` is the following: The method first unlinks the given node's linear traversal pointers (`next` and `previous`). Then, it checks if the bin is now too small to be a tree bin. If so, no other operations are performed and the method returns early, indicating that `untreeify` is necessary. Only if the bin remains a tree is the given node unlinked from the tree.

Now, the method only takes the additional tree bin lock if it actually needs to modify the tree. This means that, until the bin is untreeified and swapped out for a linear bin, other threads can freely read (not write, because we hold the bin lock) using the fast path along the tree ordering. In particular, they can still obtain references to the "removed" node. So `remove_tree_node` only marks `p` for garbage collection if it is also removed from the tree, since then no new references can be obtained. Otherwise, the caller needs to clean up `p` after the bin was replaced with a linear one.

So using the method is not technically "unsafe" if you don't follow this, it "only" leaks memory. I wanted to ensure that people are aware when they use the method in the future, and the best thing I could come up with was `unsafe`. If you have a better alternative, I'm very open to changing this.
src/map.rs, line 2161 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
BTW I wonder if we should start doing all `defer_destroy`s after dropping the head lock (this doesn't just apply to tree bins). May not make much of a difference, I'm not sure, but something worth thinking about. In particular, it would shorten the critical sections for inserts, which should in turn allow for more concurrency.
I agree with the sentiment, but what do we do if garbage collection is different depending on branching inside the critical section (such as here)? You'd have to reconstruct what happened when you've dropped the lock and then do the appropriate gc, which will probably need references that have gone out of scope and so on. I'll keep an eye out for places where this seems doable, but I wouldn't start refactoring everything.
src/map.rs, line 2251 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
"conditional upon match" is really hard to read compared to "if it is equal to". We should fix the variable names, but retain the sentence structure.
The previous sentence was ambiguous, the extra value is checked against the previous value for that key, not the given value (which would of course be a useless feature, but still). I wrote this out in the updated code.
src/map.rs, line 2402 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Can we perhaps invert this condition and `break` to avoid indenting the whole body?
Done.
src/map.rs, line 2404 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Same here.
Done.
src/map.rs, line 2415 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
This comment is a little misleading since we only remove if `new_node.is_none()`.
I didn't even read this before, lol. Changed both occurrences to "only replace".
src/map.rs, line 2571 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Can we invert this and return early to avoid indenting everything?
Done.
src/map.rs, line 2578 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Same here — let's invert and early return (that also matches the pattern we've used in `replace_node` and `put` of using `continue` if `current_head != bin`).
Done. I feel like this would be a good thing for clippy to report, at least if there are no other branches...
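For reference, the invert-and-`continue` pattern being discussed looks roughly like this on a made-up loop (not the actual map code):

```rust
// Instead of wrapping the whole loop body in `if current_head == bin { ... }`,
// invert the condition and `continue`, keeping the body at the outer
// indentation level.
fn count_matching(heads: &[u32], bin: u32) -> u32 {
    let mut hits = 0;
    for &current_head in heads {
        if current_head != bin {
            // skip non-matching heads without indenting the rest of the body
            continue;
        }
        hits += 1;
    }
    hits
}

fn main() {
    assert_eq!(count_matching(&[1, 2, 2, 3], 2), 2);
    println!("ok");
}
```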
src/map.rs, line 2624 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
We can do this after dropping the head lock.
Done.
src/map.rs, line 2653 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I'd like an `else` here with `unreachable!` to ensure we never try to `treeify` a tree, for example.
Done.
src/map.rs, line 2661 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Should this argument be called `bin` instead of `node`?
Done.
src/map.rs, line 2665 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Nit: any particular reason not to use `tail` like the Java code? Also, why is it necessary to give the type explicitly?
Done. Regarding the type, I really don't know. I would have thought that the type can be inferred at least from storing `new_node` (which is all that is done with `head`, but conditionally), but the compiler complains if they are not there :/
src/map.rs, line 2677 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
BTW This constructor is used a lot throughout the code. May be time we introduce a `Node::new`.
A quick search gives me 5 instances in `map.rs` and `node.rs`. 3 pass an existing `Atomic<V>` and 2 an `Atomic::from`. 1 has a `next` pointer, 3 don't, and `TreeNode::new` passes `next` as an argument, which is currently also always `null`, but can't be removed. So I added

`Node::new(hash: u64, key: K, value: Atomic<V>) -> Self`

and

`Node::with_next(hash: u64, key: K, value: Atomic<V>, next: Atomic<BinEntry<K, V>>) -> Self`

with `null` and `Mutex::new(())` as defaults.
src/map.rs, line 2677 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Doesn't `TreeNode` already contain a `Node`? Can't we just clone that directly?
Right now `Node` is not `Clone`. I assume this is primarily due to the contained lock (`parking_lot::Mutex`). We could `impl Clone` to return a node with a new lock, though the question is whether the new node would share the `next` pointer. In this instance, the node receives its own `next` node below.
src/node.rs, line 6 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
We may want to use `parking_lot`'s thread parking infrastructure here rather than `std::thread`.
Can you clarify what you mean by "italic may"? In #13 (comment), you argued against changes only for `no_std`. Is there another reason you think `parking_lot` would be better?
adds:
* TreeNode and TreeBin structs
* TreeBin::new()
* red-black tree methods

add insertion, removal and find for TreeBins
adds the additional write lock for TreeBins that prevents concurrent tree restructuring
add conversions to and from TreeBins
have tables drop TreeBins when dropping bins
adds TreeBins to transfer, clear, put, iterators, compute_if_present, replace_node
extends crate-level documentation with TreeBins
6f3f51c to 01fc0f6 Compare
Reviewed 6 of 6 files at r3.
Reviewable status: all files reviewed, 58 unresolved discussions (waiting on @domenicquirl)
src/map.rs, line 1841 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
I placed it at this point to make sure it gets hit even if we updated an existing value, in which case the old value is returned immediately below. To be honest I don't know whether this matters, but if you prefer we could flush here only if we return, and then flush after `add_count` if we don't?
I forget exactly why we do a `flush` in the first place. I think the thinking was that if you end up doing a resize, you want to make sure any garbage we produced as a result of the resize gets collected promptly. But if so, the `flush` should arguably just be in `transfer` somewhere, and not here anyway. Maybe just remove it?
src/map.rs, line 1847 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
This would require moving `old_val` out a scope. Does that matter?
I think that's worth it for this assert.
src/map.rs, line 2139 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
The problem with `remove_tree_node` is the following: The method first unlinks the given node's linear traversal pointers (`next` and `previous`). Then, it checks if the bin is now too small to be a tree bin. If so, no other operations are performed and the method returns early, indicating that `untreeify` is necessary. Only if the bin remains a tree is the given node unlinked from the tree.

Now, the method only takes the additional tree bin lock if it actually needs to modify the tree. This means that, until the bin is untreeified and swapped out for a linear bin, other threads can freely read (not write, because we hold the bin lock) using the fast path along the tree ordering. In particular, they can still obtain references to the "removed" node. So `remove_tree_node` only marks `p` for garbage collection if it is also removed from the tree, since then no new references can be obtained. Otherwise, the caller needs to clean up `p` after the bin was replaced with a linear one.

So using the method is not technically "unsafe" if you don't follow this, it "only" leaks memory. I wanted to ensure that people are aware when they use the method in the future, and the best thing I could come up with was `unsafe`. If you have a better alternative, I'm very open to changing this.
Ah, sorry, my comment wasn't very clear. I understand why the code needs to do what it does to avoid leaks, I more mean that I don't follow the safety comment. As in, "`p` is marked for garbage collection below if the bin gets untreeified" does not tell me why the unsafe call below is safe.
I think it's perfectly reasonable for that method to be marked unsafe :)
src/map.rs, line 2161 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
I agree with the sentiment, but what do we do if garbage collection is different depending on branching inside the critical section (such as here)? You'd have to reconstruct what happened when you've dropped the lock and then do the appropriate gc, which will probably need references that have gone out of scope and so on. I'll keep an eye out for places where this seems doable, but I wouldn't start refactoring everything.
I think the idea would be to essentially keep a vec or something of garbage declared at the top of the function, push to that when we have garbage, and then call `defer_destroy` on all of them at the end. Or something like that.
src/map.rs, line 2578 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Done. I feel like this would be a good thing for clippy to report, at least if there is no other branches...
Hehe, I agree with you! Maybe that's another PR idea :p It's probably a little tricky to detect whether it should be a `continue`, `break`, or `return` though. But perhaps it should just point out that you could transform without having to say how you should transform.
src/map.rs, line 2665 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Done. Regarding the type, I really don't know. I would have thought that the type can be inferred at least from storing `new_node` (which is all that is done with `head`, but conditionally), but the compiler complains if they are not there :/
🤷♂️ The compiler works in mysterious ways.
src/map.rs, line 2677 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
A quick search gives me 5 instances in `map.rs` and `node.rs`. 3 pass an existing `Atomic<V>` and 2 an `Atomic::from`. 1 has a `next` pointer, 3 don't, and `TreeNode::new` passes `next` as an argument, which is currently also always `null`, but can't be removed. So I added

`Node::new(hash: u64, key: K, value: Atomic<V>) -> Self`

and

`Node::with_next(hash: u64, key: K, value: Atomic<V>, next: Atomic<BinEntry<K, V>>) -> Self`

with `null` and `Mutex::new(())` as defaults.
Seems good! You could make it generic over `AV: Into<Atomic<V>>`, which would let the caller use either `Atomic<V>` or `V` (I think). But up to you -- it's not important.
src/map.rs, line 2677 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Right now `Node` is not `Clone`. I assume this is primarily due to the contained lock (`parking_lot::Mutex`). We could `impl Clone` to return a node with a new lock, though the question is whether the new node would share the `next` pointer. In this instance, the node receives its own `next` node below.
Good point. I'm happy to keep it as-is.
src/map.rs, line 1434 at r3 (raw file):
// NOTE: The Java code stores the null bin _after_ the loop, and thus also has
// to hold the lock until that point. However, after the store happens new
// threads and threads waiting on the lock will read the new bin.
I think this comment is missing the implication: "so we can drop the lock early and do the dropping without the lock."
src/map.rs, line 1484 at r3 (raw file):
// NOTE: The Java code stores the null bin _after_ the loop, and thus also has
// to hold the lock until that point. However, after the store happens new
// threads and threads waiting on the lock will read the new bin.
Same note here as above.
src/map.rs, line 1771 at r3 (raw file):
let p = tree_bin.find_or_put_tree_val(hash, key, value, guard);
if p.is_null() {
    break;
Nit: might be good to include a brief comment here along the lines of "The key did not exist so the put immediately succeeded, and we're done".
src/map.rs, line 2253 at r3 (raw file):
/// Replaces node value with `new_value`.
/// If an `observed_value` is provided, the replacement only happens if `observed_value` equals
/// the value that was previously associated with the given key.
I think this should say "is currently associated". Since the replacement has not happened yet.
Also, can we place an empty line above this (as in, after the first line), just to make the docs come out a little nicer.
src/map.rs, line 2408 at r3 (raw file):
let root = tree_bin.root.load(Ordering::SeqCst, guard);
if root.is_null() {
    break;
A brief comment for why we break here would be good.
In fact, is it even possible for us to hit this? Shouldn't the node have been untreeified before this can happen?
src/map.rs, line 2412 at r3 (raw file):
let p = TreeNode::find_tree_node(root, hash, key, guard);
if p.is_null() {
    break;
Nit: A brief comment for why we break here would be good.
src/map.rs, line 2669 at r3 (raw file):
    }
} else {
    unreachable!("treeifying something that is not a linear bin");
Nit: from experience, it's super useful to print here what it was, so that we have a bit more to go on if someone later reports seeing this panic.
src/map.rs, line 3233 at r3 (raw file):
    map.insert(i, i, guard);
}
// Ensure the bin was correctly treeified
It's probably sufficient to have one test that tests treeification using `insert`. Or at least we should factor it out into its own method if many tests will have this code.
src/node.rs, line 6 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Can you clarify what you mean by "italic may"? In #13 (comment), you argued against changes only for `no_std`. Is there another reason you think `parking_lot` would be better?
The parking lot thread parking/wakeup mechanism is in general more efficient than the one in `std`. I'm not talking about `parking_lot_core` here (which would be `no_std`), but the optimized implementation in `parking_lot` (which is not `no_std`).
src/node.rs, line 871 at r3 (raw file):
    guard: &'g Guard,
) {
    guard.defer_unchecked(move || {
I don't have a good sense for the efficiency of `defer_unchecked` and `defer_destroy`. I think the latter might be more efficient though, since it doesn't have to pass along a full closure. Could we perhaps hoist the `if let` out, pass a `Guard` to `drop_fields` and then have it call `defer_destroy` internally?
01fc0f6 to be9aa03 Compare
Reviewable status: 8 of 10 files reviewed, 54 unresolved discussions (waiting on @domenicquirl and @jonhoo)
src/map.rs, line 1841 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I forget exactly why we do a `flush` in the first place. I think the thinking was that if you end up doing a resize, you want to make sure any garbage we produced as a result of the resize gets collected promptly. But if so, the `flush` should arguably just be in `transfer` somewhere, and not here anyway. Maybe just remove it?
It is not in `transfer` at the moment, and the reasoning makes sense to me. I'll move it behind `add_count` for now so we get that behaviour, and then maybe a separate change can consolidate `flush` into `transfer`. This PR is already doing a lot "on the side" I feel ^^
src/map.rs, line 2139 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Ah, sorry, my comment wasn't very clear. I understand why the code needs to do what it does to avoid leaks, I more mean that I don't follow the safety comment. As in, "`p` is marked for garbage collection below if the bin gets untreeified" does not tell me why the unsafe call below is safe.

I think it's perfectly reasonable for that method to be marked unsafe :)
I've expanded a bit, it just answers the safety condition of the method - the provided node and, if desired, its value, have to be manually garbage collected when they are no longer reachable in case of the early return.
src/map.rs, line 2161 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I think the idea would be to essentially keep a vec or something of garbage declared at the top of the function, push to that when we have garbage, and then call `defer_destroy` on all of them at the end. Or something like that.
Okay, this also seems to me like it could/should be its own change if it moves around a good amount of code. Also, `defer_destroy` essentially defers a closure that turns the given `Shared` into an `Owned` and then drops it. A reference to the closure is stored in a slice internally, and the closure is only allocated on the heap if it exceeds a certain size (which I would guess doesn't happen that often for two calls?). Do you think having our own "defer" would even be that much faster if we still do that in the critical section?
src/map.rs, line 2677 at r2 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Seems good! You could make it generic over `AV: Into<Atomic<V>>`, which would let the caller use either `Atomic<V>` or `V` (I think). But up to you -- it's not important.
OK, is good idea, did do!
src/map.rs, line 1434 at r3 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I think this comment is missing the implication: "so we can drop the lock early and do the dropping without the lock."
Done.
src/map.rs, line 1484 at r3 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Same note here as above.
Done.
src/map.rs, line 2253 at r3 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I think this should say "is currently associated". Since the replacement has not happened yet.
Also, can we place an empty line above this (as in, after the first line), just to make the docs come out a little nicer.
Done.
src/map.rs, line 2408 at r3 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
A brief comment for why we break here would be good.
In fact, is it even possible for us to hit this? Shouldn't the node have been untreeified before this can happen?
It should, again because of the bin lock and the following re-check. But the check is in multiple places in the Java code, and `remove_tree_node` also checks a lot of cases that have too few nodes to actually occur. I would guess that the intention is to have the code be robust against changing the `UNTREEIFY_THRESHOLD`. There still probably shouldn't be empty bins ('cause when they get empty they get untreeified at the latest), but I don't exactly feel confident to remove all checks.
src/map.rs, line 3233 at r3 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
It's probably sufficient to have one test that tests treefication using `insert`. Or at least we should factor it out into its own method if many tests will have this code.
It's not mainly about triggering treeification, that's just necessary to test tree bin removal. I'd like to keep both tests, since the behaviour between `replace_node` and `compute_if_present` differs with regards to garbage collection (see the lost `usize` I had to chase around to make ASAN happy again 😅). But I refactored the test to take a closure.
be9aa03 to beab51a (Compare)
The comment about the efficiency of |
Yeah, I think one day we'll have to make a pass where we try to optimize the garbage collection, given that that's where our current bottleneck appears to be.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 50 unresolved discussions (waiting on @domenicquirl)
src/map.rs, line 1841 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
It is not in `transfer` at the moment, and the reasoning makes sense to me. I'll move it behind `add_count` for now so we get that behaviour, and then maybe a separate change can consolidate `flush` into `transfer`. This PR is already doing a lot "on the side" I feel ^^
It is, that is true. Once we finish it all up, it might be worth at least splitting out the most obvious stand-alone changes into different commits, but that's a bunch of extra work on your plate that I understand if you want to avoid :) The biggest reason for doing it would be that it makes git blame/bisect work much better should any of this produce an issue in the future. I'll leave that as your call. We can probably still land it all as one PR — I'll just do a merge instead of a squash.
src/map.rs, line 2161 at r2 (raw file):
Previously, domenicquirl (DQ) wrote…
Okay, this also seems to me like it could/should be its own change if it moves around a good amount of code. Also, `defer_destroy` essentially defers a closure that turns the given `Shared` into an `Owned` and then drops it. A reference to the closure is stored in a slice internally, and the closure is only allocated on the heap if it exceeds a certain size (which I would guess doesn't happen that often for two calls?). Do you think having our own "defer" would even be that much faster if we still do that in the critical section?
Ah, if that's the case, then I agree having the closure this way is fine. I feel like it should be possible to optimize `defer_destroy` more (e.g., by passing around `dyn Drop`), but let's leave that for later.
I also think it's fine to leave the "reclaim after critical section" for a different change. Add it to the pile of "eventually TODO" :p Maybe open an issue?
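A toy illustration of "reclaim after the critical section": queue boxed destructors while holding the lock and run them only after releasing it. This uses a std `Mutex` in place of the bin lock; real crossbeam-epoch deferral additionally waits until no thread can still hold a reference and stores small closures inline, none of which is modeled here.

```rust
use std::sync::Mutex;

// Drain a "bin" under the lock, but defer each element's destruction by
// queueing a boxed closure instead of dropping in the critical section.
fn drain_deferring(bin: &Mutex<Vec<String>>) -> Vec<Box<dyn FnOnce()>> {
    let mut deferred: Vec<Box<dyn FnOnce()>> = Vec::new();
    let mut guard = bin.lock().unwrap();
    while let Some(v) = guard.pop() {
        deferred.push(Box::new(move || drop(v)));
    }
    deferred
} // the guard drops here: lock released, nothing destroyed yet

fn main() {
    let bin = Mutex::new(vec![String::from("a"), String::from("b")]);
    let deferred = drain_deferring(&bin);
    assert!(bin.lock().unwrap().is_empty());

    let n = deferred.len();
    for d in deferred {
        d(); // the actual drops happen here, outside the lock
    }
    assert_eq!(n, 2);
    println!("deferred {} drops past the critical section", n);
}
```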
src/map.rs, line 2408 at r3 (raw file):
Previously, domenicquirl (DQ) wrote…
It should, again because of the bin lock and the following re-check. But the check is in multiple places in the Java code, and `remove_tree_node` also checks a lot of cases that have too few nodes to actually occur. I would guess that the intention is to have the code be robust against changing the `UNTREEIFY_THRESHOLD`. There still probably shouldn't be empty bins (because when they become empty they get untreeified at the latest), but I don't exactly feel confident to remove all checks.
Makes sense. How about we add a:
// TODO: Is this even reachable?
// The Java code has `NULL` checks for this, but in theory it should not be possible to
// encounter a tree that has no root when we have its lock. It should always have at
// least `UNTREEIFY_THRESHOLD` nodes. If it is indeed impossible we should replace
// this with an assertion instead.
Both here and to the one or two other places where we have `if root.is_null()`.
src/map.rs, line 3233 at r3 (raw file):
Previously, domenicquirl (DQ) wrote…
It's not mainly about triggering treeification, that's just necessary to test tree bin removal. I'd like to keep both tests, since the behaviour between `replace_node` and `compute_if_present` differs with regards to garbage collection (see the lost `usize` I had to chase around to make ASAN happy again 😅). But I refactored the test to take a closure.
Oh, yeah, that makes sense. It was also a little subtle to me that `compute_if_present` actually ends up doing removals. I almost wonder if the name should change to reflect the fact that it might actually remove things too. Not sure how though.
src/node.rs, line 77 at r4 (raw file):
    AV: Into<Atomic<V>>,
{
    Node {
Nit: this could just be `Node::with_next(hash, key, value, Atomic::null())`
3778457 to 468bc6f (Compare)
Reviewable status: 9 of 10 files reviewed, 1 unresolved discussion (waiting on @jonhoo)
src/node.rs, line 379 at r7 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
I think what was confusing me was that we had both the bin lock, `lock_state` (which is also a lock), and the `write_lock`. Now that we're back to just the bin lock and `lock_state` (I think?), it makes more sense to me.
Well, the additional locking mechanism is still there in parking the waiting thread, its mutex is just not exposed anymore as part of `TreeBin`. It's still there, but hidden inside `Thread` (which, in fact, again uses something like `lock_state` to handle tokens, in addition to condvar/mutex). It's just that you need to have the additional lock, and it has a synchronization/messaging part (`lock_state`) and a waiting part if you need the lock (`waiter` and `park()` / `Condvar` + `Mutex`). But with `notify` we would have to keep `notify`ing until the waiting thread actually calls `wait()`, which we probably don't want to do anyway.
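The parking scheme described here can be sketched with std's `thread::park`/`unpark` and an atomic `lock_state`. The `READER`/`WAITER` bit encoding below is illustrative only and simplified relative to flurry's actual `TreeBin`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Hypothetical encoding: readers count in units of 4, and bit 2 marks
// that a writer is parked waiting for the readers to leave.
const READER: usize = 4;
const WAITER: usize = 2;

fn main() {
    let state = Arc::new(AtomicUsize::new(READER)); // one reader active
    let s = Arc::clone(&state);

    let writer = thread::spawn(move || {
        // Announce that we are waiting, then park until no reader remains.
        s.fetch_or(WAITER, Ordering::SeqCst);
        while s.load(Ordering::SeqCst) & !WAITER != 0 {
            thread::park(); // may wake spuriously; the loop re-checks
        }
        s.fetch_and(!WAITER, Ordering::SeqCst); // clear the waiter bit
    });

    thread::sleep(Duration::from_millis(50));
    // The reader finishes: drop its count, and wake a parked writer if any.
    if state.fetch_sub(READER, Ordering::SeqCst) == (READER | WAITER) {
        writer.thread().unpark();
    }
    writer.join().unwrap();
    assert_eq!(state.load(Ordering::SeqCst), 0);
    println!("writer proceeded after the last reader left");
}
```

Unlike a condvar `notify`, an `unpark` token is not lost if it arrives before the corresponding `park`, which is why the loop-and-park shape needs no extra handshake.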
src/node.rs, line 387 at r8 (raw file):
Previously, jonhoo (Jon Gjengset) wrote…
Should this assert that `waiter` is null? If it's not null, it'd need to be safely dropped.
Sure, we can do that! This should be guaranteed by the bin lock, but I've stuck an assertion in to be sure.
Reviewed 1 of 1 files at r9.
Reviewable status: complete! all files reviewed, all discussions resolved
Excellent! I guess now all that remains is merging, and then we're good to go?
Reviewable status: complete! all files reviewed, all discussions resolved
Codecov Report
It is damn confusing to merge in an explicit drop of a node whose creation I removed ^^
Checks passing now! 🎉 Would be nice to have some more tests for the more delicate parts of TreeBins, but these are also kind of hard to trigger. For example, it would be nice to have a test which always triggers a writing thread waiting for the additional lock. How do you want to proceed? I can't really tell how much of this is on me and what was merged in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 I think really what we'd want here is to run tests through loom, which could reliably trigger the various interleavings here. Without that, I think the interim test is going to be using the `OneBucketHasher` in a test with lots of reader and writer threads just spinning on random keys in a somewhat limited range. Could you add something like that and see if that helps coverage?
Reviewed 5 of 5 files at r10, 1 of 1 files at r11.
Reviewable status: complete! all files reviewed, all discussions resolved
Looks like it slightly helps
Looking through the coverage diff, it doesn't actually look too bad. I think one thing that's not being tested is deletes, which may cause untreefication and such. Since the set of keys is small, I think this also maybe mostly ends up testing value overwrites rather than tree restructuring (like rotations), so maybe the keyspace actually needs to be a decent amount larger (and have deletes)?
But do I read the individual file view correctly in that

    const NUM_THREADS: usize = 20;
    const NUM_REPEATS: usize = 1000;
    const NUM_KEYS: usize = 1000;

is the last configuration that runs in a somewhat reasonable time. Both running 1000 repeats with 10000 different keys and 10000 repeats with 1000 different keys take a minute plus. Both on 10000 took a whopping 10 minutes to run.
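The shape of the stress test under discussion, sketched against a std `HashMap` behind an `RwLock` rather than flurry itself, with a tiny deterministic LCG in place of a `rand` dependency and the counts scaled down so it finishes quickly.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

const NUM_THREADS: usize = 4; // scaled down from the PR's 20
const NUM_REPEATS: usize = 100; // scaled down from 1000
const NUM_KEYS: usize = 50; // small range => frequent key collisions

// Tiny deterministic LCG so this sketch needs no external crates.
fn next_key(seed: &mut u64) -> usize {
    *seed = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (*seed >> 33) as usize % NUM_KEYS
}

fn main() {
    let map: Arc<RwLock<HashMap<usize, usize>>> = Arc::new(RwLock::new(HashMap::new()));
    let handles: Vec<_> = (0..NUM_THREADS)
        .map(|t| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                let mut seed = t as u64 + 1;
                for i in 0..NUM_REPEATS {
                    let k = next_key(&mut seed);
                    if i % 3 == 0 {
                        map.write().unwrap().remove(&k); // exercise deletes too
                    } else if i % 3 == 1 {
                        map.write().unwrap().insert(k, i);
                    } else {
                        let _ = map.read().unwrap().get(&k).copied();
                    }
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert!(map.read().unwrap().len() <= NUM_KEYS);
    println!("stress run finished");
}
```

Mixing removes in with the inserts is what would exercise untreeification in the real map; here the lock just makes the sketch trivially correct.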
Yeah, I feel like we really want loom here 😅
less threads, but more repetitions and bigger key range
That seems to have done quite a lot still. It's only about half the impact for … I think. Any more comments or ideas? Otherwise I think we can finally land this!
The stuff in … It also isn't hitting things like a bin being removed just as someone tries to insert into it, but I think that's okay, and will require loom. Some of the coverage is also just incorrectly marked as untested, probably due to limitations in the coverage sampler (like …). Overall, I'm pretty happy though. We will want loom for this regardless. Maybe try to get coverage for the drop, and try fewer writers (both insert and remove) but with the same number of readers, and then we call it good?
Looks like |
The … Similarly, I would assume that the drop method is not hit by …
Awesome! Thank you for the amazing work!
This PR ports an optimization from the Java code to flurry which replaces linear bins with bins containing ordered trees when a bin holds many elements (in the case of many hash collisions).
TreeBins and TreeNodes are introduced as structs with all associated methods, including the red-black tree implementation and methods to read from or insert into a tree bin. Handling for tree bins is added to all methods of `HashMap` that are sensitive to bin content/the type of bins, including iterators. Documentation is provided for the implementation, in particular the crate-level implementation notes are updated.

Documentation changes may also include minor modifications where I noticed things to be out-of-date. These are small across the board; notably, I included some information on `HashSet`s in the crate-level documentation, since they were added with #19 (which means we should also probably re-purpose #17 to address missing methods or `HashSetRef` only, or close it and move these to a new issue).

Since I did not write most of the methods of `HashMap` that I had to touch, please review these particularly vigilantly. In particular the iterator code, since this was implemented in my absence.

Closes #13.
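The treeification idea from the description can be sketched single-threaded: a bin switches from a linear list to an ordered tree once its chain grows past a threshold. Here std's `BTreeMap` stands in for the red-black tree, and everything except the threshold value (Java's `TREEIFY_THRESHOLD` of 8) is simplified.

```rust
use std::collections::BTreeMap;

const TREEIFY_THRESHOLD: usize = 8; // same value Java's ConcurrentHashMap uses

// Simplified, single-threaded sketch of a bin that converts from a
// linear chain to an ordered tree once it grows too long. `BTreeMap`
// stands in for the red-black tree used by flurry/Java.
enum Bin {
    Linear(Vec<(u64, String)>),        // (hash, key) pairs, scanned linearly
    Tree(BTreeMap<(u64, String), ()>), // ordered lookup in O(log n)
}

impl Bin {
    fn insert(&mut self, hash: u64, key: String) {
        match self {
            Bin::Linear(items) => {
                items.push((hash, key));
                if items.len() > TREEIFY_THRESHOLD {
                    // Treeify: rebuild the chain as an ordered tree, keyed
                    // by (hash, key) so ties on hash are broken by the key.
                    let tree = items.drain(..).map(|e| (e, ())).collect();
                    *self = Bin::Tree(tree);
                }
            }
            Bin::Tree(tree) => {
                tree.insert((hash, key), ());
            }
        }
    }
}

fn main() {
    let mut bin = Bin::Linear(Vec::new());
    for i in 0..10 {
        bin.insert(0, format!("colliding-key-{}", i)); // all share hash 0
    }
    assert!(matches!(bin, Bin::Tree(_)));
    println!("bin treeified after exceeding the threshold");
}
```

Ordering by (hash, key) mirrors why the real map requires keys to be `Ord`: it breaks ties between colliding hashes. The reverse conversion on shrinkage (untreeification) is omitted here.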