Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reintroduces the 'mangle' abstraction that was removed in the switch to robinhood hashing. #13197

Closed
wants to merge 1 commit into from

Conversation

cgaebel
Copy link
Contributor

@cgaebel cgaebel commented Mar 29, 2014

Fixes #13195.

@flaper87
Copy link
Contributor

Could you please amend the commit message with a better description? (and perhaps squash these commits) Thanks

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 29, 2014

Done.

@lilyball lilyball changed the title Fixed #13195 Reintroduces the 'mangle' abstraction that was removed in the switch to robinhood hashing. Mar 29, 2014
@lilyball
Copy link
Contributor

When fixing commit messages you should also fix PR title/description. I went ahead and updated them for you.

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 29, 2014

Thanks a lot!

@sfackler
Copy link
Member

IIRC find_or_insert, find_or_insert_with, and insert_or_update_with were all implemented on top of mangle originally. Is that still applicable with the new hashmap implementation?

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 29, 2014

Uhh you CAN do it that way. But unless the compiler makes some pretty crazy inlining decisions there will be unnecessary closure creation.

@emberian
Copy link
Member

I, too, would rather avoid closures if possible.

@huonw
Copy link
Member

huonw commented Mar 29, 2014

Can we measure the cost in this situation before making assumptions? :)

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 29, 2014

It seems a little odd to change already-clear code to use a confusing function (I find mangle to be a poor abstraction that hurts my head a little), into something that will likely be slower, but MAY be the same speed.

Also, without labeled arguments, using mangle looks like crap at the call site.

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 29, 2014

I also took the liberty of removing some instances of useless double hashing in this patch.

@lilyball
Copy link
Contributor

@cgaebel Because it's less code, and the implementations of find_or_insert() and friends are each one-line implementations using mangle().

pub fn find_or_insert<'a>(&'a mut self, k: K, v: V) -> &'a mut V {
    self.mangle(k, v, |_, v| v, |_, _, _| ())
}

pub fn find_or_insert_with<'a>(&'a mut self, k: K, f: |&K| -> V) -> &'a mut V {
    self.mangle(k, (), |k, _| f(k), |_, _, _| ())
}

pub fn insert_or_update_with<'a>(&'a mut self, k: K, v: V, f: |&K, &mut V|) -> &'a mut V {
    self.mangle(k, v, |_, v| v, |k, v, _| f(k, v))
}

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

I agree that they are one line implementations.

My (possibly flawed) intuition says that these three functions will get inlined into their caller, and mangle will stay out of line. If that is the case, then there will be needless closure creation.

I also find code like self.mangle(k, v, |_, v| v, |k, v, _| f(k, v)) hard to read and understand without referencing the type signature. It's code-golf worthy.

@lilyball
Copy link
Contributor

@cgaebel I never trust intuition when thinking about inlining. If LLVM thinks that inlining those functions into their caller and creating the closures is better than inlining mangle into those functions and skipping the closures, well, that's entirely up to it.

As for code golf, it's a bit messy, but luckily the callers of those functions don't have to ever look at their implementation. As for people who do read the source, it's pretty easy to verify that those functions are correct (and in fact it's difficult to implement them incorrectly and still typecheck). Additionally, once written, these functions never have to change, even if the implementation of the hashmap changes again. With these versions, only mangle() needs to be updated. Without them, all 3 of these functions also need to be updated, and checked for correctness, etc.

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

Well, no they don't need to be rewritten if the implementation changes. They don't do anything hashtable-implementation specific. I can, however, more easily reason about their performance. And I don't have to "trust LLVM" to not create closures.

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

Also, @kballard, do you have any thoughts on what I said in the bug report (replicated below)? I feel like mangle is the wrong abstraction to make:

I feel like there's room for a more usable and powerful abstraction here. Something like:

pub struct HashWithKey<K, 'a> {
  priv hash: SafeHash;
  priv key: &'a K;
}

impl<K, 'a> HashWithKey<K, 'a> {
  pub fn key(&self) -> &'a K;
  fn hash(&self) -> &'a SafeHash;
}

pub fn hash<'a>(&self, k: &'a K) -> HashWithKey<K, 'a>

And expose all my methods that take a prehashed key.

@lilyball
Copy link
Contributor

@cgaebel Just looking at that right now, I have no idea what I'm supposed to do with hash(). It returns a struct, but what do I do with that?

The point of mangle() is it can be used to implement things like find_or_insert(). I don't know how I would do that with your hash(). Naturally, find_or_insert() already exists, but sometimes callers need a way to express other operations on top of the hashmap that aren't already provided.

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

Oops I may have edited that code into meaninglessness.

The intent is that you can hash a key once and then safely use it for multiple operations on the hashtable without having to rehash in between. That's the only thing mangle gives you, and is artificially limiting.

I already have a lot of operations that take a pre-hashed key, and it'd be nice to expose this infrastructure to the user.

Thinking a bit more about this, it might be a little trickier than I originally thought. Maybe you can see a nicer way of doing it?

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

Hmm if I punt on making it safe I could provide an 'unsafe_hash' that hashes a key, and just force people to wrap their custom HashMap primitives in unsafe blocks. But that's kinda sad.

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

There's only one use of mangle in both rustc and servo, and this is to create an update_or_insert abstraction, which arguably should go directly in HashMap in a new method. Even without this, it could arguably be created without a special method, and the only cost would be a redundant hashing.

},
Some(idx) => {
let (_, v_ref) = self.table.read_mut(&idx);
*v_ref = v;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this method the same as insert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Good point! That means servo's only use of mangle:

                self.idmap.mangle(new_id, abstract_self,
                                  |_, new_node: AbstractNode| -> AbstractNode {
                                      new_node
                                  },
                                  |_, old_node: &mut AbstractNode, new_node: AbstractNode| {
                                      *old_node = new_node;
                                  });

should have been:

                self.idmap.insert(new_id, abstract_node);

@huonw
Copy link
Member

huonw commented Mar 30, 2014

There's only one use of mangle in both rustc and servo

Is that measured after mangle was removed?

@cgaebel
Copy link
Contributor Author

cgaebel commented Mar 30, 2014

@huonw When I landed the HashMap patch, hashmap.rs was the only file that was changed. Therefore mangle wasn't used by rustc.

Since I found one usage of mangle in servo (which was better written .insert), it's not used there either.

@alexcrichton
Copy link
Member

Closing due to inactivity, but feel free to reopen with a rebase!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HashMap.mangle no longer exists
7 participants