-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FlatMap, a vector map for extra #9653
Conversation
None => { | ||
let v = not_found(&k, a); | ||
self2.data.push((k,v)); | ||
match self2.data.mut_rev_iter().next() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must be a better way to do this, but I struggled to find one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this compile?
match self.find_mut(&k) {
Some(val) => {
found(&k, val, a);
return val
}
None => ()
}
let v = not_found(&k, a);
self.data.push((k,v));
match self.data.mut_rev_iter().next() {
Some(&(_, ref mut val)) => val,
None => unreachable!(),
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, but I was referring to the match self.data.mut_rev_iter().next()
line, wanting to find a better way to do that.
src/libextra/flatmap.rs:71:8: 71:17 error: cannot borrow `(*self).data` as mutable more than once at a time
src/libextra/flatmap.rs:71 self.data.push((k,v));
^~~~~~~~~
src/libextra/flatmap.rs:63:14: 63:19 note: second borrow of `(*self).data` as mutable occurs here
src/libextra/flatmap.rs:63 match self.find_mut(&k) {
^~~~~
src/libextra/flatmap.rs:72:14: 72:23 error: cannot borrow `(*(*self).data)[]` as mutable more than once at a time
src/libextra/flatmap.rs:72 match self.data.mut_rev_iter().next() {
^~~~~~~~~
src/libextra/flatmap.rs:63:14: 63:19 note: second borrow of `(*(*self).data)[]` as mutable occurs here
src/libextra/flatmap.rs:63 match self.find_mut(&k) {
|
Maybe. It looks like smallintmap's keys are the indexes into the vector, so it can be even smaller than FlatMap under the right conditions. |
This is some nice code, thanks! That being said, I'm not sure how many more containers libextra needs at the moment. This would be an excellent candidate for the "package incubator" type situation that we envision for libextra, but I'm not certain that this needs to be merged at this time. I could be wrong though, does this have a good use case that I'm not seeing? It seems that this has O(n) on almost all operations which seems undesirable? I could be overlooking something though. |
@sfackler: |
Oh yeah, nevermind. |
@alexcrichton I haven't benchmarked this implementation, but this type of container is generally faster than a hash map for small data sets like say http headers. I'm not taking advantage of sorting yet, but if you have sorted keys this type of container can be great for large immutable data sets too where the overhead of holes in a hash map becomes costly. Prior art: |
smallintmap is great for things like file descriptor to waiter mappings where you're fine possibly trading a little memory or insertion speed for a lot of speed on lookup. |
Oh interesting! This would actually be an excellent type to use in TLS currently as well... I'm always a fan of more cool stuff in rust, and others may want to weigh in on this as well, but this feels like it should wait for our "incubator" to manifest itself. |
I added some benchmarks and modified the code to keep the vector sorted. Before sorting:
After sorting:
|
Is it worth having the extra restriction of |
Good point, |
Erm, that's not a relaxation: fewer types are I actually meant this should be totally unordered, since we have other maps that cover that case. I.e. this should be a "pure" association list, which is the simplest map, and the only one that works for types that only implement |
@toffaletti: awesome! I kept meaning to add something like this, but never got around to it. Two comments:
|
@huonw I see, I thought you just meant it was silly I had @erickt boost calls it flat_map. The wikipedia link you posted says an association list is a linked list. flat map is a sorted vector, more like an array backed binary heap. FlatMap exploits how amazingly fast modern hardware is at accessing contiguous memory. It is memory efficient and faster than you might expect. |
I turned the benchmark code into a macro and added TreeMap, but it seems to be misbehaving. I randomly see benchmarks take 0 ns/iter. I couldn't find a bug reporting that problem, can anyone verify? I've made the imports in flatmap.rs so you can just |
If you see 0 ns/iter it often means that your iterations are taking too long, I believe the current threshold is 1ms for the maximum amount of time your iterations should take. |
I made https://github.com/toffaletti/rust/tree/flatmap-used to test some real-world usage of FlatMap by replacing libextra usages of TreeMap with FlatMap. It exposed some issues with FlatMap that I've fixed to make stage2-check pass. One thing worth mentioning is the changes ended up being much farther reaching than I'd hoped or expected because many things using the json module were depending on it using TreeMap to represent json::Object which seems like a code smell. I'm not sure how to quantify this for merging, mostly it was just a fun and educational experiment for me. I think @alexcrichton is probably right saying it can wait. |
As I mentioned on #9816, if we don't end up merging this, I would highly recommend putting this in a rustpkg repo on http://hiho.io/rust-ci/, because I'm sure others will find this useful! |
I too think this is a good data structure, but that we should not put it in libextra since it is soon to be dissolved and moved to various other places, and this probably doesn't belong in std. It would be nice to have a discussion of what data structures we need in std, and whether we should have a single, well-supported external package for more advanced or specialized data structures like this one. |
Closing for the above reasons. |
fix `box-default` linting `no_std` non-boxes This fixes rust-lang#9653 by doing the check against the `Box` type correctly even if `Box` isn't there, as in `no_std` code. Thanks to `@lukas-code` for opening the issue and supplying a reproducer! --- changelog: none
I don't expect this to be merged as-is, but I'd like some eyes on what I've done so far. I'd like to implement a FlatSet in this file also to have as much feature parity with hashmap.rs as possible.