-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTreeSet intersection, is_subset & difference optimizations #64820
Conversation
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meanwhile I tweaked the order in both match expressions to move first/min before last/max
Property based tests and performance comparison by travis are now cleaned up and as complete as I can think off. |
let mut other_iter = other.iter(); | ||
let other_min = other_iter.next().unwrap(); | ||
let other_max = other_iter.next_back().unwrap(); | ||
let mut self_iter = match (self_min.cmp(other_min), self_max.cmp(other_max)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the previous method you use the Ord::cmp(x, y)
style and here x.cmp(y)
. Either is fine but consistency is best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never noticed that. Let's count: we have 4 x Ord::cmp, 3 x cmp (counting pairs as one). Before I started messing about in this code, there was just 1 Ord::cmp and 2 cmp. Notice that cmp_opt
acts as a replacement for Ord::cmp
but uses cmp itself.
So I say, use the shorter, member cmp.
Nice! Cool benchmark setup. I only had nitpicks to contribute to the review. Would love if there was a way to write this without .unwrap() (using discriminants for control flow instead), but it is clear enough that they can never panic here. r=me when nitpicks are fixed to taste |
Oh nice! -- Could we add the proptests to the test suite? cc @alexcrichton @nikomatsakis |
I tried several times, but always hit unsavory amounts of indentation, remote else clauses or eRFC 2497. But now I think I saw the light, resulting in a little less code that is more readable (mostly by dropping some of the micro-optimization). Peculiar indentation courtesy of cargo fmt. r=bluss |
{ | ||
(other_min, other_max) | ||
} else { | ||
return false; // other is empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This else-part cannot be reached, due to the performance shortcut on top. It's possible to:
- merge this let if with the if let above, but then it's not at all clear to the casual reader that it should return true
- write a panic! explaining this, better than raw unwrap I guess, but pointless extra code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unreachable!("message")
is the panic for that, but since we don't need a panic - false is correct, it seems this works just as well.
I don't know what test suites there are, but seeing |
@bors r+ rollup Thanks! |
📌 Commit d132a70 has been approved by |
BTreeSet intersection, is_subset & difference optimizations ...based on the range of values contained; in particular, a massive improvement when these ranges are disjoint (or merely touching), like in the neg-vs-pos benchmarks already in liballoc. Inspired by rust-lang#64383 but none of the ideas there worked out. I introduced another variant in IntersectionInner and in DifferenceInner, because I couldn't find a way to initialize these iterators as empty if there's no empty set around. Also, reduced the size of "large" sets in test cases - if Miri can't handle it, it was needlessly slowing down everyone.
BTreeSet intersection, is_subset & difference optimizations ...based on the range of values contained; in particular, a massive improvement when these ranges are disjoint (or merely touching), like in the neg-vs-pos benchmarks already in liballoc. Inspired by rust-lang#64383 but none of the ideas there worked out. I introduced another variant in IntersectionInner and in DifferenceInner, because I couldn't find a way to initialize these iterators as empty if there's no empty set around. Also, reduced the size of "large" sets in test cases - if Miri can't handle it, it was needlessly slowing down everyone.
BTreeSet intersection, is_subset & difference optimizations ...based on the range of values contained; in particular, a massive improvement when these ranges are disjoint (or merely touching), like in the neg-vs-pos benchmarks already in liballoc. Inspired by rust-lang#64383 but none of the ideas there worked out. I introduced another variant in IntersectionInner and in DifferenceInner, because I couldn't find a way to initialize these iterators as empty if there's no empty set around. Also, reduced the size of "large" sets in test cases - if Miri can't handle it, it was needlessly slowing down everyone.
Rollup of 7 pull requests Successful merges: - #63416 (apfloat: improve doc comments) - #64820 (BTreeSet intersection, is_subset & difference optimizations) - #64910 (syntax: cleanup param, method, and misc parsing) - #64912 (Remove unneeded `fn main` blocks from docs) - #64933 (Fixes #64919. Suggest fix based on operator precendence.) - #64943 (Add lower bound doctests for `saturating_{add,sub}` signed ints) - #64950 (Simplify interners) Failed merges: r? @ghost
BTreeSet symmetric_difference & union optimized No scalability changes, but: - Grew the cmp_opt function (shared by symmetric_difference & union) into a MergeIter, with less memory overhead than the pairs of Peekable iterators now, speeding up ~20% on my machine (not so clear on Travis though, I actually switched it off there because it wasn't consistent about identical code). Mainly meant to improve readability by sharing code, though it does end up using more lines of code. Extending and reusing the MergeIter in btree_map might be better, but I'm not sure that's possible or desirable. This MergeIter probably pretends to be more generic than it is, yet doesn't declare to be an iterator because there's no need to, it's only there to help construct genuine iterators SymmetricDifference & Union. - Compact the code of rust-lang#64820 by moving if/else into match guards. r? @bluss
BTreeSet symmetric_difference & union optimized No scalability changes, but: - Grew the cmp_opt function (shared by symmetric_difference & union) into a MergeIter, with less memory overhead than the pairs of Peekable iterators now, speeding up ~20% on my machine (not so clear on Travis though, I actually switched it off there because it wasn't consistent about identical code). Mainly meant to improve readability by sharing code, though it does end up using more lines of code. Extending and reusing the MergeIter in btree_map might be better, but I'm not sure that's possible or desirable. This MergeIter probably pretends to be more generic than it is, yet doesn't declare to be an iterator because there's no need to, it's only there to help construct genuine iterators SymmetricDifference & Union. - Compact the code of rust-lang#64820 by moving if/else into match guards. r? @bluss
...based on the range of values contained; in particular, a massive improvement when these ranges are disjoint (or merely touching), like in the neg-vs-pos benchmarks already in liballoc. Inspired by #64383 but none of the ideas there worked out.
I introduced another variant in IntersectionInner and in DifferenceInner, because I couldn't find a way to initialize these iterators as empty if there's no empty set around.
Also, reduced the size of "large" sets in test cases - if Miri can't handle it, it was needlessly slowing down everyone.