std.algorithm.searching: minmaxElement #4248
std/algorithm/searching.d
Outdated
| return minmaxElement!(map, selector)(r, seed); | ||
| } | ||
|
|
||
| private auto minmaxElement(alias map = "a", alias selector = "a < b", Range, |
- Why not use the common map instead?
- A prototype with 2 seeds, one for max and one for min, looks more useful.
- Why do we need this function? Is it faster than reduce!(min, max)?
EDIT: 3. map!(...).cache.reduce!(min, max)
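For readers unfamiliar with the alternative being proposed here, the following is a minimal sketch of `reduce!(min, max)`. The helper name `minAndMax` and the sample data are illustrative assumptions, not code from the PR. When given several functions, `reduce` computes all results in a single pass and returns them as a tuple.

```d
import std.algorithm.comparison : max, min;
import std.algorithm.iteration : map, reduce;
import std.typecons : tuple;

/// Hypothetical helper, for illustration only: minimum and maximum in one pass.
/// The slice must not be empty, since no seed is passed to `reduce`.
auto minAndMax(int[] values)
{
    // With several functions, `reduce` returns a tuple with one result each.
    return values.reduce!(min, max);
}

unittest
{
    assert(minAndMax([3, 4, 5, 1, 2]) == tuple(1, 5));

    // On mapped values only the mapped results come back,
    // not the original elements that produced them.
    auto squares = [3, -4, 1].map!(a => a * a).reduce!(min, max);
    assert(squares == tuple(1, 16));
}
```

The second assertion shows the limitation raised in the replies: after `map`, the original element (here `-4`) is no longer recoverable from the result.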
|
@9il - the idea comes from #4221 and @nordlow and I do prefer If I think about it should be possible to templatize |
It all comes down to having the original element not the one mapped by a predicate. |
It will be faster than calling vs. |
|
Why not this:
alias minValue = (a, b) => a.value < b.value ? a : b;
alias maxValue = (a, b) => a.value > b.value ? a : b;
assert([3, 4, 5, 1, 2].enumerate.reduce!(minValue, maxValue) == tuple(tuple(3, 1), tuple(2, 5)));
? |
|
Could you please add a prototype with 2 seeds? |
b612065 to
3e208f4
Compare
done - I removed the one seed prototype as it didn't make sense to me anymore. |
3c2bad9 to
cf06f65
Compare
|
Needs a changelog entry. |
|
std/algorithm/searching.d
Outdated
| } | ||
|
|
||
| /// | ||
| //@safe pure unittest |
Fix this and the stdio import
|
I think this function is of limited value. But because there is an STL implementation, adding this will help people transitioning from C++. LGTM sans comments. |
| If the extreme element occurs multiple time, the first occurrence will be | ||
| returned. | ||
| This function is more efficient than calling both $(LREF minElement) and | ||
| $(LREF maxElement). |
Could go into a little more detail here. I think you should add something like
This function is more efficient than calling both $(LREF minElement) and $(LREF maxElement) for one range because this function only requires one scan of the range, whereas the former takes two. Also, calling both $(LREF minElement) and $(LREF maxElement) on the same range would require it to be a forward range.
This would help people understand this function's benefits more clearly.
More precisely, we should provide guarantees similar to the C++ version - per http://en.cppreference.com/w/cpp/algorithm/minmax_element: "At most max(floor(3/2(N−1)), 0) applications of the predicate, where N = std::distance(first, last)."
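As a quick check of where the quoted bound comes from, here is a sketch of the count, assuming the usual strategy of processing elements in pairs (one comparison inside the pair, one of the smaller element against the running minimum, one of the larger against the running maximum). For odd N the first element seeds both extremes; for even N the first two elements are ordered with a single comparison:

$$
C(N) =
\begin{cases}
\dfrac{3(N-1)}{2}, & N \text{ odd},\\[6pt]
1 + \dfrac{3(N-2)}{2} = \dfrac{3N}{2} - 2, & N \text{ even},
\end{cases}
\qquad\text{so}\qquad
C(N) = \left\lfloor \tfrac{3}{2}(N-1) \right\rfloor \text{ for } N \ge 1 .
$$

By comparison, tracking minimum and maximum independently costs $2(N-1)$ applications of the predicate.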
|
Ah, cool, I'd forgotten about the algorithmic trick. But I don't seem to see you applying it. The pattern goes:
if (r[i] < r[i + 1])
{
    if (r[i] < r[min]) min = i;
    if (r[i + 1] > r[max]) max = i + 1;
}
else
{
    if (r[i + 1] < r[min]) min = i + 1;
    if (r[i] > r[max]) max = i;
}
It's actually one of the FB interview questions :). So yes I approve the addition, but for the life of me I don't see where you implement the correct algorithm above. |
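To make the pattern above concrete, here is a minimal, self-contained sketch of the pairwise scan over an `int` slice. It is not the PR's implementation: the names `MinMaxIndex` and `minmaxIndexInPairs` are hypothetical, the comparison is hard-coded to `<`, and the counter exists only to show the number of comparisons. Which index is reported when extremes are tied is left unspecified in this sketch.

```d
/// Hypothetical result type for this sketch (not part of Phobos).
struct MinMaxIndex
{
    size_t min;         // index of a smallest element
    size_t max;         // index of a largest element
    size_t comparisons; // number of `<` applications performed
}

/// Pairwise min/max scan; `r` must not be empty.
MinMaxIndex minmaxIndexInPairs(const(int)[] r)
{
    assert(r.length > 0, "the slice must not be empty");
    size_t min = 0, max = 0, comparisons = 0;

    // Counting wrapper around the comparison.
    bool less(int a, int b) { ++comparisons; return a < b; }

    size_t i = 1;
    if (r.length % 2 == 0)
    {
        // Even length: one comparison orders the first two elements,
        // leaving an even number of elements for the pairwise loop.
        if (less(r[1], r[0])) min = 1; else max = 1;
        i = 2;
    }

    // Three comparisons per pair: one inside the pair, then the smaller
    // element against the minimum and the larger against the maximum.
    for (; i < r.length; i += 2)
    {
        if (less(r[i], r[i + 1]))
        {
            if (less(r[i], r[min])) min = i;
            if (less(r[max], r[i + 1])) max = i + 1;
        }
        else
        {
            if (less(r[i + 1], r[min])) min = i + 1;
            if (less(r[max], r[i])) max = i;
        }
    }
    return MinMaxIndex(min, max, comparisons);
}

unittest
{
    auto res = minmaxIndexInPairs([3, 4, 5, 1, 2]);
    assert(res.min == 3 && res.max == 2);
    assert(res.comparisons == 6); // 3 * (5 - 1) / 2
}
```

Seeding from the first element (odd length) or the first pair (even length) keeps the remaining element count even, so the loop never needs a special case for a leftover element.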
Yeah I tried it (see benchmarks below) and the "stupid" way seems to be a lot faster. https://github.com/gcc-mirror/gcc/blob/gcc-7_1_0-release/libstdc%2B%2B-v3/include/bits/stl_algo.h#L3332 I currently don't have time to dive more into it, but here are the benchmarks to compare the naive version and iteration in pairs: RandomAccessInputRange |
|
Heh, thanks @wilzbach. I've reproed your measurements. This is an interesting result. I'm not sure exactly what's going on yet. Take a look at https://godbolt.org/g/mXj3uW. There we have:
I'd say let's add minMaxElement with the guaranteed 3n/2 comparisons (which may be arbitrarily expensive); otherwise there is no merit to it over reduce!(min, max). We may specialize it for certain data types and the default comparison to take the brute force approach. Compiler experts @ibuclaw @JohanEngelen @klickverbot please take a look! FWIW dmd generates equally good/bad code for minmaxElementNoMap2 and minmaxElementNoMapInPairs2. Didn't try gdc yet. |
|
@andralex: Do you have a benchmark script for your experiments? |
@klickverbot: sorry that my link was so hidden at the end of the post. It's here: https://gist.github.com/wilzbach/3407d80bfa757d46a3ac59a873d5f085 |
|
@wilzbach: Thanks, but I was referring to Andrei's experiments in particular because I'm lazy. I guess I need to copy-paste over his code myself after all… ;) |
|
@klickverbot pasted the mess here: https://dpaste.dzfl.pl/09fbcf17f932 |
|
@andralex For the InPairs implementations, shouldn't the loop advance in pairs (+2)? (and then some extra work for odd length) |
|
@JohanEngelen I ignored the odd elements case, it has no bearing on measuring efficiency. Per lines 124 and 202, I advance in 2 increments and look at i - 1 and i in one pass. If you find any bug I'd be relieved! I'm getting results that are difficult to interpret. |
| { | ||
| alias mapFun = unaryFun!map; | ||
| alias selectorFun = binaryFun!selector; | ||
|
|
assert(!selector(maxSeed, minSeed));
| { | ||
| maxElement = r[i]; | ||
| maxElementMapped = mapElement; | ||
| } |
So here we should use the 3n/2 algorithm, even if technically slower for < and int. One great thing to do would be to specialize for this (and a few other) cases.
| MapType mapElement1 = mapFun(rawElement1); | ||
| r.popFront(); | ||
| // check if the range had an uneven amount of elements and thus has ended | ||
| if (r.empty) |
BUG: must return at the end of this if.
|
@wilzbach one more thing - where did you see the |
|
@andralex The code you linked to (#4248 (comment), https://godbolt.org/g/mXj3uW ) does not do the +2. |
|
@andralex Did you try "caching"
for (size_t i = 0; i < r.length; i += 2)
{
    uint j = selectorFun(r[i], r[i + 1]);
in case the compiler cannot/does not deduce that |
|
@JohanEngelen tried that, makes no difference. BUT! I found what seems to be an interesting performance bug. I stripped minmaxElementInPairsNoMap all the way down to this core loop:
for (size_t i = 0; i < r.length; i += 2)
{
}
Even if it literally does nothing, it still takes more than 2 times longer than
for (size_t i = 0; i < r.length; i += 1)
{
}
So ldc does not generate good code for loops that advance in a non-unit increment. I think you'd improve the life of many if you looked into that! |
|
@andralex That's because of integer overflow. If Edit: This kind of stuff is a lot of fun to work on @andralex ! Especially with such supertiny test cases. Hope you remember to file such things in our bugtracker ;-) ;-) |
|
@JohanEngelen fwiw I changed the implementation to use @wilzbach so let's stay with the |
|
I never needed this and lost interest in pursuing this PR. Sorry |
follow-up to #4221
An efficient combination of {min,max}Element.
Can be found in C++ as well:
http://en.cppreference.com/w/cpp/algorithm/minmax_element
Ping @nordlow.