Implement parallel::partition_copy. #2716
@taeguk: invoking the predicate twice for each element is not allowed, see here: http://en.cppreference.com/w/cpp/algorithm/partition_copy
@hkaiser You're right. I did not know that.
Yes, I agree. I think the only viable solution is to allocate that additional array of Booleans as you had it in the first place.
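To make the constraint concrete, here is a minimal sequential sketch of the "array of Booleans" idea. It is not the code from this PR: the name partition_copy_flags is hypothetical, and projections and parallel execution are left out. It only shows that storing the predicate results in a first pass lets the copy pass run without calling the predicate a second time.

```cpp
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper (an assumption for illustration, not HPX code):
// a two-pass partition_copy that evaluates pred exactly once per element.
template <typename FwdIter, typename OutIter1, typename OutIter2,
    typename Pred>
std::pair<OutIter1, OutIter2> partition_copy_flags(FwdIter first,
    FwdIter last, OutIter1 dest_true, OutIter2 dest_false, Pred pred)
{
    // Pass 1: call the predicate once per element and store the result.
    // unsigned char is used instead of bool to avoid the std::vector<bool>
    // proxy specialization.
    std::vector<unsigned char> flags;
    flags.reserve(static_cast<std::size_t>(std::distance(first, last)));
    for (FwdIter it = first; it != last; ++it)
        flags.push_back(pred(*it) ? 1 : 0);

    // Pass 2: copy based on the stored flags only; pred is not called again.
    std::size_t i = 0;
    for (FwdIter it = first; it != last; ++it, ++i)
    {
        if (flags[i])
            *dest_true++ = *it;
        else
            *dest_false++ = *it;
    }
    return std::make_pair(dest_true, dest_false);
}
```

The price is one extra flag per input element, which is the additional allocation mentioned above.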
// sequential partition_copy with projection function
template <typename InIter, typename OutIter1, typename OutIter2,
    typename Pred, typename Proj>
inline std::pair<OutIter1, OutIter2>
Just a small hint: member functions and templated functions are implicitly inline.
{
    while (first != last)
    {
        if (hpx::util::invoke(pred, hpx::util::invoke(proj, *first)))
You should perfect forward callable objects since those can be overloaded on r-value references:
struct my_callable {
    void operator()() && {
        // ...
    }
};
@Naios: Not sure if all of our compilers support that. We should add a feature test if we want to use this. Also, if we do, this could be applied in many places.
@Naios: To clarify, I meant the operator()() &&
@Naios: also, in this context, perfect forwarding wouldn't work as pred (and proj) shouldn't be invalidated (as it might be, if moved); pred will be used more than once.
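A small sketch of both points, using hypothetical names (stateful_callable, call_once, call_twice) that do not appear in the PR: a callable overloaded on the value category of the object can give its state away when invoked as an rvalue, so forwarding is only appropriate for a single invocation. Code that calls the object repeatedly, as an algorithm does with pred, should invoke it as an lvalue.

```cpp
#include <string>
#include <utility>

// Hypothetical callable, for illustration only: the call operator is
// overloaded on the value category of the object itself.
struct stateful_callable
{
    std::string state = "payload";

    // Selected when the callable is an lvalue: state is left untouched.
    std::string operator()() & { return state; }

    // Selected when the callable is an rvalue: may give its state away.
    std::string operator()() && { return std::move(state); }
};

// Invokes f once and forwards its value category, which is fine for a
// single call.
template <typename F>
std::string call_once(F&& f)
{
    return std::forward<F>(f)();
}

// Invokes f twice as an lvalue, so f stays valid between the calls even
// when the caller passed an rvalue.
template <typename F>
std::string call_twice(F&& f)
{
    std::string result = f();
    result += f();
    return result;
}
```

Inside call_twice the named parameter f is an lvalue, so the lvalue overload is selected for both calls and the state is never moved from.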
@hkaiser I guess all required compiler versions should support it:
- GCC since 4.8 or 4.9
- Clang since 3.3 or 3.4
- MSVC since 2015 (19.0)
However, supporting perfect forwarding for callable objects is the safer choice for the future.
EDIT, regarding the second comment: yes, that's right.
Also, the inspect tool is not happy with your code: https://7095-4455628-gh.circle-artifacts.com/0/tmp/circle-artifacts.KS6bV0d/hpx_inspect_report.html
I'd suggest keeping the work on the scan_partitioner separate from the partition_copy algorithm implementation.
@hkaiser Sorry, this is my mistake. This is just for backup. This commit will be removed.
@taeguk: could we merge this step by step? Could we merge the current state of Wrt bad performance: I'd suggest to make the change from
@hkaiser I had already done such a test. With I stopped the progress of this PR because of #2733. Do you want me to finish this PR without adapting it to #2733?
@hkaiser Sorry, I have one more thing to do for the Ranges TS. I will add I have a question. In your opinion, which is better: one PR covering both the implementation of a parallel algorithm and its adaptation for the Ranges TS, or two separate PRs, one for each?
… unit tests for parallel is_heap, is_heap_until, and partition_copy.
…s of parallel::partition_copy. And fix some tiny things.
@hkaiser Finally, I'm ready to be merged!
LGTM, thanks a lot!
@taeguk: Have you tried whether you get repeatable results when forcing the same seed? The different results could be caused by a different amount of work needed to perform the partitioning.
@hkaiser Oh, I got almost the same results when forcing the same seed! Great :)
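For readers who want to reproduce this kind of check, the following sketch shows one way to force the same seed. The helper make_input and the value range are assumptions and are not taken from the PR's benchmark. Seeding the engine with a fixed value regenerates the same input on every run with the same standard library, so the partitioning work stays constant between measurements.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical benchmark helper (not from the PR): builds the same input
// sequence whenever it is called with the same seed.
std::vector<int> make_input(std::size_t n, unsigned int seed)
{
    std::mt19937 gen(seed);    // fixed seed -> repeatable engine output
    std::uniform_int_distribution<int> dist(0, 99);

    std::vector<int> data(n);
    for (int& x : data)
        x = dist(gen);
    return data;
}
```

With identical input, the number of elements that satisfy the predicate is also identical, which removes one source of run-to-run variation from the timings.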
@taeguk: Thanks a lot for your work on this. That's much appreciated! Please get in contact with aserio on IRC so he can send you a STE||AR-group t-shirt ;)
Check Box
policy.executor() instead of hpx::launch::sync in scan_partitioner.
Issue List
scan_partitioner use policy.executor()).
Note
2017/6/25
I implemented partition_copy with reference to copy_if. The behavior is good, but the benchmark results are very bad. I must find a new, more efficient way and re-implement partition_copy.
2017/7/4
The bad performance is due to #2325. When using policy.executor() instead of hpx::launch::sync in scan_partitioner, the performance is good. (https://gist.github.com/taeguk/6abe03f9b4cb878872d2bb634cae65b0)