-
-
Notifications
You must be signed in to change notification settings - Fork 450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing 1836, adding parallel::sort #1850
Conversation
Original implementation is available at git@github.com:fjtapia/sort_parallel.git HPX algorithm uses dataflow to avoid the use of a stack + spinlock and executes all steps using async calls which are run on HPX threads. No need to specify how many threads the algorithm is run on as it will simply spawn tasks and the thread handling will be taken care of by the HPX runtime.
… sort implementation Note : Cherry picked Doc updates from parallel_sort branch
…oured Lots of cleanups and compilation fixes for windows.
Internal struct holding comparator was going out of scope when async returns, causing segfaults inside the sort function. Remove struct and copy the comparator, the minor optimization of using only one comparator object would only be significant if the user supplies a comparator with complex internal state, which is unlikely.
{ | ||
typedef typename | ||
std::remove_reference<decltype(*(std::declval<RandomIt>()))>::type type; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like, this could be replaced by std::iterator_traits<RandomIt>::value_type
.
c2cb807
to
e20e482
Compare
7615519
to
6efb0d5
Compare
- adding projection parameter - adding more docs - renaming variables to be lower-case only - adding concept checks - fixing (suppressing) various warnings about possible loss of precision - switching to boost::random for older compilers - breaking long lines
I'm not sure why the std::random was changed to boost::random (which caused a link error in the latest build). We already require c++11 which provides |
Uhh ohh. My recollection was that not all of our platforms support |
The improvments look good to me |
I added some tests using std::string, but I am concerned that it can take >30secs for all the combinations to run as there are now quite a few and we need fairly long vectors to stress the test. Feel free to remove some combinations if the test is now too slow. |
Would it make sense to isolate the |
Depends on how strongly people feel about shortness of tests, there are about 60 combinations of test params in the current test (policy, type, comparison) and that's quite thorough, so some could be removed without much chance of letting bugs in. Do we really expect it to pass for |
We don't have a runtime limitation for single tests. The only known limitation is that Intel compilers tend to bail out on large tests as those instantiate a lot of different templates. |
//---------------------------------------------------------------- | ||
|
||
//------------------- check if sorted ---------------------------- | ||
if (detail::is_sorted_sequential(first, last, comp)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more thinking - I believe the is_sorted
test should be done for the leaves only (before line 101 above?). Currently the algorithm re-scans the input data over and over again. This can be a big hit for nearly sorted data, mainly.
The current code assumes that testing the data sequence for being already sorted adds less overheads than spawning the two sub-tasks for the two partitions. This might be a valid assumption for kernel threads, it might not be a correct assessment for HPX threads, however.
Actually, there's no need to call is_sorted at all. If the data is sorted then std::sort at the leaf will do the right thing, if not, it will still do the right thing. I removed it and the tests still pass. The speed difference is not significant because most of the time the data is not sorted and the is_sorted exits almost immediately. I'll remove it. |
I retract my previous comment. For small types (up to 128 bytes) removing is_sorted either makes no differnce, or helps slightly. For larger types and strings removing is_sorted has a significant detrimental impact on performance. I will investigate further. |
- this also fixes make_exceptional_future() for arbitrary exception types
Fixing 1836, adding parallel::sort
Only parallel::sort is included at the moment. I will work on stable_sort and sort by key (indirect sort) as time permits. Not expecting a merge right away, but let me know what is wrong with the code/style etc and if I've missed anything crucial.