Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing 1836, adding parallel::sort #1850

Merged
merged 21 commits into from
Nov 10, 2015
Merged

Fixing 1836, adding parallel::sort #1850

merged 21 commits into from
Nov 10, 2015

Conversation

biddisco
Copy link
Contributor

@biddisco biddisco commented Nov 4, 2015

Only parallel::sort is included at the moment. I will work on stable_sort and sort by key (indirect sort) as time permits. Not expecting a merge right away, but let me know what is wrong with the code/style etc and if I've missed anything crucial.

biddisco and others added 12 commits October 31, 2015 23:50
Original implementation is available at
  git@github.com:fjtapia/sort_parallel.git

HPX algorithm uses dataflow to avoid the use of a stack + spinlock
and executes all steps using async calls which are run on HPX threads.

No need to specify how many threads the algorithm is run on as it will
simply spawn tasks and the thread handling will be taken care of by
the HPX runtime.
… sort implementation

Note : Cherry picked Doc updates from parallel_sort branch
…oured

Lots of cleanups and compilation fixes for windows.
Internal struct holding comparator was going out of scope
when async returns, causing segfaults inside the sort function.

Remove struct and copy the comparator, the minor optimization
of using only one comparator object would only be significant if the user
supplies a comparator with complex internal state, which is unlikely.
{
typedef typename
std::remove_reference<decltype(*(std::declval<RandomIt>()))>::type type;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like, this could be replaced by std::iterator_traits<RandomIt>::value_type.

@biddisco biddisco force-pushed the fixing_1836 branch 3 times, most recently from c2cb807 to e20e482 Compare November 5, 2015 08:19
@biddisco biddisco force-pushed the fixing_1836 branch 2 times, most recently from 7615519 to 6efb0d5 Compare November 5, 2015 10:32
@hkaiser hkaiser added this to the 0.9.11 milestone Nov 5, 2015
- adding projection parameter
- adding more docs
- renaming variables to be lower-case only
- adding concept checks
- fixing (suppressing) various warnings about possible loss of precision
- switching to boost::random for older compilers
- breaking long lines
@biddisco
Copy link
Contributor Author

biddisco commented Nov 6, 2015

I'm not sure why the std::random was changed to boost::random (which caused a link error in the latest build). We already require c++11 which provides <random>.

@hkaiser
Copy link
Member

hkaiser commented Nov 6, 2015

I'm not sure why the std::random was changed to boost::random (which caused a link error in the latest build). We already require c++11 which provides .

Uhh ohh. My recollection was that not all of our platforms support random_device. I apologize if I broke something. I'll see what I can do to rectify the problem asap.

@biddisco
Copy link
Contributor Author

biddisco commented Nov 6, 2015

The improvments look good to me

@biddisco
Copy link
Contributor Author

biddisco commented Nov 7, 2015

I added some tests using std::string, but I am concerned that it can take >30secs for all the combinations to run as there are now quite a few and we need fairly long vectors to stress the test. Feel free to remove some combinations if the test is now too slow.

@hkaiser
Copy link
Member

hkaiser commented Nov 7, 2015

Would it make sense to isolate the std::string tests into a separate test-executable?

@biddisco
Copy link
Contributor Author

biddisco commented Nov 7, 2015

Depends on how strongly people feel about shortness of tests, there are about 60 combinations of test params in the current test (policy, type, comparison) and that's quite thorough, so some could be removed without much chance of letting bugs in. Do we really expect it to pass for sort<int> but fail for sort<double> - I doubt it, so I'll thin the tests down a bit.
Do we have a target test run time? (ie less than N seconds preferred - on an average workstation)

@hkaiser
Copy link
Member

hkaiser commented Nov 7, 2015

We don't have a runtime limitation for single tests. The only known limitation is that Intel compilers tend to bail out on large tests as those instantiate a lot of different templates.

//----------------------------------------------------------------

//------------------- check if sorted ----------------------------
if (detail::is_sorted_sequential(first, last, comp))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After more thinking - I believe the is_sorted test should be done for the leaves only (before line 101 above?). Currently the algorithm re-scans the input data over and over again. This can be a big hit for nearly sorted data, mainly.

The current code assumes that testing the data sequence for being already sorted adds less overheads than spawning the two sub-tasks for the two partitions. This might be a valid assumption for kernel threads, it might not be a correct assessment for HPX threads, however.

@biddisco
Copy link
Contributor Author

biddisco commented Nov 7, 2015

Actually, there's no need to call is_sorted at all. If the data is sorted then std::sort at the leaf will do the right thing, if not, it will still do the right thing. I removed it and the tests still pass. The speed difference is not significant because most of the time the data is not sorted and the is_sorted exits almost immediately. I'll remove it.

@biddisco
Copy link
Contributor Author

biddisco commented Nov 8, 2015

I retract my previous comment. For small types (up to 128 bytes) removing is_sorted either makes no differnce, or helps slightly. For larger types and strings removing is_sorted has a significant detrimental impact on performance. I will investigate further.

- this also fixes make_exceptional_future() for arbitrary exception types
hkaiser added a commit that referenced this pull request Nov 10, 2015
Fixing 1836, adding parallel::sort
@hkaiser hkaiser merged commit 81f8020 into release Nov 10, 2015
@hkaiser hkaiser deleted the fixing_1836 branch November 10, 2015 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants