Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP/RFC: Conversion/elimination of sort related functions (cf. #1850, #1539) #2066

Closed
wants to merge 3 commits into from

Conversation

kmsquire
Copy link
Member

This patch was prompted by the discussion in #1850, though it goes a bit beyond that:

  • Remove
    • insertionsort,
    • quicksort,
    • mergesort,
    • timsort
  • in favor of
abstract SortingAlgorithm
type QuickSort <: SortingAlgorithm end
type InsertionSort <: SortingAlgorithm end
type MergeSort <: SortingAlgorithm end
type TimSort <: SortingAlgorithm end

function sort(InsertionSort,v)
...
end

function sort(QuickSort,v)
...
end

...
  • Removed underscores from search related function names:
    • sort_by -> sortby
    • issorted_{by,r} -> issorted{by,r}
    • select_{by,r} -> select{by,r}
    • search_sorted_* -> searchsorted*

There is one general and a couple of specific issues I need feedback on:

  1. What do people think of the sort(SortName, v) syntax? My opinion is that it's clearer to write and use, but I could go either way.

  2. @in_place_matrix_op is a macro which creates versions of a function which operate along a dimension of a matrix. In this patch it is simplified to inline code generation. However, shuffle in combinatorics.jl also used this macro, and I'm not sure best how to proceed.

  3. at @toivoh's suggestion in rename: sort_by => sortby ? #1850, the function signature for specialized sorts is based on

    function sort(::Type{MySort}, v::Vector)

    However, since various forms of this function (sort, sortby, sortr, etc.) are produced using code generation, I had to jump through some hoops in order to properly prepend the type to the function arguments (see, e.g., https://github.com/JuliaLang/julia/pull/2066/files#L3R85). Is there a better way to do this?

There might be a very slight performance hit (up to a few percent), though the machine I tested this on was under load, so it's not clear.

Any other general comments/suggestions welcome.

* Remove insertionsort, quicksort, mergesort, timsort in
  favor of sort(InsertionSort,...), sort(QuickSort,...), etc.
* Removed underscores from search related function names:
  * sort_by -> sortby
  * search_sorted_* -> searchsorted*
  * issorted_{by,r} -> issorted{by,r}
  * etc.
@johnmyleswhite
Copy link
Member

+1 for these names

@nolta
Copy link
Member

nolta commented Jan 18, 2013

Re: the sort(SortName, v) syntax, another option might be to put each algorithm in a separate module, i.e., Timsort.sort(), Mergesort.sort(), etc.

@ViralBShah
Copy link
Member

I am ok with this - and would like to merge it. @kmsquire Can you rebase?

@StefanKarpinski
Copy link
Member

This is a big change, let's not get antsy and jump the gun here. I have a few comments – I've been thinking about this.

@StefanKarpinski
Copy link
Member

Ok, so some thoughts. I'm certainly on board with the renaming part of this. The sort(Algorithm, ...) part is a tougher call. Clearly, writing sort(QuickSort, v) is longer and less clear than writing quicksort(v), so there should be some big wins if we're going to go with the former.

To me the wins would come from the multiplication of sorting algorithms by generic functions that can be parameterized by those algorithms. For example, the following generic functions can be parameterized by sorting algorithms in this branch:

sort,
sort!,
sortby,
sortby!,
sortr,
sortr!,
sortperm,
sortperm!,
sortpermr,
sortpermr!,
sortpermby,
sortpermby!

Previously, sortpermr!(MergeSort,v) was expressed as mergesortpermr!(v). We didn't even have all of these since that would have been 12 x 3 = 36 functions. Since the new version has these 12 generic functions plus 3 sorting types, we're saving 21 "things". So that's a clear savings.

The two questions for me are:

  1. Do we need all these combinations?
  2. Can we express this in a simpler way?

I'm not sure if we really need all these combinations, but it seems kind of good to be able to compute, e.g. sort permutations, using different sorting algorithms easily. Sometimes you need that kind of control. I'm not sure if we can express this in a simpler way, e.g. using higher order functions.

One thought I've had is taking this even further: expressing sortby and sortr types of things using sorting types. Something like this:

sort!(By, v)
sort!(Reverse, v)
sort!(By{QuickSort}, v)
sort!(Reverse{MergeSort}, v)

This would allow us to pare down the number of sorting functions to just these four:

sort,
sort!,
sortperm,
sortperm!,

while adding only two more objects – Sort.By and Sort.Reverse. That gets us down to 4 functions and 5 types, so 9 things total. However, it's a bit of a weird interface to sorting, so I'm not sure it's worth it.

Another thought on reverse sorting is that we might not need a function for it – instead, you can just sort and then reverse, as long as we have a very fast, optimized reverse! function, that could be acceptable.

@StefanKarpinski
Copy link
Member

I forgot TimSort, which blows these numbers up even more.

@ViralBShah
Copy link
Member

@StefanKarpinski The suggestion is compact, but definitely a bit weird. Rather than get rid of the sortr, it could just be hidden in the module for whoever wants it.

How about sort(MergeSort{Reverse}), sort(MergeSort{By})?

StefanKarpinski added a commit that referenced this pull request Jan 23, 2013
Refactor of sorting, taking #2066 to its logical conclusion.
StefanKarpinski added a commit that referenced this pull request Jan 24, 2013
Refactor of sorting, taking #2066 to its logical conclusion.
@kmsquire
Copy link
Member Author

Included in/superceded by f4b378b (Thanks Stefan!)

@kmsquire kmsquire closed this Jan 24, 2013
@kmsquire kmsquire deleted the sort-updates branch January 24, 2013 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants