Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What to do about order() #176

Closed
tshort opened this issue Feb 2, 2013 · 10 comments
Closed

What to do about order() #176

tshort opened this issue Feb 2, 2013 · 10 comments

Comments

@tshort
Copy link
Contributor

tshort commented Feb 2, 2013

order() is no longer a Base function. It has been superseded by sortperm. Also, we don't have something like R's order that can order by multiple vectors. We use order in a few places, so how should we fix this? Options might include:

  1. Replace order with sortperm.
  2. Define order based on sortperm. If so, how much of sortperm's flexibility should we pull in?
  3. Extend either of sortperm or order to handle multiple vector ordering.
  4. Extend sortby to cover multiple columns.

I prefer having order (I don't like the name sortperm).

@johnmyleswhite
Copy link
Contributor

I really dislike the name sortperm now that it's become the name for order and not something more interesting. I would prefer that we extend sortby because I feel like order should be kept as an atomic function that doesn't do any ordering, but tells you the operations needed to put things in order. In contrast, I think of sortby as doing the ordering.

@tshort
Copy link
Contributor Author

tshort commented Feb 2, 2013

I think we're on the same page. Okay to have order and sortby with sortby calling order?

@johnmyleswhite
Copy link
Contributor

Definitely. I would like order to be restored as the primitive.

@StefanKarpinski
Copy link
Member

Order strikes me as an extremely overloaded term. I have to confess that I don't really get the objection to sortperm. Is the issue its new behavior or the name itself?

@johnmyleswhite
Copy link
Contributor

My concern is twofold:

  • The behavior is strange, because we now use a Matlab jargon term but don't perform the equivalent Matlab action.
  • The name is strange because sortperm isn't actual a mathematical term: it's Matlab jargon that R users would never even consider trying out when looking for the order function. order is a proper mathematical term, although, as you've said it's badly overloaded. But it is the term people would use in a textbook, unless they used the term rank, which is really the term used in most statistics textbooks. That term is also overloaded, but it is consistent with the principle of Calling Things What They Are Called.

@StefanKarpinski
Copy link
Member

I don't think that sortperm is actually from Matlab (@ViralBShah, @JeffBezanson?), I think we just chose the name at some point and it stuck – previously for sorting and returning a permutation. In Matlab you just use the sort function with two arguments for that and do [~,p] = sort(v) for what sortperm currently does. I like the name because it gives a permutation of v that puts it in sorted order, which is, to me, pretty intuitive.

@ViralBShah
Copy link
Contributor

I think order has too many different meanings for different branches of mathematics. sortperm is not actually a matlab name and is something I started using when I first wrote sort.

Perhaps the DataFrames package can have an order function, which also takes care of the NAs and such.

@StefanKarpinski
Copy link
Member

We should use the same name in base and data frames.

@tshort
Copy link
Contributor Author

tshort commented Feb 3, 2013

When I first saw sortperm, I thought "sort permanently". I only made the connection to order by digging through git history. Even when I discovered perm meant permutation, I had to go to the dictionary to find what it meant. Not a big deal, though. With the new sortby capability that Kevin Squire just added, that reduces the need for order.

@tshort
Copy link
Contributor Author

tshort commented Feb 11, 2013

I'm going to close this issue. Kevin's additions (#177) get us consistent with base and get us multi-column sorting.

@tshort tshort closed this as completed Feb 11, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants