Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFM: Added permute!, permute!! (renamed permute -> permutedims) #2201

Merged
merged 1 commit into from
Feb 8, 2013

Conversation

kmsquire
Copy link
Member

@kmsquire kmsquire commented Feb 6, 2013

These functions apply a permutation to a vector (or strings, whatever) in place. They're slower than simply writing a[p], where p is a permutation, but are useful when memory is a concern. In particular, I'm wanting to use a version of them for sorting DataFrames.

Also added findnext, and defined findfirst in terms of findnext (used by permute!!).

I could just put these in DataFrames, but thought they would be more generally useful.

Unfortunately, the names conflict with permute, which permutes the order of the dimensions of an array. So either permute or permute!{!} should probably be renamed. I don't really like the naming of permute, but it has historical precedence (and matlab users would be very confused if it changed...).

Edit: per the discussion below, permute was renamed to permutedims.

@diegozea
Copy link
Contributor

diegozea commented Feb 6, 2013

Why is the reason of the second ! on permute!! ?

@pao
Copy link
Member

pao commented Feb 6, 2013

Maybe reorder()?

@JeffBezanson
Copy link
Member

The current permute could be a method of transpose with a second argument. That would be nicer. Then the function here can be called permute and permute!, which makes much more sense. findfirst and findnext with an element value argument can now be replaced by search, which already does the same thing for strings.

@kmsquire
Copy link
Member Author

kmsquire commented Feb 6, 2013

permute!! modifies both arguments (i.e., it clobbers the permutation). permute! only modifies the array in place.

The main trick for permuting in place is finding the start of each cycle in the permutation. The literature suggests testing successive indices in the permutation array to see if it is the min element in a new cycle, but these papers were concerned with permuting very small lists, and doing this is really slow.

I also tried keeping a separate BitArray to track indices which have been permuted, but the current implementation is at least 2x faster on arrays of 1,000,000 elements, even when copying the permutation array.

The double-bang version is still useful to export if you want to reuse a permutation and copy! it yourself (which I do in sort! for DataFrames).

Regarding naming: how about permutedims for permute? This would allow matlab users to find it easily enough with tab completion.

I'm also okay with reorder! for permute!, although permute! goes better with sortperm (and reorder! would go better if sortperm were called order, as the R folks would like).

@kmsquire
Copy link
Member Author

kmsquire commented Feb 6, 2013

transpose for permute would be fine too. When there's more of a concensus, I'll update the patch.

@JeffBezanson
Copy link
Member

permutedims is also a reasonable name.

@kmsquire
Copy link
Member Author

kmsquire commented Feb 7, 2013

I've changed permute to permutedims. At some point permute(a,p) should probably become equivalent to a[p], perhaps checking if p is a permutation.

@JeffBezanson
Copy link
Member

We can probably just keep permute!! and call it permute!.

@kmsquire
Copy link
Member Author

kmsquire commented Feb 7, 2013

Can I suggest instead keeping permute!, and leaving but not exporting permute!!? I think that the expected behavior of permute! should be to not modify the permutation.

@kmsquire
Copy link
Member Author

kmsquire commented Feb 7, 2013

But I can change permute!! to permute! if you prefer, and get rid of the current permute! definition.

@JeffBezanson
Copy link
Member

OK, I would like to merge this very soon so we can get the renames in for 0.1.
The commented-out code should either be removed, or add a comment explaining what it does and why it is disabled. It would also be good to have a couple tests.

@kmsquire
Copy link
Member Author

kmsquire commented Feb 8, 2013

Sorry about the commented-out code. I had removed it at one point, but I forgot to push that change.

I finished updating the documentation and added some tests. The tests are currently part of sorting, though it might be good to create a set of combinatorics tests, since not many of those functions are currently tested.

@kmsquire
Copy link
Member Author

kmsquire commented Feb 8, 2013

Just noticed that I forgot to update permutedims/ipermutedims tests.

* these permute an array according to a given permutation
* permute renamed to permutedims
* findnext defined (and used by permute!)
* findfirst defined in terms of findnext
@kmsquire
Copy link
Member Author

kmsquire commented Feb 8, 2013

Fixed now.

@JeffBezanson
Copy link
Member

This is great. The test and doc updates are much appreciated!

JeffBezanson added a commit that referenced this pull request Feb 8, 2013
RFM: Added permute!, permute!! (renamed permute -> permutedims)
@JeffBezanson JeffBezanson merged commit 24a5bac into JuliaLang:master Feb 8, 2013
@kmsquire kmsquire deleted the permute branch February 8, 2013 05:38
@StefanKarpinski
Copy link
Member

Yes, I think that permutedims is a great name – it's super clear. Having an addition argument to transpose might be nice, but it's not nearly as clear to read. I'm even warming to the idea of having permute!! as a function name. I mean, there's no way you're going to use that by accident, right? But it's there if you just happen to need it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants