permutation method in Random module overwrites the input array with permuted indices of the domain #16400
Comments
I would tend to agree that, to me, |
We already have I think it would be fine to change |
Hi @bradcray,
Here are some of the methods for generating random permutations in a few other frameworks that I could find:
NumPy
MATLAB
Go
@CaptainSharf - Thanks for bringing this up and providing a review of some other languages. I think just changing the behavior of our FWIW, this would follow the convention from numpy, where |
If we were to pursue an
```chapel
/* Return an array containing the shuffled elements of ``arr``.
   If ``inPlace == true``, shuffle the elements in place and return nothing. */
proc shuffle(arr: [], param inPlace=false, ...) { }

/* Return an array containing the shuffled indices of ``dom``. */
proc shuffle(dom: domain, ...) { }
```
@mppf has already expressed opposition to this proposal offline on the basis of not liking arguments impacting the function signature. It's probably not worth discussing
@ben-albrecht , Thanks for the feedback!!
@CaptainSharf - Yes, that is correct. Another design element: How can we properly deprecate the existing Whatever we decide on, we ought to update the deprecation guidelines to cover this special case for future deprecations. |
@ben-albrecht, yes, it seems to me we won't be able to deprecate the existing behaviour easily when an array is passed. There would be no issue in introducing an overload of permutation that takes a domain as its argument. Since shuffle randomly permutes the array in place, one can still get the behaviour we want from permute by saving a copy of the array before shuffling and restoring it later.
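The copy-then-shuffle workaround described here can be sketched as follows. This is only an illustration, not code from the thread: `shuffledCopy` is a hypothetical helper name, and the sketch assumes a Chapel release whose `Random` module exposes a top-level `shuffle` that accepts the array followed by an integer seed.

```chapel
use Random;

// Hypothetical helper (not part of the Random module):
// returns a shuffled copy and leaves `arr` untouched.
proc shuffledCopy(arr: [?D] ?t, seed: int): [D] t {
  var result = arr;       // array assignment copies the elements
  shuffle(result, seed);  // shuffles only the copy, in place
  return result;
}

var a = [10, 20, 30, 40, 50];
const b = shuffledCopy(a, seed=42);
writeln(a);  // still in its original order
writeln(b);  // same elements, shuffled order
```

The cost is one extra array allocation per call, which is the trade-off an out-of-place permute would also pay.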
Right, for the deprecation in this case, we can just provide |
To deal with the deprecation issue, I'd be inclined to name this routine Reading some of the examples and notes made me wonder whether these routines can/do work with multidimensional domains / arrays (and whether they do their permutations in a dimension-independent way, or whether a permutation would permute rows and columns without completely scrambling everything; for arrays, I'd expect the former, but would want to know what other multidimensional permutations do).
This seemed surprising to me. Zero might be a valid index of a given array (e.g.,
If the user happened to have an unused array that was the right size and distribution, passing in the array would save some time, but I'm not sure that use case is common enough to warrant complicating the interface at this point. We could always add an overload that took a domain and array later on if such cases arose in practice and were causing performance issues. This allows us to kick concerns like size mismatches, domain mismatches, different distributions down the road as well. |
Hi, @bradcray,
However, given that, I think it is possible to shuffle the contents of the array independent of the dimension. We could use the Fisher-Yates algorithm over the multidimensional array by mapping an N-D array index to a 1-D index.
```chapel
/* Shuffles the array independent of the dimension */
proc shuffleDeep(ref arr: [?D]) {
  // Walk the flat positions from the last one down to 0;
  // arr.size itself is one past the end, so it is excluded.
  for i in 0..<arr.size by -1 {
    // Internal Random-module names, kept as written in the original post
    var flatRandIdx = randlc_bounded(D.idxType, PCGRandomStreamPrivate_rngs, seed, PCGRandomStreamPrivate_count, 0, i);
    var randIdx = getNdIndices(arr, flatRandIdx);
    var currIdx = getNdIndices(arr, i);
    arr[randIdx] <=> arr[currIdx];
  }
}

/* Transforms a flat index to an array index */
proc getNdIndices(arr: [], in flatInd) {
  // A tuple (rather than an array) so the result can index `arr` directly
  var idx: arr.domain.rank*int;
  var div = arr.size;
  for i in 0..arr.domain.rank-1 by -1 {
    div /= arr.domain.dim(i).size;
    const lo = arr.domain.dim(i).low;
    const stride = abs(arr.domain.dim(i).stride);
    const zeroInd = flatInd / div;
    const currInd = lo + (zeroInd * stride);
    idx[i] = currInd;
    flatInd = flatInd % div;
  }
  return idx;
}
```
I would say that this behavior isn't appropriate for Chapel or languages with true 2D arrays, and that it makes more sense for languages in which a 2D array is really a 1D array of 1D arrays (which, I would argue, the formatting of your Python example suggests it has). And I agree with the approach of thinking of the domain's indices as being in a 1D / ordinal space, permuting there, then transforming back to the multidimensional space. Some existing routines (like, I think, indexToOrder and orderToIndex, though I may not have those names right, and don't have time to check right now) should help with this.
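The ordinal-space approach described in this comment can be sketched with the domain method `orderToIndex` (its inverse on rectangular domains is `indexOrder`). This is a rough sketch under assumptions, not the implementation from the thread or the linked PR: `shuffleFlat` is a hypothetical name, the array is assumed to have a rectangular domain, and the `randomStream` record with `next(min, max)` assumes a Chapel 2.x `Random` module.

```chapel
use Random;

// Hypothetical helper: Fisher-Yates over the ordinal (0-based, flat)
// positions of a rectangular domain, mapped back to N-D indices.
proc shuffleFlat(ref arr: [?D] ?t, seed: int) {
  var rs = new randomStream(int, seed);
  const n = D.size;
  for i in 1..<n by -1 {
    const j = rs.next(0, i);  // random ordinal in 0..i (inclusive)
    arr[D.orderToIndex(i)] <=> arr[D.orderToIndex(j)];
  }
}

var m: [0..2, 0..3] int;
forall (i, j) in m.domain do m[i, j] = i * 4 + j;

shuffleFlat(m, seed=7);
writeln(m);  // same 12 values, scrambled across rows and columns
```

Because the permutation is drawn in the flat ordinal space, every element can land in any position, rather than whole rows or columns moving together.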
I'd be OK with
@bradcray - I agree with your take on multi-dimensional shuffling. Further discussion on this might benefit from a separate issue so that we can keep this issue/task scoped to deprecating
@CaptainSharf - FYI you can start your code blocks with
Hi, |
I thought we had the reverse operation as well, but am not finding it with a quick look... (I think we should). As far as I know, these are not supported on associative domains at present. |
I found the method |
Hi @ben-albrecht @bradcray , |
If you have working copies of those routines, why not add them to the |
Hi @bradcray , @ben-albrecht , @mppf , |
I don't see any problem with adding that. |
Sounds great, thanks @CaptainSharf |
H! |
Deprecates `randomStream.permutation` and the top level `permutation` procedures in favor of a new `permute` interface. The new methods have the following signatures:
```chapel
proc permute(a: [?d] ?t): [d] t { ... }
proc permute(d: domain(1,?)): [] d.idxType { ... }
proc permute(r: range(bounds=boundKind.both, ?)): [] r.idxType { ... }
```
In each case, an array with a random permutation of the argument is returned. This differs from the old interface, which overwrote the array argument with a permutation of the array's domain.
Resolves: #16400
- [x] paratest
[ reviewed by @jabraham17 ] - thanks!
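A brief usage sketch of the three overloads listed in this PR description may help. It assumes a Chapel release that includes this change and that the top-level `permute` procedures can be called without an explicit seed; the variable names are illustrative only.

```chapel
use Random;

var a = [0.1, 0.2, 0.3, 0.4, 0.5];

const p   = permute(a);         // new array holding the elements of `a` in random order
const idx = permute(a.domain);  // random ordering of the indices 0..4
const r   = permute(1..5);      // random ordering of the values 1..5

writeln(a);    // unchanged: permute does not overwrite its argument
writeln(p);
writeln(idx);
writeln(r);
```

Unlike the deprecated `permutation`, none of these calls modifies `a`, which is the behaviour the original report asked for.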
@jeremiah-corrado: Just noticing that this is closed now with your recent work. Would you check whether open PR #17053, which looks like it was filed for this issue, has additional value or could/should be closed now?
The multidimensional shuffle/permute implementation in that PR came from a somewhat tangential discussion in this issue. The original topic was only about changing the
The PR itself is pretty out of date with respect to what we have on main at this point. I think it can be closed but referenced from the issue.
Summary of Problem
Invoking the permutation method on an array overwrites the array with permuted indices of its domain. Shouldn't the method return a new permuted version of the input array? Would it be a good idea to modify the current permutation API to take an extra parameter (like inPlace=true) that determines whether the input array is permuted in place (like shuffle)?
Steps to Reproduce
Compile and run the below source code
Source Code:
Compile command:
chpl permutations.chpl --fast
Execution command:
./permutations
Output
Array before permuting is 0.16559 0.754886 0.615159 0.171141 0.539561
Array after permuting is 1.0 5.0 4.0 3.0 2.0
Configuration Information
chpl --version: chpl version 1.22.1
gcc --version: Apple LLVM version 10.0.1 (clang-1001.0.46.4)