-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add take
specification for returning elements of an array along a specified axis
#416
Conversation
Should this PR be blocked until an initial spec release? |
Regarding |
Two thoughts:
|
@rgommers Before moving forward with this PR, we need to ensure buy-in from PyTorch. The current proposal deviates from supported PyTorch behavior in the following ways:
We'd need to ensure that PyTorch is (1) okay with adding an For NumPy-inspired libraries, the Accordingly, PyTorch is the odd one out here, atm, so would be good to get a feel for potential specification constraints in order to ensure broad API compatibility, while also addressing fancy indexing concerns (indexing arrays via arrays) and the proposal to require an |
This requires a backwards-incompatible change in NumPy as well, when we want to move to support in the main namespace. It'd be a good change to make though. Dask not following the NumPy default is fairly exceptional for Dask, and was probably done because the flattening behavior is indeed not useful (EDIT: it was added in dask/dask#26, no discussion there on I just had a look at SciPy, and it has quite a few |
I looked at all usages in scikit-learn and most inputs to
|
As a rule we don't allow array-like inputs in the array API standard, because (a) it's ill-defined, and (b) it turns out not to be the best idea to allow it anyway. So it's fine to be consistent with that here.
This makes sense to me. And seems useful, based on my analysis of how scikit-learn uses
I don't think there's an issue with progressing this PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kgryte, overall looks good - a few comments.
One formatting comment: please wrap lines at 80 characters, that's standard in Python and far easier to read/review.
Seems OK, since it is compatible with (Plausibly, in a few cases I am not immediately sure whether
|
Except integer array indexing is not in the standard (yet at least), which is why there was so much interest in adding
Good point, I overlooked that. |
As discussed in the Consortium meeting on 22 September 2022, consensus was to limit the For now, the API will not support multi-dimensional arrays for the This PR has been updated to reflect current consensus. |
As discussed in the 3 November 2022 meeting, no objections were raised. This PR should be ready for merge. @rgommers would you mind taking a last look and then approving? |
I went through the whole discussion here again, and it looks like we're all happy and there aren't any remaining blockers. This wasn't an easy one! In it goes, thanks all! |
This PR
take
, an API for returning elements of an array along a specified axis. This API was initially proposed in Proposal: add APIs for getting and setting elements via a list of indices (i.e.,take
,put
, etc) #177.axis
. This is derived from comparison data.axis
keyword argument always be provided, as suggested in comment, but only when the input array has more than one dimension.indices
treat the input array as a flattened 1-dimensional array. If we were to follow similarly, we'd need to specify the flattened layout.indices
argument. NumPy et al allow array-like. Torch requiresLongTensor
. Was not immediately clear to me how accepting we should be for non-arrays, so went with the most conservative choice.indices
argument. Instead, users should use normal slice/indexing operations.Questions
indices
? No, not yet.axis
argument be required? Yes, but only for multi-dimensionalx
.indices
argument be more liberal in the types of allowed arguments? No, should be restricted to onlyarray
values.