-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove "use axis after broadcasting" and change it or unspecify it #724
Comments
I will note that this applies to |
I agree with your general point about not interpreting |
The point is changing the text here. It is a bit unclear, but does seem to suggest to first broadcast. And I suspect that was the intention when it was written. EDIT: To be a bit clearer:
Notes "broadcasting", which to me suggests that the axis of the broadcasted arrays is taken much as if we were doing |
Ah, I see, yes, that text is inconsistent with the |
It is the same issue yes. But to note that this needs resolution (or rather for the sake of NUmPy, I will assume it will be resolved by either aligning with NumPy or unspecifying it). |
Looking at this and #617 again. I agree with @seberg here. I didn't realize that gufuncs use the axis before broadcasting. If gufuncs are doing this, then that makes this case stronger since the spec is disagreeing with gufunc behavior here. So I agree completely that we should change the spec to only specify axis when it is negative. Users can work around this by manually broadcasting first, but if it really becomes an issue, I would suggest adding In short, specifying an axis uniformly for multiple arrays (i.e., I will also note here that my implementation of >>> np.vecdot(np.empty((1, 2, 3, 4, 5)), np.empty((3, 4, 5)), axis=2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: vecdot: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n),(n)->() (size 5 is different from 3)
>>> import numpy.array_api as xp
>>> xp.vecdot(xp.empty((1, 2, 3, 4, 5)), xp.empty((3, 4, 5)), axis=2)
Array([[[[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]]]], dtype=np.array_api.float64) (I also apparently never implemented broadcasting for As far as I can tell, this is only an issue for I think it would be good to get this fixed for 2023 if possible. I will make a pull request. |
Nonnegative axes and negative axes less than the smaller of the two arrays are unspecified. This is because it is ambiguous in these cases whether the dimension should refer to the axis before or after broadcasting. Preciously, the spec stated it should refer to the dimension before broadcasting, but this deviates from NumPy gufunc behavior, and results in ambiguous and confusing situations, where, for instance, the result of a the function is different when the inputs are manually broadcasted together. Also clean up some of the cross text a little bit since the computed dimension must be exactly size 3. Fixes data-apis#724 Fixes data-apis#617 See the discussion in those issues for more details.
Fix at #740 |
Thanks, that all makes sense. Just for completeness, axes New in version 1.15. A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of (i,j),(j,k)->(i,k) appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted. |
Yes, I rather like that feature, although I'm not sure if it's appropriate for the array API. Maybe if a use-case comes up. |
I think swapping axes is fine, for |
This commit updates specification guidance in `vecdot` and `cross` to no longer explicitly support positive `axis` kwarg values. Previous specification guidance conflicts with NumPy gufuncs and restricting to negative integers removes ambiguity in determining over which axis to perform computation. This commit uses `should`, not `must`, to allow conforming libraries to support nonnegative `axis` values for backward compatibility. Closes: #724 Closes: #617 PR-URL: #740 Reviewed-by: Athan Reines <kgryte@gmail.com>
@mhvk noticed here that the definitions here are incompatible with NumPy gufuncs which I think is an oversight that was never noted/discussed. The "after broadcasting" looks reasonable on first sight, but TBH, doesn't really seem reasonable to me:
Assume broadcasting does something, you are specifying an axes that by definition has length 1 along the array, and also by definition has little meaning for the array which gets broadcast.
Now, is the alternative of using the non-broadcast axes better? Maybe not, maybe it is also very unclear and any positive axes (i.e. where broadcasting matters) is just too strange.
But, whether you like what NumPy does or not after thinking about it, it does seem if anything at least (maybe moreso) logical than the alternative.
So, I propose to change any place where broadcasting is mentioned w.r.t. to axes handling to define it as unspecified. Negative axis remain allowed, as it doesn't interfere with broadcasting. Negative axis beyond the number of existing dimensions prior to broadcasting will also be unspecified.
The last point looks necessary to me and strengthens the argument for NumPy's (or unspecified/error) logic:
matmul([1], [[2]], axis=(-2, -1))
should not be valid, since it might as well be a bug that the first array is not 2-D as it should be).EDIT: The above example should use
vecdot
, matmul doesn't work:vecdot(1, [2], axis=-1)
The text was updated successfully, but these errors were encountered: