Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we be implementing __array__? #67

Open
asmeurer opened this issue Oct 15, 2024 · 15 comments · Fixed by #69
Open

Should we be implementing __array__? #67

asmeurer opened this issue Oct 15, 2024 · 15 comments · Fixed by #69

Comments

@asmeurer
Copy link
Member

Right now the array type implements __array__, but just as convenience. This was done back when the package was numpy.array_api.

I'm wondering if we should actually remove this. It isn't part of the array API standard, and could just lead people incorrectly thinking it works. And more notably, it is leading to bugs in array_api_strict itself (like #66) where you can forget to use ._array when passing an array to a NumPy function and it will still work (but maybe not in the same way, if that function also passes the dtype argument for instance).

asmeurer added a commit to asmeurer/array-api-strict that referenced this issue Oct 15, 2024
They were only working because we define __array__, which may be going away
(data-apis#67).
@asmeurer
Copy link
Member Author

I tested removing it and found some more issues where weren't using ._array like in #66. I pushed fixes for this to #68.

At the same time, if you remove __array__, then NumPy converts it to an object array:

>>> import numpy as np
>>> import array_api_strict as xp
>>> np.asarray(xp.asarray([10]))
array(Array([10], dtype=array_api_strict.int64), dtype=object)

Or we can explicitly make __array__ raise an exception, which would make np.asarray(<array_api_strict Array>) fail.

@asmeurer
Copy link
Member Author

A small hiccup here is that we implicitly use this logic in asarray for conversion from lists of 0-D arrays:

res = np.array(obj, dtype=_np_dtype, copy=copy)

I believe that should be supported in the standard, but the logic in asarray seems to check __array__ first. The docs don't mention any kind of fallback exception that can be used https://numpy.org/devdocs/user/basics.interoperability.html#the-array-method. Maybe there's some internal function in NumPy for converting lists-of-lists that we can reuse?

@asmeurer
Copy link
Member Author

Prototype of this without the fix to asarray at #69.

@mdhaber
Copy link

mdhaber commented Nov 8, 2024

If this has to be removed, I'd suggest that the error message at least suggest how such a conversion should be performed (e.g. from_dlpack?). This is causing 5.7k+ test failures in SciPy right now (see scipy/scipy#21828).

@rgommers
Copy link
Member

rgommers commented Nov 8, 2024

+1 to include a hint about how to fix. Writing that hint is hard - which is a problem.

There are more relevant comments on gh-69, so let's keep the discussion there I'd say.

@asmeurer
Copy link
Member Author

asmeurer commented Nov 8, 2024

So do you think this change should be reverted? I can also issue a warning in __array__ instead of removing it outright.

@rgommers
Copy link
Member

rgommers commented Nov 8, 2024

I can also issue a warning in __array__ instead of removing it outright.

A warning doesn't help, the SciPy test suite will turn those straight into errors. And there'll be either zero or a million warnings probably.

@asmeurer
Copy link
Member Author

asmeurer commented Nov 8, 2024

#91

@asmeurer
Copy link
Member Author

asmeurer commented Nov 8, 2024

Can SciPy not disable the specific warning in its tests? This seems to me to be the best way to deliver info to people without breaking stuff.

asmeurer added a commit to asmeurer/array-api-strict that referenced this issue Nov 8, 2024
Removing it caused issues for SciPy
(data-apis#67). I have left the
flag in to make it easy to remove it in the future.

I also considered raising a warning in __array__, but this is also difficult
to handle data-apis#91
@rgommers
Copy link
Member

Can SciPy not disable the specific warning in its tests? This seems to me to be the best way to deliver info to people without breaking stuff.

Quite a few projects run their CI with "turn warnings into errors", which is very useful to get signal and fix issues quickly (especially DeprecationWarning, which is hidden by default otherwise) - but the downside is that CI does break. So once a library like numpy or array-api-strict plans to do something that may take a while to fix, giving a heads up and waiting until it's fixed before doing a release with the warning is the better way to go about it.

I know that is a little annoying, but if this library is to be run in the test suites of the likes of SciPy, scikit-learn, Xarray, etc. then I don't see a way to avoid being careful about these things.

@rgommers
Copy link
Member

@lucascolley put up a draft PR at scipy/scipy#21835, so let's look at getting that landed first to ensure removing __array__ is actually doable without too much trouble.

@rgommers
Copy link
Member

Question: did anyone try implementing the buffer protocol in Python (possible since recently, see PEP 688)? Adding that to array-api-strict's array object makes sense in principle, and it then makes it trivially easy to delete __array__.

@lucascolley
Copy link
Contributor

Evgeni mentioned that in the community meeting yesterday, but would we not have to wait until we can drop Python 3.11?

@rgommers
Copy link
Member

It can be implemented conditionally, only for Python >=3.12. For SciPy I don't see a problem with only testing on >=3.12

@ev-br
Copy link
Contributor

ev-br commented Dec 12, 2024

#102 (comment)
3.12+ indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants