
Should implementing __imatmul__ really be enforced? #509

Open
seberg opened this issue Nov 16, 2022 · 10 comments
Labels: topic: Linear Algebra

@seberg
Contributor

seberg commented Nov 16, 2022

I was not around when this may have been discussed in numpy/numpy#21912. In-place matmul is a weird beast, but for NumPy it would be strange to make @= an out-of-place operator.
Because of this, NumPy doesn't define it currently, and I am unsure that it should. Would NumPy have the tight restriction that the shapes must fit and that we actually assign back to a (i.e. truly in-place, as with all other operators)?

That would make sense, but even then it would be slower and use just as much memory as the out-of-place version anyway.

@rgommers
Member

I think it's just a simple consistency issue. If all operators have in-place versions, then @ probably should too unless there is a very good reason not to. PEP 465 also defines it - and that came from the NumPy community.

Would NumPy have the tight restriction that the shapes must fit and that we actually assign back to a (i.e. truly in-place, as with all other operators)?

I'd say yes - that seems like the only consistent option for the semantics to me.

It won't be used much, and yes it won't be faster than the out-of-place version. But I don't think that's a good enough reason to not add it. The semantics are well-defined.
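
As an illustration of that strict semantics, here is a small NumPy sketch (out-of-place, just to show which shapes would be allowed for a truly in-place @=):

import numpy as np

a = np.ones((2, 3))
b_square = np.ones((3, 3))
b_rect = np.ones((3, 4))

print((a @ b_square).shape)  # (2, 3): same shape as a, so a truly in-place
                             # a @= b_square is well-defined
print((a @ b_rect).shape)    # (2, 4): the shape changes, so under the strict
                             # semantics a @= b_rect would have to raise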

@asmeurer
Member

I don't think it is required. The spec says in-place operators "may" be supported via __iop__ (https://data-apis.org/array-api/latest/API_specification/array_object.html#in-place-operators). If you don't implement it, it will fall back to the Python default, which uses __matmul__.

Also, I thought I remembered it only being required for instances where the left-hand array's shape wouldn't change (i.e., the right-hand operand is square), but I can't find that now.

@jakirkham
Member

Yeah just to back up Aaron's comment. From the Python docs:

If a specific method is not defined, the augmented assignment falls back to the normal methods. For instance, if x is an instance of a class with an __iadd__ method, x += y is equivalent to x = x.__iadd__(y). Otherwise, x.__add__(y) and y.__radd__(x) are considered, as with the evaluation of x + y.
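
A minimal sketch of that fallback, using a hypothetical class that only defines __matmul__ (so m @= m rebinds the name rather than mutating the object):

class Mat:
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # plain square matrix product, just for illustration
        n = len(self.rows)
        return Mat([[sum(self.rows[i][k] * other.rows[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)])

m = Mat([[1, 2], [3, 4]])
before = m
m @= m              # no __imatmul__, so Python evaluates m = m.__matmul__(m)
print(m is before)  # False -- a new object, nothing was updated in place
print(m.rows)       # [[7, 10], [15, 22]]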

@asmeurer
Member

Also https://data-apis.org/array-api/latest/design_topics/copies_views_and_mutation.html#copyview-mutability. When views are avoided, a = a @ b and a @= b are equivalent. I guess a difference would be that the latter, if implemented as __imatmul__, could raise an exception when the shape changes (the spec says that shouldn't be allowed, so it's out of scope). If a is a view or has views onto it, they aren't the same.

Should the spec say that if views are implemented then in-place operators should actually act in-place? To me it's not necessarily clear that it says that.

@seberg
Contributor Author

seberg commented Nov 17, 2022

@asmeurer, that is what tensorflow does, but I think it is misleading. In NumPy:

arr @= arr

is always an error. This ensures that we don't have the one odd "in-place" operator which isn't in-place at all. I do think that is important.

So the question is whether a library can expect arr @= arr to work in the complicated and very strict case where the shapes align well.
From the NumPy perspective, I really don't see the point in it, and since I really dislike the non-in-place fallback you get from not implementing it at all, we need to implement it as an error.

@seberg
Contributor Author

seberg commented Nov 17, 2022

OK, I am unsure that it is very useful to support it, but it seems OK.

There is one broadcasting corner case that needs to be decided on in NumPy (which must have been part of the reason why I was unsure this is actually useful):

>>> arr = np.array([1., 2.])
>>> arr @ arr
5.0
>>> arr @= arr
>>> arr
array([5., 5.])

is a bit strange. But at least in-place behavior may be useful for stacks of small matrices (in some future).

@asmeurer
Member

Can you explain what's going on in your example? How do you end up with [5., 5.]? In NumPy this gives an error, so is this an example you made up, or is this something that used to work?

@asmeurer
Member

Note that the PEP only mentions broadcasting for >2 dimensions, and we also say the same in the spec (because contracted dimensions should never broadcast).
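
For reference, a quick NumPy sketch of that distinction (batch dimensions broadcast, contracted dimensions never do):

import numpy as np

# leading (batch) dimensions broadcast as usual...
a = np.ones((5, 1, 2, 2))
b = np.ones((3, 2, 2))
print((a @ b).shape)  # (5, 3, 2, 2)

# ...but the contracted dimensions never broadcast, even when one of them is 1
c = np.ones((2, 3))
d = np.ones((1, 4))
try:
    c @ d
except ValueError as e:
    print(e)  # mismatch in the contracted (core) dimensions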

@jakirkham
Member

It's an example of what would happen if NumPy were to implement this (it doesn't work today).

This demonstrates the behavior it would have:

import numpy as np

a = np.array([1., 2.])

a1 = a.copy()
np.matmul(a1, a1, out=a1)
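# a1 @ a1 contracts to the scalar 5.0, which out=a1 then broadcasts
# back over the whole array, leaving a1 equal to array([5., 5.])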

The issue is the scalar result gets broadcast and assigned to the whole array.

In any event we are discussing potentially raising an error in this case (as it is odd behavior).

@asmeurer
Member

Oh, I understand now. I never realized that out allows "double broadcasting" like this. matmul is special because it disallows broadcasting in the last two dimensions. I don't know if it's possible in the ufunc machinery, but IMO it should be an error to let broadcasting to out.shape apply to the last two dimensions for matmul (and any other function like it that has non-broadcasting dimensions).

In general, contracted dimensions shouldn't broadcast, and to me it makes sense to extend this to the "out" broadcasting that happens in the ufunc (and especially so if that ends up being the semantics for @=).

The spec says:

An in-place operation must not change the data type or shape of the in-place array as a result of Type Promotion Rules or Broadcasting.

An in-place operation must have the same behavior (including special cases) as its respective binary (i.e., two operand, non-assignment) operation. For example, after in-place addition x1 += x2, the modified array x1 must always equal the result of the equivalent binary arithmetic operation x1 = x1 + x2.

So it would already disallow this behavior. But it's probably worth adding a note clarifying what is required for __imatmul__ since it is different from the rest of the operators.
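
As a rough sketch of what such a note could require (a hypothetical NumPy-backed wrapper, not any existing library's implementation):

import numpy as np

class Array:
    def __init__(self, data):
        self._data = np.asarray(data)

    def __matmul__(self, other):
        return Array(self._data @ other._data)

    def __imatmul__(self, other):
        result = self._data @ other._data
        if result.shape != self._data.shape or result.dtype != self._data.dtype:
            # in-place ops must not change the shape or dtype of the target
            raise ValueError("in-place matmul must not change shape or dtype")
        self._data[...] = result  # write back into the existing buffer
        return self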

@rgommers rgommers changed the title Should __imatmul__ be really be enforced? Should implementing __imatmul__ really be enforced? Mar 8, 2023
@kgryte kgryte added this to the v2023 milestone Jun 29, 2023
@kgryte kgryte modified the milestones: v2023, v2024 Feb 19, 2024
@kgryte kgryte modified the milestones: v2024, v2025 Jan 23, 2025