# C `Elemwise` implementation doesn't broadcast variables #335
The Python validation code (i.e. starting here) is checking the inputs' shapes against the broadcastable properties of their corresponding symbolic variables. If that's the correct interpretation, then this whole thing should've failed much earlier, when the underlying graph was constructed.

Regardless, in this situation we have concrete (i.e. ground) shape values for the inputs (because we have concrete NumPy inputs), so I do not understand the point of using the symbolic broadcast information at this stage.
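For what it's worth, here's a minimal sketch (plain NumPy, not Aesara code) of how the ground shapes alone already determine broadcastability:

```python
import numpy as np

# With concrete inputs, the shapes tell us everything about broadcastability;
# no symbolic flags are needed:
print(np.broadcast_shapes((4,), (1,)))  # (4,): the second input broadcasts
# np.broadcast_shapes((4,), (3,))      # would raise ValueError: incompatible shapes
```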
OK, so this can be easily fixed for the Python implementation of `Elemwise`.

This whole situation brings up some important aspects of Aesara/Theano's design, though. The problem we're seeing is really caused by our use of a symbolic broadcastable pattern that only partially constrains the concrete inputs a `TensorType` will accept.

Here's an illustration:

```python
import numpy as np

import aesara
import aesara.tensor as at
# The input is broadcastable in both dimensions, but the symbolic type isn't
>>> at.TensorType(aesara.config.floatX, [False, False]).filter(np.array([[1]]))
array([[1.]])
# The input is broadcastable in the first dimension, but the symbolic type isn't
>>> at.TensorType(aesara.config.floatX, [False, False]).filter(np.array([[1, 2]]))
array([[1., 2.]])
# The input is broadcastable in both dimensions, but the symbolic type is only
# broadcastable in the second
>>> at.TensorType(aesara.config.floatX, [False, True]).filter(np.array([[1]]))
array([[1.]])
# The input is not broadcastable in the second dimension, but the symbolic type
# is
>>> at.TensorType(aesara.config.floatX, [False, True]).filter(np.array([[1, 2]]))
...
TypeError: ('Non-unit value on shape on a broadcastable dimension.', (1, 2), (False, True))
```

In other words, when a symbolic type is broadcastable in a given dimension, the inputs must be as well; but, if the symbolic type isn't, then it doesn't matter whether or not the inputs are broadcastable in that dimension.

The problem with `Elemwise` is that it validates the runtime inputs against these symbolic flags even when the concrete shapes are available. This is a pretty big design and/or implementation problem, and, apparently, it's been around for a while (at least since the last release of the original Theano).

I'm going to say that this is yet another good reason to push for the changes in #312 sooner rather than later. Using Cython, we could fix this problem much more quickly. Plus, we could use the NumPy C API with minimal effort, and more rapidly iterate on comparisons between the vanilla CPython API and the NumPy API (or anything else, really).
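To make the failure mode concrete, here's a small self-contained sketch, contrasting flag-driven validation with NumPy's shape-based rule. This is not Aesara's actual code; `validate_with_flags` is a hypothetical stand-in for the behavior described above:

```python
import numpy as np

def validate_with_flags(shapes, broadcast_patterns):
    """Flag-driven validation: a runtime length-1 dimension only counts as
    broadcastable when the corresponding symbolic flag is True; all
    unflagged dimensions must match exactly."""
    for dims, flags in zip(zip(*shapes), zip(*broadcast_patterns)):
        unflagged = {d for d, f in zip(dims, flags) if not f}
        if len(unflagged) > 1:
            raise ValueError(f"Input dimension mismatch: {dims}")

try:
    # The symbolic flags say "not broadcastable", so the runtime 1 is rejected...
    validate_with_flags([(4,), (1,)], [(False,), (False,)])
except ValueError as exc:
    print(exc)  # Input dimension mismatch: (4, 1)

# ...even though the ground shapes broadcast just fine:
print(np.broadcast_shapes((4,), (1,)))  # (4,)
```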
N.B.: While working on #336, I removed the input validation within the C implementation so that it would allow "unexpected" broadcastable inputs (i.e. the symbolic type says it's not broadcastable, but the input is). The result was an unbroadcasted version of the correct result. There may be a way to pre/post-broadcast the inputs/outputs of an `Elemwise`, though; we could use NumPy's own broadcasting facilities for that.
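As a sketch of the pre-broadcast idea (using `np.broadcast_arrays` as a stand-in for whatever the real mechanism would be):

```python
import numpy as np

a = np.array([0.0, 0.5, 1.0, -1.0])
b = np.array([0.0])  # broadcastable at runtime, regardless of any symbolic flag

# Broadcast the concrete inputs up front, so the elementwise kernel never
# has to consult symbolic broadcast information:
a_b, b_b = np.broadcast_arrays(a, b)
out = a_b - b_b  # shape (4,)
```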
Or we add a broadcasting step to the C implementation itself (e.g. via the NumPy C API)?
I haven't looked into it, but my impression is that it might actually be difficult because, as far as I've seen, none of the current C framework uses the NumPy C API in a clear-cut, high-level way. For instance, there are some aspects of the symbolic typing and input handling at the C level that make me question whether or not the existing Python implementation is even being used correctly (e.g. I'm not sure whether or not views are actually being copied at some point).

I (partially) take that back; there is an instance or two: e.g. here, where the NumPy C API is used to broadcast in the relevant way.
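For reference, `np.broadcast` is the Python-level counterpart of that C API machinery (a multi-iterator created with `PyArray_MultiIterNew`, i.e. a `PyArrayMultiIterObject`):

```python
import numpy as np

# np.broadcast wraps the same multi-iterator the NumPy C API exposes as
# PyArrayMultiIterObject / PyArray_MultiIterNew:
it = np.broadcast(np.ones((2, 1)), np.ones((1, 3)))
print(it.shape)  # (2, 3): the broadcasted output shape
```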
This appears to be another example of the same C implementation limitations:

```python
import numpy as np
import aesara
import aesara.tensor as at
x = at.scalar()
y = at.isclose(at.as_tensor([0, 0.5, 1, -1]), x)
y_fn = aesara.function([x], y)
aesara.dprint(y_fn, print_type=True)
# Elemwise{Composite{AND(LE(Abs((i0 - i1)), i2), Invert(OR(IsNan(i3), IsInf(i3))))}} [id A] <TensorType(bool, vector)> '' 3
# |TensorConstant{[ 0. 0.5.. 1. -1. ]} [id B] <TensorType(float64, vector)>
# |InplaceDimShuffle{x} [id C] <TensorType(float64, (True,))> '' 0
# | |<TensorType(float64, scalar)> [id D] <TensorType(float64, scalar)>
# |Elemwise{Composite{(i0 + (i1 * Abs(i2)))}} [id E] <TensorType(float64, (True,))> '' 2
# | |TensorConstant{(1,) of 1e-08} [id F] <TensorType(float64, (True,))>
# | |TensorConstant{(1,) of 1e-05} [id G] <TensorType(float64, (True,))>
# | |InplaceDimShuffle{x} [id C] <TensorType(float64, (True,))> '' 0
# |Rebroadcast{(0, False)} [id H] <TensorType(float64, vector)> '' 1
# |InplaceDimShuffle{x} [id C] <TensorType(float64, (True,))> '' 0
y_fn(0)
# ValueError: Input dimension mismatch. (input[0].shape[0] = 4, input[3].shape[0] = 1)
# Apply node that caused the error: Elemwise{Composite{AND(LE(Abs((i0 - i1)), i2), Invert(OR(IsNan(i3), IsInf(i3))))}}(TensorConstant{[ 0. 0.5.. 1. -1. ]}, InplaceDimShuffle{x}.0, Elemwise{Composite{(i0 + (i1 * Abs(i2)))}}.0, Rebroadcast{(0, False)}.0)
# Toposort index: 3
# Inputs types: [TensorType(float64, vector), TensorType(float64, (True,)), TensorType(float64, (True,)), TensorType(float64, vector)]
# Inputs shapes: [(4,), (1,), (1,), (1,)]
# Inputs strides: [(8,), (8,), (8,), (8,)]
# Inputs values: [array([ 0. , 0.5, 1. , -1. ]), array([0.]), array([1.e-08]), array([0.])]
# Outputs clients: [['output']]
```

Manually broadcasting the arguments works around the error (see the sketch after the graph below).

This arose as an issue during the implementation of aesara-devs/aeppl#110.

Also, running this same example on the #711 branch produces the following graph (and no errors):

```
Elemwise{Composite{AND(LE(Abs((i0 - i1)), i2), Invert(OR(i3, i4)))}} [id A] <TensorType(bool, (None,))> ''   4
|TensorConstant{[ 0. 0.5.. 1. -1. ]} [id B] <TensorType(float64, (4,))>
|InplaceDimShuffle{x} [id C] <TensorType(float64, (1,))> '' 0
| |<TensorType(float64, ())> [id D] <TensorType(float64, ())>
|Elemwise{Composite{(i0 + (i1 * Abs(i2)))}} [id E] <TensorType(float64, (1,))> '' 3
| |TensorConstant{(1,) of 1e-08} [id F] <TensorType(float64, (1,))>
| |TensorConstant{(1,) of 1e-05} [id G] <TensorType(float64, (1,))>
| |InplaceDimShuffle{x} [id C] <TensorType(float64, (1,))> '' 0
|Elemwise{isnan,no_inplace} [id H] <TensorType(bool, (1,))> '' 2
| |InplaceDimShuffle{x} [id C] <TensorType(float64, (1,))> '' 0
|Elemwise{isinf,no_inplace} [id I] <TensorType(bool, (1,))> '' 1
 |InplaceDimShuffle{x} [id C] <TensorType(float64, (1,))> '' 0
```

I'm a little curious as to why the new shape inference avoids the error here.

Update: Yeah, shape/broadcastable inference is the problem in this case; the check that produces the error is only added when the input type is not broadcastable (see here), so the graph produced on the #711 branch simply doesn't include it.
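Here's a sketch of the manual-broadcast workaround mentioned above, assuming `at.broadcast_to` (a hypothetical choice of helper; anything that materializes the full shape would do):

```python
import aesara
import aesara.tensor as at

x = at.scalar()
# Broadcast x by hand so every Elemwise input already has the full shape:
y = at.isclose(at.as_tensor([0, 0.5, 1, -1]), at.broadcast_to(x, (4,)))
y_fn = aesara.function([x], y)
y_fn(0)  # array([ True, False, False, False])
```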
I'm seeing a very weird error in `Elemwise`.

First, here's a basic broadcasting operation in NumPy:
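A minimal sketch of such an operation, assuming `x` is a matrix and `m` a vector (matching the names referenced below):

```python
import numpy as np

x = np.ones((5, 3))
m = np.array([1.0, 2.0, 3.0])

# m's single dimension is broadcast across x's first dimension:
z = x - m
print(z.shape)  # (5, 3)
```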
In Aesara, here's the equivalent operation using `TensorConstant`s:
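A sketch of the constant version, assuming the same `x` and `m` (the exact `dprint` output varies by version):

```python
import numpy as np
import aesara
import aesara.tensor as at

x = np.ones((5, 3))
m = np.array([1.0, 2.0, 3.0])

z_at = at.as_tensor(x) - at.as_tensor(m)
aesara.dprint(z_at)
# Roughly:
# Elemwise{sub,no_inplace} [id A] ''
#  |TensorConstant{...} [id B]
#  |InplaceDimShuffle{x,0} [id C] ''
#    |TensorConstant{...} [id D]
```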
The resulting graph is a simple `Elemwise` for the subtraction `Op`, as expected. There's also an `InplaceDimShuffle` that adds a broadcastable dimension to the second argument, so that both inputs have the same number of dimensions. This `InplaceDimShuffle` is equivalent to `np.expand_dims(m, 0)`, which, when subtracted from `x`, yields the same value as `z`.

So far, everything is good, because the Aesara result matches the NumPy one.
Now, when we replace the `TensorConstant`s with generic `TensorVariable`s, we get a strange error:
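A hypothetical reconstruction of that step (generic inputs, with a runtime-broadcastable second argument):

```python
import numpy as np
import aesara
import aesara.tensor as at

x_at = at.matrix("x")
m_at = at.vector("m")
z_fn = aesara.function([x_at, m_at], x_at - m_at)

# A length-1 vector broadcasts fine in NumPy, but m_at's symbolic type is
# non-broadcastable, so Elemwise's validation rejects the concrete input:
z_fn(np.ones((5, 3)), np.array([1.0]))
# ValueError: Input dimension mismatch. (input[0].shape[1] = 3, input[1].shape[1] = 1)
```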
We can emulate this issue using the Python implementation of `Elemwise`, as well:
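One way to force the Python implementation is via the `"py"` linker (a sketch, continuing the hypothetical example above):

```python
import numpy as np
import aesara
import aesara.tensor as at

x_at = at.matrix("x")
m_at = at.vector("m")
py_mode = aesara.compile.Mode(linker="py")
z_fn_py = aesara.function([x_at, m_at], x_at - m_at, mode=py_mode)

z_fn_py(np.ones((5, 3)), np.array([1.0]))
# ValueError raised by the shape check in Elemwise.perform
```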
From this output, we can see that this erroneous error is apparently the result of some bad input validation code in `Elemwise`.

The same is true for the C implementation, although that's a little less apparent from the output. In this case, the C code for this validation step is here.