Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array+Scalar elementwise operations #6417

Closed
goszlanyi opened this issue Apr 4, 2014 · 35 comments
Closed

Array+Scalar elementwise operations #6417

goszlanyi opened this issue Apr 4, 2014 · 35 comments

Comments

@goszlanyi
Copy link

Recently (when? prerelease+2486?),
Array+Scalar and Array-Scalar operations were deprecated,
and one has to use the .+ and .- operators instead.

I hate this change and see it only as visual clutter.
It is also inconsistent, as one can still happily use
the Array*Scalar and Array/Scalar operations.

Why is one forbidden while the other one is allowed?

julia> [1,2,3]+10
WARNING: A::Array + x::Number is deprecated, use A .+ x instead.
 in depwarn at deprecated.jl:36
3-element Array{Int32,1}:
 11
 12
 13
julia> [1,2,3]/10
3-element Array{Float64,1}:
 0.1
 0.2
 0.3
@lindahua
Copy link
Contributor

lindahua commented Apr 4, 2014

There was a very long debate at JuliaLang/LinearAlgebra.jl#75, where we got to this consensus.

The decision was based on deep consideration about the consistency in both element-wise operations and algebraic computation.

@goszlanyi
Copy link
Author

I am shocked by that consensus, especially that in the discussion I have seen the natural argument:

*I guess "+, -, , / are elementwise when one argument is a scalar" is a reasonable and easy-to-remember rule.

@goszlanyi
Copy link
Author

Some users downloading only binaries will see the result of this decision only today.

@blakejohnson
Copy link
Contributor

I would agree that this is being unnecessarily pedantic about element-wise operators.

@nalimilan
Copy link
Member

More precisely, the arguments for the change were summarized here: JuliaLang/LinearAlgebra.jl#75

Maybe this is too strict after all. I don't think any concrete cases where the old behavior was a problem were found.

@johnmyleswhite
Copy link
Member

Let me put in a vote for loving this rule.

@jiahao
Copy link
Member

jiahao commented Apr 5, 2014

This goes back to the eternal question about whether Array{T,2} is a two-dimensional container of Ts or whether it is a Matrix imbued with linear algebraic operations. As the latter, only multiplication by a scalar makes sense (and division by a scalar as multiplying by its reciprocal): addition and subtraction do not. As the former, all four elementary operations make sense if one uses broadcasting semantics to perform the appropriate scalar operation on each array element.

My understanding of the current consensus is that we decided to make clear the distinction between broadcasting operators (.+, etc.) which are short forms for scalar operations on each element, and the ordinary operators which work on the scale of the Array itself as a single algebraic entity, and thus should support the semantics of linear algebra only.

@goszlanyi
Copy link
Author

@jiahao, Thank you for summarizing the arguments.

But can you give an example when the former, widespread, and no-surprise
use of Array op Scalar operations caused any problem for Julia?

@blakejohnson
Copy link
Contributor

matrix+scalar certainly doesn't have an obvious linear algebra meaning. So, the question is really are there behaviors we can adopt that increase usability? Our broadcasting behavior with matrix .+ vector also isn't a valid linear algebra operation. But the automatic broadcasting is hugely convenient and wonderful.

@kmsquire
Copy link
Member

kmsquire commented Apr 5, 2014

From @toivoh's comment in JuliaLang/LinearAlgebra.jl#75:

The argument that finally swayed Stefan was that then * (matrix multiplication) distributes over + (non-broadcasting addition), while .* distributes over .+. This should definitely lead to fewer surprises.

This seems to be the key argument for this behavior. To be perfectly clear,

(A+B)*x == A*x + B*x

when A and B are matrices and x is a vector, whereas

(A+1)*x != A*x + 1*x

if we the 1 is broadcast and added to each element of A.

(And similarly,

(A.+1).*B == A.*B .+ 1.*B

if we explicitly specify the broadcast operator.)

Of course, for the last, you don't need to use .+ to add the matrices, because they already have the same shape, so

A.*B .+ 1.*B == A.*B + 1.*B

@jiahao
Copy link
Member

jiahao commented Apr 5, 2014

The most troublesome example is a / B for scalar a and matrix B. As a broadcasting operation, a ./ B makes sense, but a / B is not well defined. Is it the broadcast operation or a * inv(B)? What if B is singular? And so on...

@jiahao
Copy link
Member

jiahao commented Apr 5, 2014

Similarly it is not clear if A + b is meant to be A .+ b, which would be functionally like A + b E (E being a matrix of all ones), or A + bI (I being the identity matrix).

@blakejohnson
Copy link
Contributor

@jihao I completely agree that division quickly leads to trouble. I disagree entirely that A+b should ever imply A+bI.

@blakejohnson
Copy link
Contributor

Also, parenthesis in programming languages (as opposed to mathematic expression) are often not movable without consequences. Our broadcasting behavior similarly breaks with some like: (A.*v)*u != A.*(v*u) where u and v are vectors and A is a matrix.

@goszlanyi
Copy link
Author

@jiahao The only problematic case is Scalar/Array, I agree.
(This is also handled by MATLAB separately.)
All other cases are natural as elementwise operations.

@kmsquire
Copy link
Member

kmsquire commented Apr 5, 2014

I disagree entirely that A+b should ever imply A+bI

I agree that it shouldn't. But if it did, then

(A+2)*x == Ax + 2I*x == Ax + 2x

which is quite nice, mathematically. Hence, the need to be clear about what A+2 actually means.

Since Julia is targeting a mathematical audience, I don't really have a problem with it being more mathematically consistent than other languages, even if that means the operators are slightly more pedantic.

I have vociferously argued the opposite view on other issues--that consistency with other languages or convenience was more important than mathematical consistency--when I first started using the language, but I've come to appreciate the subtle distinctions over time. But there is a need for a balance.

@kmsquire
Copy link
Member

kmsquire commented Apr 5, 2014

(A.*v)*u != A.*(v*u)

I don't think there is an expectation that this should be equivalent, since there's no mathematical relation between .* and *. But it is useful to expect that

(a+b)*c == ac + bc

for arbitrary a, b, and c.

@andreasnoack
Copy link
Member

All other cases are natural as elementwise operations.

I think this is a question about habit. I you are used to write Matrix+Number then it feels natural to you, but from a math point of view the operation feels unnatural.

But there is a need to have a balance.

Here I think it should be noted the burden is just a matter of typing a . to get the operation you want. I don't think that is a big price to pay for consistency.

@blakejohnson
Copy link
Contributor

I suppose I am fine with that so long as after matrix+scalar falls off the deprecation list, it becomes an error suggesting the .+ syntax. Otherwise, we will have a lot of confused new users.

@ivarne
Copy link
Member

ivarne commented Apr 5, 2014

@blakejohnson I totally agree: see d2d7c32

In my opinion the fundamental problem is that linear algebra called dibs on the operators without . (to be consistent with matlab), and lots of people don't think in linear algebra terms. When that choice has been made, I think we should try our best to keep users aware about the difference between the dotted operators and the undotted operators.

@timholy
Copy link
Member

timholy commented Apr 5, 2014

Overall I support the change. But it was a bit painful, in no small measure because our backtraces are often not as useful as one could hope. I suspect I am not alone in having spent a couple of hours over the last week chasing down missing .s in various test scripts.

But going forward it's no big deal.

@goszlanyi
Copy link
Author

I am still chasing the missing .s, and I still think it is not worth it.

@JeffBezanson
Copy link
Member

Although I find Gabor's quoting of my own argument highly incisive and compelling, I'm fine with the current behavior. Functions should mean something; they shouldn't just become elementwise because it is convenient or (in many systems) more efficient.

@goszlanyi
Copy link
Author

Jeff, I really did not mean to hurt anybody.
All I wanted to show that this change will come as a shock for average users,
but of course those who come later will more easily adapt to this change.
With all my depression, I am closing.

@StefanKarpinski
Copy link
Member

I don't think @JeffBezanson was offended at all, @GaborOszlanyi. I think we're all happy to have this kind of measured discussion about issues like this – keeping in mind that one may not always win.

@goszlanyi
Copy link
Author

Thanks Stefan.
Now that I have found all occurrences to be changed, I am less upset.
Also see #6438.

@StefanKarpinski
Copy link
Member

I actually think the fact that this is a somewhat painful change is indicative that this is important. The two senses of plus are different and code that doesn't distinguish them is, in some sense, fundamentally ambiguous.

@timholy
Copy link
Member

timholy commented Apr 6, 2014

While I support the .+ issue, I'm not sure I agree with that logic. Suppose we decided to change the parser so that indexing of arrays was done with A<7>. That would be a painful change, but it would also be senseless.

@johnmyleswhite
Copy link
Member

Couldn't you do A[...] -> A<...> using a simple search-and-replace?

@timholy
Copy link
Member

timholy commented Apr 6, 2014

No, because comprehensions would bite you. This issue is the same (and indeed slightly worse), because your editor can't reliably tell when a quantity is a scalar.

@ivarne
Copy link
Member

ivarne commented Apr 6, 2014

@timholy With that change you could write a simple parser that automatically fixed your codebase. With + and .+, you would have to actually parse and infer the types to do a somewhat automatic conversion. That makes it more painful, but also a more useful change, because you get better better less ambiguous code.

@StefanKarpinski
Copy link
Member

Changing a[k] to a<k> everywhere is only as hard as parsing. Indexing expressions and comprehensions can be distinguished purely syntactically – the parser does exactly that. The analogous change with indexing would be if we realized that there are really two very different senses of indexing into something and we really should be distinguishing them. Then you would have to pore through all uses of a[k] and decide which one was really going on – something which is inherently difficult, not just a matter of syntax – and if there were two different senses of indexing, I would argue, just as valuable.

@timholy
Copy link
Member

timholy commented Apr 6, 2014

if there were two different senses of indexing

To get away from artificial examples like A<...>, I would say there are: #5949. Perhaps this issue is an argument not to rely too heavily on multiple dispatch, and that there too we should use two different function names to distinguish slicing vs dimension-preserving indexing.

@StefanKarpinski
Copy link
Member

I actually just did this by accident and the new behavior annoyed / saved me – at first I was annoyed, then I realized that what I'd written was a mistake and the warning had saved prevented me from messing up. What I did was write a polynomial expression and then apply it to matrices:

julia> A = randn(3,3)
3x3 Array{Float64,2}:
  1.04366    0.724268   0.187964
  0.347349   1.24317    0.84459
 -1.14403   -0.295596  -1.46345

julia> A^2 - 2A + 1
WARNING: A::Array + x::Number is deprecated, use A .+ x instead.
 in + at deprecated.jl:26
3x3 Array{Float64,2}:
 0.0384445  1.15218     1.15688
 0.133397   0.0610497  -0.809932
 3.66563    0.827721    5.60388

Oops. Did I want to add a 3x3 matrix of ones at the end or did I want to add a 3x3 identity matrix? Good question. And that's why it's a good thing that this behavior will be undefined and force you to choose between these very different meanings:

julia> A^2 - 2A .+ 1
3x3 Array{Float64,2}:
 0.0384445  1.15218     1.15688
 0.133397   0.0610497  -0.809932
 3.66563    0.827721    5.60388

julia> A^2 - 2A + I
3x3 Array{Float64,2}:
  0.0384445   0.152184    0.156877
 -0.866603    0.0610497  -1.80993
  2.66563    -0.172279    5.60388

One thing I think we should do is allow UniformScaling operators to work with scalars so that an expression like x^2 - 2x + I can be applied to both scalar and matrix x generically.

@StefanKarpinski
Copy link
Member

See #6472, allowing polynomials using I for constant terms to apply to scalar and matrix arguments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests