
Implicit broadcast harmful? #135

Closed
baggepinnen opened this issue Feb 14, 2017 · 12 comments

@baggepinnen (Contributor) commented Feb 14, 2017

I'm posting this as an issue but intend it as a suggestion you might consider.

In Julia, operators such as + and - that work element-wise on matrices and vectors do not automatically broadcast when the dimensions of the two operands differ; for that, the dot syntax (e.g., .+) must be used. I find this explicit broadcasting a nice safety feature.

TensorFlow operations, such as the ones above, use implicit broadcasting: if the dimensions do not match, broadcasting is attempted.

I'm proposing the following: in TensorFlow.jl, the dot syntax automatically calls the corresponding TensorFlow operation, but the dot-free syntax throws an error if the sizes of the operands do not match.

Any thoughts on this? :)
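
A minimal sketch of the proposed semantics; shape_of and elementwise_add are hypothetical placeholder names, not TensorFlow.jl API:

    # Hypothetical sketch only: shape_of and elementwise_add are placeholders.
    function Base.:+(a::AbstractTensor, b::AbstractTensor)
        shape_of(a) == shape_of(b) || error("size mismatch: use .+ to broadcast explicitly")
        return elementwise_add(a, b)        # same-size element-wise add
    end

    # The dot form opts in to TensorFlow's implicit broadcasting rules.
    Base.broadcast(::typeof(+), a::AbstractTensor, b::AbstractTensor) = elementwise_add(a, b)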

@oxinabox (Collaborator) commented:

You are correct, but .+ and .* etc. are all going to break in 0.6 in a way I don't (yet) see how to fix. Broadcasting will become automatic, because dot calls fuse everywhere: the meanings of .+ and .* will be given by the language itself, derived from the definitions of + and * on the elements of the Arrays.

But since Tensors are not Julia arrays, just representations of nodes in the computation graph, I'm not sure that will work out at all, unless some very clever stuff can be done.

It may be that .+ and .* etc. will need to be renamed, e.g. to ⊕ and ⊗.
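
For concreteness, here is the 0.6 behaviour on plain Julia arrays that makes per-operator .+ / .* method definitions unreachable (nothing TensorFlow-specific; the values are only illustrative):

    a = [1, 2, 3]
    b = 10
    a .+ b              # lowers to a broadcast call; there is no separate `.+` function to add methods to
    broadcast(+, a, b)  # the call the compiler actually emits, giving [11, 12, 13]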

@malmaud (Owner) commented Feb 14, 2017

I think the best we can really do with the changes in 0.6 is to define the scalar-form operators (+, -, etc.).

Longer-term, I'm thinking of having a macro like @tensorflow where all the code in its block is interpreted in a special way appropriate for making a computation graph, which can include special non-standard semantics for the scalar and broadcasting operators on tensors.
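
A sketch of what those scalar-form definitions could look like; the op names Ops.add and Ops.sub are assumptions here, not necessarily TensorFlow.jl's actual generated names:

    # Sketch only: the exact op names are assumptions.
    Base.:+(t1::AbstractTensor, t2::AbstractTensor) = Ops.add(t1, t2)
    Base.:-(t1::AbstractTensor, t2::AbstractTensor) = Ops.sub(t1, t2)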

@oxinabox (Collaborator) commented:

cf. FluxML/Flux.jl#31

@stevengj commented:

Why couldn't you solve this by defining broadcast(f::Function, ...) operations on TensorFlow containers, or mixtures of those containers and scalars?

@oxinabox (Collaborator) commented:

So rather than defining

    Base.broadcast(::typeof(*), n1::AbstractTensor, n2::AbstractTensor) = multiply(n1, n2)
    Base.broadcast(::typeof(*), n1::AbstractTensor, n2) = multiply(n1, tf_promote(n1, n2))
    Base.broadcast(::typeof(*), n1, n2::AbstractTensor) = multiply(tf_promote(n2, n1), n2)

and

    @eval Base.broadcast(::typeof($sym), t1::AbstractTensor, t2::AbstractTensor) = $func(t1, t2)
    @eval Base.broadcast(::typeof($sym), t1::AbstractTensor, t2) = $func(t1, Tensor(t2))
    @eval Base.broadcast(::typeof($sym), t1, t2::AbstractTensor) = $func(Tensor(t1), t2)

just define broadcast once for all functions, as

    Base.broadcast(func::Function, t1::AbstractTensor, ts...; kwargs...) = func(t1, Tensor.(ts)...; kwargs...)
    Base.broadcast(func::Function, t1::AbstractTensor, t2::AbstractTensor, ts...; kwargs...) = func(t1, t2, Tensor.(ts)...; kwargs...)
    Base.broadcast(func::Function, t1, t2::AbstractTensor, ts...; kwargs...) = func(Tensor(t1), t2, Tensor.(ts)...; kwargs...)

@malmaud (Owner) commented May 25, 2017

I'm not sure that's quite what we want: we need x*y to call Ops.matmul(x, y) and x.*y to call Ops.mul(x, y).
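
A sketch of the dispatch being described, using the op names from this comment (signatures illustrative only):

    Base.:*(x::AbstractTensor, y::AbstractTensor) = Ops.matmul(x, y)                   # x * y  -> matrix multiply
    Base.broadcast(::typeof(*), x::AbstractTensor, y::AbstractTensor) = Ops.mul(x, y)  # x .* y -> element-wise multiply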

@stevengj commented May 25, 2017

Can't you define a generic broadcast by defining a custom op?

Basically, .* means fusing broadcast now in Julia: when the user writes @. x += x^2 - 3*y + f(y), they get a syntactic guarantee that it is a single broadcast! call. So if you want something different, you should use a different operator or function name.
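
A small illustration of that fusion guarantee on plain Julia arrays (f is just a stand-in element-wise function):

    x = rand(3); y = rand(3)
    f(v) = 2v                          # stand-in for any element-wise function
    @. x += x^2 - 3*y + f(y)           # the whole right-hand side fuses into one in-place broadcast over x and y
    # conceptually equivalent to:
    # broadcast!((xi, yi) -> xi + (xi^2 - 3*yi + f(yi)), x, x, y)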

@malmaud (Owner) commented May 25, 2017

Not quite sure I see how a custom TensorFlow op helps with the situation: there is already one TensorFlow op for matrix multiplication and another for element-wise multiplication; it's just a matter of the syntax for exposing them to users.

Users probably expect x.*y to translate to element-wise multiplication whether x happens to be a Julia Array or a TensorFlow Tensor, but if that's not possible, we'll just have to educate users about the situation with a note in the docs. It was a design goal of TensorFlow.jl to let users be, as much as possible, agnostic of the distinction between Arrays and Tensors, so it just feels a little unfortunate.

@oxinabox (Collaborator) commented:

It is going to break this code.

Which is sad, because that is some beautiful duck typing: the function can be passed t and u as Arrays, and it will compute the result directly; or it can be passed Tensors, and it will return a node in the computation graph that gets executed when you call run(sess, ...).

Being able to duck-type Tensors as Arrays is one of the most wonderful features, because functions can be written without thinking of TensorFlow at all, and then just work with it.
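
A hedged illustration of the duck-typing pattern being described (not the actual code from the package in question):

    # Works for plain Arrays today; with Tensors the same body would build graph nodes,
    # provided the dot operators and sum still dispatch to TensorFlow ops.
    loss(t, u) = sum((u .- t.^2).^2)

    loss(collect(0.0:0.1:1.0), rand(11))   # Arrays: evaluates to a number immediately
    # loss(t_tensor, u_tensor)             # Tensors: would return a graph node, evaluated via run(sess, ...)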

@malmaud (Owner) commented May 25, 2017

I guess a workaround for now is to introduce an operator like ⊛ that is element-wise multiplication for both tensors and arrays and use that in TensorFlowDiffEq.
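
A sketch of that workaround; ⊛ is only a proposal here, and Ops.mul is the element-wise op name used earlier in this thread:

    ⊛(a::AbstractArray, b::AbstractArray) = a .* b            # plain Arrays: ordinary element-wise multiply
    ⊛(a::AbstractTensor, b::AbstractTensor) = Ops.mul(a, b)   # Tensors: the element-wise op named earlier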

@stevengj commented May 25, 2017

> Not quite sure I see how a custom TensorFlow op helps with the situation

Because then you can define broadcast(f::Function, ...) on Tensors and it will work for any f (since f will translate into a custom op that calls back to Julia). Then you will get all of your "dot ops" for free, and @. (x + 3x^2) * y will become a single fused elementwise operation. (And it will work, and fuse, for both Tensors and Arrays.)

@oxinabox (Collaborator) commented:

Closing as non-actionable/already fixed
