broadcast should not drop zero-dimensional results to scalars #28866

ZeppLu · 2018-08-24T08:52:37Z

julia> map(typeof, (zeros(), ones(), fill(3.14)))
(Array{Float64,0}, Array{Float64,0}, Array{Float64,0})

julia> typeof(ones() .* fill(3.14))
Float64

julia> typeof(zeros() + ones())  # in fact not only after broadcast
Float64

Is it a bug, or a feature?

mbauman · 2018-08-24T15:08:36Z

This is an intentional feature. Zero-dimensional arrays act like scalars in broadcast.

ZeppLu · 2018-08-24T16:15:21Z

OK seems like it's a feature.

But I still wonder why typeof(zeros() + ones()) == Float64, note there's no broadcast. Intuitively speaking, the sum of two arrays should always be an array as well, shouldn't it?

mbauman · 2018-08-24T16:17:51Z

Oh my, you're right — I missed your third example there. That indeed is a bug.

mbauman · 2018-08-24T16:25:45Z

In short: we have long depended upon broadcasting to implement a number of array functions that implicitly work elementwise. We just add additional size checks. Unfortunately we now also need to add a zero-dimensional check, too.

Just around the definition of +, I can see that this also impacts -, conj, real, and imag.

ZeppLu · 2018-08-24T16:37:41Z

IMHO, we should make things consistent here, if ones() .+ ones() returns a scalar while ones() + ones() returns an array, people will get confused.

My suggestion is, broadcast SHOULD treat 0d arrays as scalars, but SHOULD NOT implicitly cast them to scalars in the final result.

mbauman · 2018-08-24T16:44:55Z

Yes, I'm in complete agreement. That's the bug, and it needs to be fixed.

Edit (a year later): I'm confused by my comment. I'm pretty sure I didn't mean to say what this says. I just wanted to fix the + behavior but keep broadcast the same.

andyferris · 2018-08-25T21:08:52Z

The casting of 0D arrays to scalars at the end of broadcast has been bothering me too.

yurivish · 2019-01-11T03:05:48Z

I just noticed this too while working to update the tests for https://github.com/mbauman/InvertedIndices.jl to 1.0.

Specifically,

    A = fill(1)
    @test A[Not(A.==1)] == []

fails with

Test threw exception
  Expression: A[Not(A .== 1)] == []
  ArgumentError: invalid index: true of type Bool

since fill(1) .== 1 returns true on 1.0.

This is a simple workaround for the handful of elementwise operations that are defined on arrays _without_ the need for explicit broadcast but use broadcasting (with an extra shape check) in their implementation. These were the only affected cases I could find.

bzinberg · 2019-05-23T19:04:54Z

#28866 (comment)

My suggestion is, broadcast SHOULD treat 0d arrays as scalars, but SHOULD NOT implicitly cast them to scalars in the final result.

IIUC, what you are calling "treating 0d arrays as scalars" is not an exception, it's just an application of the usual broadcasting rules, as () is broadcastable to any shape. If 0d arrays didn't behave this way under broadcast, that would be a special-case exception to the broadcast rules. So you definitely have my +1 on that 👍

mbauman · 2019-05-23T19:14:37Z

Here's the crux of the problem: What should 1 .+ 2 return? Is it 3 or is it fill(3)?

As far as broadcast is concerned, it's exactly the same as fill(1) .+ 2 and fill(1) .+ fill(2); 1 and fill(1) have the same axes.

bzinberg · 2019-05-23T19:26:09Z

IMO, the following chain of equivalences should hold:

1 .+ 2 === broadcast(+, 1, 2) === broadcast(+, convert(1), convert(2)) == fill(3)

I say this because the type signature broadcast should require the arguments to be arrays, so when the user gives the shorthand of an Int, it should be WAI that the int be implicitly converted to an array.

I don't see any principled reason for 1 .+ 2 to be 3.

mbauman · 2019-05-23T19:45:22Z

I don't see any principled reason for 1 .+ 2 to be 3.

I tend to agree; I have a strong distaste for behaviors like this on principle. The reasons were entirely practical. I initially tried to make fill(1) .+ 2 == fill(3) (#26212) and locally even tried changing the 1 .+ 2 behavior as well but it was far more breaking than I was able to stomach. There is lots of code that intentionally allows scalar- or array-like behaviors with broadcasting. There is also lots of code that unnecessarily applies broadcasting to scalars.

This is something we could reconsider as a breaking change for 2.0, but in order for it to be feasible I think we'd also need to allow 0-dimensional arrays to participate in the linear algebra of vectors and matrices. I'm also still not convinced that it's actually all that terrible in practice, despite my own distaste for it in theory.

For more details, check out #17318.

andyferris · 2019-05-23T20:20:08Z

What’s the relationship between broadcast and linear algebra?

mbauman · 2019-05-23T20:24:27Z

It's just that folks would see zero-dimensional arrays much more frequently; making them more capable and allowing them to do things like fill(2) * rand(3,3) would ameliorate some of the complaints I'm quite certain we'd receive.

bzinberg · 2019-05-23T20:26:15Z

I tend to agree; I have a strong distaste for behaviors like this on principle. The reasons were entirely practical.

I think for big software projects, it's best to have the guts of the language be meticulous about degenerate cases, and to have the convenient-but-slightly-unprincipled stuff be found only very close to the surface, so as not to muddy the ability to reason consistently about the software. I'd say in this bug we are reaching precisely the point at which the motivation for that thinking becomes apparent: because Julia handled zero-dimensional broadcasting inconsistently since the beginning, it's now much easier to make a band-aid fix to the internals that special-cases zero-dimensional arrays, than to rework the internals so that the special case is not necessary.

(This is totally a hindsight observation, and I don't think anyone necessarily did anything wrong in this regard during the development trajectory of Julia, since there were trade-offs to make at every step of the way.)

bzinberg · 2019-05-23T20:27:08Z

It's just that folks would see zero-dimensional arrays much more frequently; making them more capable and allowing them to do things like fill(2) * rand(3,3) would ameliorate some of the complaints I'm quite certain we'd receive.

What is the intended meaning of fill(2) * rand(3, 3)? (I understand .*, but not *)

mbauman · 2019-05-23T20:48:09Z

Yes, history and inertia is one part of the story. The other part is that it is practically quite useful and (I think) friendlier. Arithmetic on a zero-dimensional array is quite limited, but there would be ways of defining it to be a bit friendlier (like having fill(2) * A be defined to mean the same thing as 2*A).

Now, I'm much more sympathetic to cases like fill(1) .+ fill(2) or fill(1) .== 1 because there you are already starting with an Array{T,0}, and there is some good sense in preserving that. But in the case of 1 .+ 2 I'm more strongly opposed to wrapping its result in an Array{T,0}. You are starting with something that has axes and supports getindex and you end up with the same sort of thing that has axes and supports getindex. The result is also far more useful.

bzinberg · 2019-05-23T21:40:23Z

(like having fill(2) * A be defined to mean the same thing as 2*A)

My concern there is that the *(a::Array, b::Array) already means matrix product, and so defining fill(2) * A to be fill(2) .* A rather than matmul(fill(2), A) (which only makes sense if size(A)[end-1] == 0) is a redefinition of an existing concept that makes it more complicated due to the added special case.

One thought that comes to mind is that perhaps we could have an infix operator for the tensor product. It seems to me that confusion of what 2 * A means (either "scale the matrix" or "a matrix product that has mismatched dimensions") comes from not realizing that the first one is a tensor product, not a matrix product.

But in the case of 1 .+ 2 I'm more strongly opposed to wrapping its result in an Array{T,0}. You are starting with something that has axes and supports getindex and you end up with the same sort of thing that has axes and supports getindex. The result is also far more useful.

Hm. I would say that any dotted operator should always return an array. That's a consistent and easy-to-remember rule, and I think having that consistency could save headaches like this without burdening programmers much. What it might do is be slightly surprising to people who are used to math notation which often just uses juxtaposition for all the many notions of multiplication (scalar, matrix, function composition, tensor). But I think it's good that we don't carry that notational ambiguity into this language :)

And converting an Array{T, 0} to a T is easy and IMO familiar as well: just add [] to the end. I'd call it analogous to how in Python, if L = [x], I wouldn't expect L and x to behave the same way -- I would know to write L[0], and readers would understand that.

bzinberg · 2019-05-23T22:43:48Z

Now, I'm much more sympathetic to cases like fill(1) .+ fill(2) or fill(1) .== 1 because there you are already starting with an Array{T,0}, and there is some good sense in preserving that. But in the case of 1 .+ 2 I'm more strongly opposed to wrapping its result in an Array{T,0}. You are starting with something that has axes and supports getindex and you end up with the same sort of thing that has axes and supports getindex. The result is also far more useful.

I would describe the situation as this: There is a type U and a function

f(w::W, x::U, y::U)::U

(in our case f = broadcast). There is an implicit conversion from Float64 to U. We then write z = f(w, 1.0, 2.0). I think we don't have any reason to expect z to have all the same behaviors as Float64 unless there is also an implicit conversion from U to Float64. (Also, Array{Float64,0} does support axes and getindex?)

andyferris · 2019-05-23T22:51:19Z

Honestly, I never much liked that Number has container semantics (that seemed to be a purely pragmatic decision to ease certain parts of the implementation in Base but results in silly bugs where typos like for i in N proceed without error) and broadcast is an operation on containers. Basically I feel that 1 .+ 2 is a programmer error - I wouldn't let it slide on a PR for example.

In very generic code the one thing you do have to mentally track is which variables are scalar and which are containers, so I don't see any impediment to generic code if for example it were forbidden to write broadcast operations on scalars only (IMO such a choice should basically only catch logic errors or bugs). But returning a 0d array in the default case seems fairly a fairly pragmatic choice, and has logical consistency with the way scalars are treated within a larger broadcast operation.

Also note that if we support fill(2) * matrix we still probably shouldn't support something like matrix + fill(2) in the future (where I still believe matrix + 2 makes sense as matrix + 2I).

bzinberg · 2019-05-23T23:05:14Z

@andyferris

(where I still believe matrix + 2 makes sense as matrix + 2I)

This is specific to the rank-2 case, right?

mbauman · 2019-05-28T17:07:48Z

The best way to evaluate this change would be to try it out. Personally, I don't find the status quo so abhorrent, so I won't be pushing for this change, and clearly it'd be a breaking v2.0 change, but it's a really easy change to make. Just steal the logic from #32122 — it really shouldn't be more than 10LOC between all the builtin broadcast implementations. Then the big question is how many LOC worth of tests and packages need to change. I'm up for changing my mind on the practicality here if the evidence weighs against it!

mbauman · 2019-05-29T16:00:05Z

Here's a compelling argument: https://discourse.julialang.org/t/broadcasting-and-pairs-using/24739

In short:

p = "2".=>"two"
replace.(["123", "246"], p)

vs.

replace.(["123", "246"], "2".=>"two")

The first fails, whereas the second succeeds, but the two should be equivalent in their end result. This only happens because we've lost the 0-d container.

bzinberg · 2019-05-29T16:50:25Z

@mbauman, I'd be interested in trying your suggestion in #28866 (comment) to see how much work it is. I'm new to contributing to Julia -- can you point me to what I need to know to get my local clone to the point where I can make a change to the broadcast implementation, run some compile/test commands, and see what breaks? I can do that quickly, then over the next few weeks work on it as I have time.

mbauman · 2019-05-29T17:25:23Z

Here you go: https://github.com/JuliaLang/julia/compare/mb/true28866

Just:

git clone https://github.com/JuliaLang/julia.git
cd julia
git checkout mb/true28866
make
make testall

mbauman · 2019-06-10T14:25:38Z

Just an administrative note: this issue started out describing both broadcast's design as well as the bug in #32122 — and I wrote the commit message there before we really started hashing out broadcast's design here. So we should re-open this issue if that commit message ends up auto-closing this — it only addresses the "easy" half of this issue.

…tainers This is a simple workaround for the handful of elementwise operations that are defined on arrays _without_ the need for explicit broadcast but use broadcasting (with an extra shape check) in their implementation. These were the only affected cases I could find.

…tainers (#32122) This is a simple workaround for the handful of elementwise operations that are defined on arrays _without_ the need for explicit broadcast but use broadcasting (with an extra shape check) in their implementation. These were the only affected cases I could find.

…g 0d containers (#32122)" This reverts commit 38b38cf.

… dropping 0d containers (#32122)"" This reverts commit e6aa604.

JuliaLang/julia#28866

cossio · 2020-05-18T16:10:18Z

I find zeros() .+ 1 == 1 very unintuitive. I had an issue with this today. The following function:

f(x::AbstractArray; dims=:) = sum(x .+ 1; dims=dims)

Then f(x) works for all arrays, except for zero-dimensional arrays because of this. In this example I now have to special case dims=:.

Maybe there should be a definition for sum(::Number; dims::Colon).

GiggleLiu · 2022-06-01T17:47:02Z

Broadcasting should return 0-dimensional array! Support you with another using case:

julia> conj(Zygote.Fill(1.0im))
0-dimensional Array{FillArrays.Fill{ComplexF64, 0, Tuple{}}, 0}:
Fill(0.0 - 1.0im)

The conj over filled array returns a 0-dimensional array with filled array inside!

This is cause by:

first broadcast over Fill, returns a Fill type.
then wrap a 0-dimensional array to keep the return shape the same.

julia/base/broadcast.jl

Line 860 in 938da26

@inline function broadcast_preserving_zero_d(f, As...)

Is it really safe to treat all arrays as dense array type? The above code is very bad to ecosystem.

This now causing OMEinsum AD test failure:
under-Peter/OMEinsum.jl#141

mbauman · 2022-06-01T19:39:05Z

Ironically, that's happening because FillArray's Fill defines its own broadcast implementation that preserves the 0-dimensional array as requested here. I agree this is how things should behave, but changing this quite breaking (as you can see) and will likely have to wait for v2.0.

It'd work as you want if FillArrays would match Base's current broadcasting semantics.

GiggleLiu · 2022-06-01T20:22:32Z

Ironically, that's happening because FillArray's Fill defines its own broadcast implementation that preserves the 0-dimensional array as requested here. I agree this is how things should behave, but changing this quite breaking (as you can see) and will likely have to wait for v2.0.

It'd work as you want if FillArrays would match Base's current broadcasting semantics.

It is not only the problem of the FillArrays. This implementation in Julia base (

julia/base/broadcast.jl

Line 860 in 938da26

@inline function broadcast_preserving_zero_d(f, As...)

) is bad:

length(axes(bc)) == 0 ? fill!(similar(bc, typeof(r)), r) : r

Two aspects can be improved

it should check the type of the original array and try to return the same array type,
it should be type stable.

I won't be happy if conj(::Fill) returns me a normal array.
Maybe fallback conj, real function on the broadcast method is a wrong decision, because broadcast is not bijective in type (different array types can go to the same scalar type).

mbauman · 2022-06-01T21:21:19Z

it should check the type of the original array and try to return the same array type,

It tries to; that's what the similar(bc, typeof(r)) part does. It's up to the broadcast implementation to specialize that.

it should be type stable.

It is not a type instability because the length of the axes is encoded into the broadcast's type.

GiggleLiu · 2022-06-01T21:36:50Z

it should check the type of the original array and try to return the same array type,

It tries to; that's what the similar(bc, typeof(r)) part does. It's up to the broadcast implementation to specialize that.

it should be type stable.

It is not a type instability because the length of the axes is encoded into the broadcast's type.

I see what you mean. So it is a wield interaction of similar does not preserve type and broadcast does not keep array type information for 0-dimensional array. Now I agree there is no easy solution. Hope Julia v2.0 can make the broadcast more predictable.

kellertuer · 2023-04-10T18:08:34Z

Hi,
I just sumbled upon this as well and just to add to the irritation here, I wrote a small discourse about my encounter https://discourse.julialang.org/t/broadcast-strange-on-0-dimensional-array/97311 - which I will resolve by not using 0-dim arrays, since for me they currently seem to not behave as I expect arrays to behave (especially in the .+ case.

sadish-d · 2023-06-26T17:12:34Z

I came to realize the broadcasting behavior on 0-dimensional arrays pretty late. This will mean a significant implementation change for what I'm working on. A minimal example of the issue I'm running into:

f(x::Array{<:Real}) = x .+ 1
g(y::Array{<:Real}) = nothing

a = [1]
b = fill(1)

(g∘f)(a) # works
(g∘f)(b) # breaks

MilesCranmer · 2024-06-27T15:55:28Z

This is especially problematic for GPU code because scalarization strips any allocation information present in a zero-dimensional array:

julia> x = MtlArray(fill(1f0))
0-dimensional MtlArray{Float32, 0, Private}:
1.0

julia> x .+ x
2.0f0

which now lives on the CPU.

mbauman changed the title ~~0-dimensional array becomes scalar after broadcast~~ Some array functions drop zero-dimensional results to scalars Aug 24, 2018

mbauman added bug Indicates an unexpected problem or unintended behavior broadcast Applying a function over a collection arrays [a, r, r, a, y, s] labels Aug 24, 2018

mbauman mentioned this issue Dec 19, 2018

Inconsistent dot vectorization? #30450

Closed

mbauman mentioned this issue Jan 24, 2019

inconsistent zero-dimensional array operations #30825

Closed

mbauman mentioned this issue May 23, 2019

The sum of two Array{T, 0}s should be an Array{T, 0} #32115

Closed

mbauman changed the title ~~Some array functions drop zero-dimensional results to scalars~~ Should broadcast drop zero-dimensional results to scalars? May 23, 2019

KristofferC added a commit that referenced this issue Aug 26, 2019

Revert "Address part of #28866, prevent array operations from droppin…

e6aa604

…g 0d containers (#32122)" This reverts commit 38b38cf.

KristofferC added a commit that referenced this issue Aug 26, 2019

Revert "Revert "Address part of #28866, prevent array operations from…

8c86c86

… dropping 0d containers (#32122)"" This reverts commit e6aa604.

mauro3 added a commit to mauro3/WhereTheWaterFlows.jl that referenced this issue Oct 30, 2019

Bugfix which showed only up in 1.3

d66d86a

JuliaLang/julia#28866

mbauman added this to the 2.0 milestone May 18, 2020

bzinberg mentioned this issue May 19, 2020

Simplify ChoiceMap interface/architecture via ValueChoiceMaps probcomp/Gen.jl#263

Draft

mcabbott mentioned this issue May 9, 2021

real(::Fill) gives an array of arrays? JuliaArrays/FillArrays.jl#145

Open

bzinberg mentioned this issue Jul 14, 2021

Fix logpdf_grad for BroadcastedNormal. probcomp/Gen.jl#433

Merged

mbauman mentioned this issue Jul 19, 2021

broadcast of 0-dimensional array should return 0-dimensional array, not scalar #41643

Closed

mbauman changed the title ~~Should broadcast drop zero-dimensional results to scalars?~~ broadcast should not drop zero-dimensional results to scalars Jun 1, 2022

kellertuer mentioned this issue Apr 10, 2023

Nicer Interfaces for ::Number Manifolds and Objectives JuliaManifolds/Manopt.jl#236

Merged

21 tasks

nic-barbara mentioned this issue Jul 12, 2023

Error taking gradients through REN with nx = 0 and/or nv = 0 acfr/RobustNeuralNetworks.jl#99

Closed

This was referenced Jan 3, 2024

Preserve 0d arrays in Zygote.accum FluxML/Zygote.jl#1491

Merged

error in summation of product iterator FluxML/Zygote.jl#1490

Closed

mbauman mentioned this issue Jun 27, 2024

Array{T}(::T) should return a zero-dimensional array #54935

Closed

broadcast should not drop zero-dimensional results to scalars #28866

broadcast should not drop zero-dimensional results to scalars #28866

Comments

ZeppLu commented Aug 24, 2018

mbauman commented Aug 24, 2018

ZeppLu commented Aug 24, 2018

mbauman commented Aug 24, 2018

mbauman commented Aug 24, 2018

ZeppLu commented Aug 24, 2018 • edited Loading

mbauman commented Aug 24, 2018 • edited Loading

andyferris commented Aug 25, 2018

yurivish commented Jan 11, 2019 • edited Loading

bzinberg commented May 23, 2019 • edited Loading

mbauman commented May 23, 2019

bzinberg commented May 23, 2019

mbauman commented May 23, 2019 • edited Loading

andyferris commented May 23, 2019

mbauman commented May 23, 2019

bzinberg commented May 23, 2019 • edited Loading

bzinberg commented May 23, 2019 • edited Loading

mbauman commented May 23, 2019

bzinberg commented May 23, 2019 • edited Loading

bzinberg commented May 23, 2019

andyferris commented May 23, 2019

bzinberg commented May 23, 2019 • edited Loading

mbauman commented May 28, 2019 • edited Loading

mbauman commented May 29, 2019

bzinberg commented May 29, 2019

mbauman commented May 29, 2019

mbauman commented Jun 10, 2019

cossio commented May 18, 2020 • edited Loading

GiggleLiu commented Jun 1, 2022 • edited Loading

mbauman commented Jun 1, 2022

GiggleLiu commented Jun 1, 2022 • edited Loading

mbauman commented Jun 1, 2022 • edited Loading

GiggleLiu commented Jun 1, 2022

kellertuer commented Apr 10, 2023

sadish-d commented Jun 26, 2023

MilesCranmer commented Jun 27, 2024 • edited Loading

ZeppLu commented Aug 24, 2018 •

edited

Loading

mbauman commented Aug 24, 2018 •

edited

Loading

yurivish commented Jan 11, 2019 •

edited

Loading

bzinberg commented May 23, 2019 •

edited

Loading

mbauman commented May 23, 2019 •

edited

Loading

bzinberg commented May 23, 2019 •

edited

Loading

bzinberg commented May 23, 2019 •

edited

Loading

bzinberg commented May 23, 2019 •

edited

Loading

bzinberg commented May 23, 2019 •

edited

Loading

mbauman commented May 28, 2019 •

edited

Loading

cossio commented May 18, 2020 •

edited

Loading

GiggleLiu commented Jun 1, 2022 •

edited

Loading

GiggleLiu commented Jun 1, 2022 •

edited

Loading

mbauman commented Jun 1, 2022 •

edited

Loading

MilesCranmer commented Jun 27, 2024 •

edited

Loading