array reductions (sum, mean, etc.) and dropping dimensions #16606
I agree with you. Perhaps the default behavior could be to drop (for consistency with `A[1,:]`) and have a keyword argument on reduction functions to keep dims.
Having a keyword argument affect the type (i.e. number of dimensions) of the result would make this function type unstable.
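To make the type-stability concern concrete, here is a hedged sketch (`mysum` is a hypothetical reducer, not a Base function): the output rank depends on the run-time *value* of the keyword, so the compiler cannot infer a single return type.

```julia
# Hypothetical reducer whose output rank depends on a keyword's value:
# the result is a Matrix or a Vector depending on `drop`, which is
# exactly the kind of value-dependent return type that defeats inference.
function mysum(A::AbstractMatrix; dims, drop::Bool=false)
    r = sum(A; dims=dims)
    return drop ? dropdims(r; dims=dims) : r
end

A = ones(3, 4)
mysum(A; dims=1)             # 1×4 Matrix{Float64}
mysum(A; dims=1, drop=true)  # 4-element Vector{Float64}
```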
The way to keep the dimensions is to use a range, i.e. …
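For reference, a small illustration of the range-indexing idiom (assuming a plain dense array):

```julia
A = reshape(collect(1:12), 3, 4)

A[1, :]    # scalar index: 4-element Vector, the dimension is dropped
A[1:1, :]  # range index:  1×4 Matrix, the dimension is kept
```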
I am aware that …
My comment was a reply to @BobPortmann's comment, not to the general issue.
I think this was settled in #13612. Edit: "settled" is a bit too strong.
I am aware of the way …
#18271 is the strongest argument I've seen yet in favor of dropping dimensions from reductions. It's a pity we don't have an analog of … Alternatively, we could switch the reduction functions so that they required a boolean tuple for …
This seems not-fully-decided, so moving to 1.0.
+1 for making reductions and …
Discussion on Discourse: https://discourse.julialang.org/t/sum-squeeze-as-one-function/2471
FWIW, just got bitten by this: I was expecting something like …
Half-baked idea prompted by #22000 (comment): we could pass reduction functions through the indexing syntax in order to specify which dimensions get reduced, e.g. …
That's an appealing approach, which reminds me of the …
Huh, that's a pretty cool syntax idea.
I guess it's a dumb question, but why don't we differentiate the behaviour in the same way as we do for array indexing? Edit: rethinking my idea showed me why: for indexing, the colon style is used per dimension, while for sum the colon style is used across dimensions. Edit 2: #16606 (comment) sounds very clever indeed.
I have proposed, elsewhere, …
Since we're unlikely to change this in 1.0, and adding …
The argument against the keyword form is not that keyword arguments are slow (when that applied, `dims` also wasn't a keyword, so I don't think that was ever the problem); it's that you have to add that keyword to every single reducer everywhere. So we can do this in Base, but any custom reducers out there will not have support for the same pattern. It's just weird to have to implement the same completely identical functionality over and over again, even though there's a completely composable solution that requires no code changes at all, just a new function. Also, no one has addressed the type stability argument: changing return type based on a keyword argument is not generally a good idea.
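The composable solution referred to here is the existing pattern of applying `dropdims` to the result of a reduction; a short sketch of why it works uniformly for any reducer that accepts `dims`:

```julia
A = ones(2, 5)

# The same pattern works for sum, maximum, prod, or any custom reducer
# that accepts a `dims` keyword; no per-function keyword is needed.
v = dropdims(sum(A; dims=1); dims=1)      # 5-element Vector{Float64}
m = dropdims(maximum(A; dims=2); dims=2)  # 2-element Vector{Float64}
```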
Yeah, I think the arguments for …

```julia
sum(A; dims=1) do x
    x > 0 ? log(x) : x
end
```

and in some ways it would be nice to be able to write (something like)

```julia
@dropdims sum(A; dims=1) do x
    x > 0 ? log(x) : x
end
```

rather than e.g.

```julia
dropdims(sum, x -> x > 0 ? log(x) : x, A, dims=1)
```

or

```julia
g(x) = x > 0 ? log(x) : x
dropdims(sum, g, A, dims=1)
```

But all are an improvement over the current

```julia
dropdims(
    sum(X; dims=1) do x
        x > 0 ? log(x) : x
    end,
    dims=1,
)
```
In #32860 we are discussing possible syntaxes for `a[., ., :]`, which is lowered to `Axed{(1, 2)}(a)` where

```julia
struct Axed{dims, T}
    x::T
end
```

(probably it should be …) This way,

```julia
dropdims(sum(a[., ., :]))
```

can implement a type-stable version of

```julia
dropdims(sum(a; dims=(1, 2)); dims=(1, 2))
```
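A minimal sketch of why encoding `dims` in the type restores inferability (the `axed` constructor and `sumdrop` names below are hypothetical illustrations, not part of the #32860 proposal):

```julia
struct Axed{dims, T}
    x::T
end
axed(x, dims::Int...) = Axed{dims, typeof(x)}(x)

# Because `dims` is a type parameter, the rank of the result is known
# at compile time, so dropping those dimensions is type stable.
sumdrop(a::Axed{dims}) where {dims} = dropdims(sum(a.x; dims=dims); dims=dims)

sumdrop(axed(ones(2, 3, 4), 1, 2))  # 4-element Vector{Float64}
```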
The argument that adding … Here's a quick attempt at a macro which allows this:

```julia
using MacroTools

macro dropdims(ex)  # doesn't handle reduce(...; dims=..., init=...) etc.
    MacroTools.postwalk(ex) do x
        if @capture(x, red_(args__, dims=d_)) || @capture(x, red_(args__; dims=d_))
            :( dropdims($x; dims=$d) )
        elseif @capture(x, dropdims(red_(args__, dims=d1_); dims=d2_) do z_ body_ end) ||
               @capture(x, dropdims(red_(args__; dims=d1_); dims=d2_) do z_ body_ end)
            :( dropdims($red($z -> $body, $(args...); dims=$d1); dims=$d2) )
        else
            x
        end
    end |> esc
end

@dropdims begin
    a = sum(ones(3,7), dims=2)
    b = sum(10 .* randn(2,10); dims=2) do x
        trunc(Int, x)
    end
    (a isa Vector, b isa Vector)  # (true, true)
end
```
Thinking about …
If this …
I should have clarified that it is not the case. My model of …

```julia
ndims(a::Axed) = ndims(a.x)
size(a::Axed) = size(a.x)
getindex(a::Axed, I...) = getindex(a.x, I...)
```
Ah, I didn't realise you were thinking of …

```julia
sum(p .* log.(p[:, .] ./ q))
```

Here … In the other thread I suggested that the …

```julia
@views b[:, .] .= f.(a[:, .])  # foreach((x,y) -> x .= f(y), eachcol(b), eachcol(a))
c[:, .] := f.(a[:, .])         # c = mapslices(f, a, dims=1)
```
I implemented a function … `sum(along(x, &, &, :))`, which is equivalent to …

```julia
julia> using ReduceDims

julia> x = ones(2, 3, 4);

julia> y = sum(along(x, &, :, :))  # lazy sum(x; dims=1)
mapreduce(identity, add_sum, along(::Array{Float64,3}, (1,)))

julia> copy(y) isa Array{Float64, 3}
true

julia> dropdims(y) isa Array{Float64, 2}  # type stable
true

julia> zeros(1, 3, 4) .= y;  # non-allocating

julia> zeros(3, 4) .= y;  # automatically drop dims
```

With #31020, you can also fuse it with broadcasting:

```julia
julia> using LazyArrays: @~  # need LazyArrays until #19198 is resolved

julia> y = sum(along(@~(x .+ 1), :, &, :))
mapreduce(identity, add_sum, along(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3},Tuple{Base.OneTo{Int64},Base.OneTo{Int64},Base.OneTo{Int64}},typeof(+),Tuple{Array{Float64,3},Int64}}, (2,)))

julia> zeros(1, 3, 4) .= y;
```

[1] There is also …

```julia
along(x::Array{<:Any, 2}, &, +, :) === along(x, &, :)
along(x::Array{<:Any, 3}, &, +, :) === along(x, &, &, :)
along(x::Array{<:Any, 4}, &, +, :) === along(x, &, &, &, :)
along(x::Array{<:Any, 2}, :, +, &) === along(x, :, &)
along(x::Array{<:Any, 3}, :, +, &) === along(x, :, :, &)
along(x::Array{<:Any, 4}, :, +, &) === along(x, :, :, :, &)
```
@mcabbott I think something like … The reason behind … But I agree that the iterator-of-iterators approach is quite useful, especially because it lets you use functions that are not aware of …
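The iterator-of-iterators approach mentioned above can be sketched with Base's `eachcol`/`eachrow`: mapping over slices returns a plain `Vector`, so there is no trailing dimension to drop, and it works for any function, whether or not it has a `dims`-style API.

```julia
x = [1 2 3; 4 5 6]

map(sum, eachcol(x))              # [5, 7, 9] — already a Vector, nothing to drop
map(c -> c[end] - c[1], eachcol(x))  # [3, 3, 3] — any function of a slice works
```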
This simple definition makes it easier to write reductions that drop the dimensions over which they reduce. Fixes JuliaLang#16606, addresses part of the root issue in JuliaLang#22000.
Can we do something about this? It would be great if we could replace …
The better approach here is probably to use …
… from a performance side, or using … For …
2.0 should probably remove …
Timing this:

```julia
julia> let A = rand(1000, 1000);
           B = @btime sum($A; dims=2)
           C = @btime sum(eachcol($A))  # adds whole slices, won't work for prod
           D = @btime map(sum, eachslice($A; dims=1, drop=false))
           B ≈ reshape(C,:,1) ≈ D
       end
  min 256.542 μs, mean 264.496 μs (5 allocations, 8.02 KiB)
  min 681.833 μs, mean 13.722 ms (1000 allocations, 7.74 MiB)
  min 1.453 ms, mean 1.464 ms (1 allocation, 7.94 KiB)
true

julia> VERSION
v"1.10.0-DEV.100"
```
Hmm, that's very weird. The relative performance seems worse now that the type instability is fixed, despite using way fewer allocations?
(Why the quote?) You'll see different timings on different sizes & dimensions. One issue my example highlights is that some ways of slicing are very cache-unfriendly. You could overload, say, …

```julia
julia> sum(CUDA.rand(10, 10), dims=1)
1×10 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 5.94849  5.52325  5.60857  5.23081  5.4453  3.76833  6.61417  2.87759  5.43322  4.84982

julia> map(sum, eachslice(rand(Float32, 10, 10), dims=2, drop=false))
1×10 Matrix{Float32}:
 5.65775  5.5501  5.40247  3.77431  4.48232  5.90293  6.42368  5.52671  5.20836  5.30931
```

Another is how this ought to be written:

```julia
julia> sum(ones(2, 7); dims=2, init=0.0+3im)
2×1 Matrix{ComplexF64}:
 7.0 + 3.0im
 7.0 + 3.0im
```

There is an issue a bit like this with the magic fast path for …
We could make … Is this a problem with …
Following this discussion on julia-users: https://groups.google.com/forum/#!topic/julia-users/V-meqUZtb7k

I find the fact that `sum(A,1)` and `A[1,:]` have different shapes in 0.5 counter-intuitive. I am not aware of other languages for which the equivalent expressions have different shapes. It leads to unexpected results when computing e.g. `A[1,:] ./ sum(A,1)`. At least in the case of `sum`, it seems like `sum(A,1)` should be equivalent to `A[1,:] + A[2,:] + A[3,:] + ...`

However, as @timholy pointed out, this behavior can be useful for broadcasting with expressions like `A ./ sum(A,1)`.

For me, having the default slicing and array reduction behavior be consistent seems the natural choice, meaning that `sum(A,1)` should drop the first dimension. But I understand the attractiveness of not dropping dimensions for broadcasting. What are others' thoughts?
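To make the reported inconsistency concrete, here is a small sketch using the modern `dims` keyword spelling (in 0.5 this was written `sum(A, 1)`):

```julia
A = [1 2; 3 4]

size(A[1, :])         # (2,)   — scalar indexing drops the dimension
size(sum(A; dims=1))  # (1, 2) — the reduction keeps it as a singleton

# Keeping the singleton dimension is what makes column-wise
# normalization broadcast correctly:
A ./ sum(A; dims=1)   # 2×2 Matrix; each column divided by its column sum
```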