Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reductions over one-element collections can return surprising values #34380

Open
mbauman opened this issue Jan 14, 2020 · 4 comments
Open

Reductions over one-element collections can return surprising values #34380

mbauman opened this issue Jan 14, 2020 · 4 comments
Labels
arrays [a, r, r, a, y, s] collections Data structures holding multiple items, e.g. sets fold sum, maximum, reduce, foldl, etc.

Comments

@mbauman
Copy link
Member

mbauman commented Jan 14, 2020

Since reduce (and friends) are specified to call a two-argument function, the choice of what to do (and what to return) when there's only one element is rather arbitrary and can be surprising. A great motivating case here is reduce(vcat, X):

julia> reduce(vcat, [1])
1

julia> reduce(vcat, [1,2])
2-element Array{Int64,1}:
 1
 2

I really want reduce to just call vcat(1), but of course it cannot blindly do that in general. The solution is to provide an init. We do patch up this return value in a few notable cases — as this was required to get reduce to be type stable.

julia/base/reduce.jl

Lines 348 to 369 in 0ee3264

"""
Base.reduce_first(op, x)
The value to be returned when calling [`reduce`](@ref), [`foldl`](@ref`) or
[`foldr`](@ref) with reduction `op` over an iterator which contains a single element
`x`. This value may also used to initialise the recursion, so that `reduce(op, [x, y])`
may call `op(reduce_first(op, x), y)`.
The default is `x` for most types. The main purpose is to ensure type stability, so
additional methods should only be defined for cases where `op` gives a result with
different types than its inputs.
"""
reduce_first(op, x) = x
reduce_first(::typeof(+), x::Bool) = Int(x)
reduce_first(::typeof(*), x::AbstractChar) = string(x)
reduce_first(::typeof(add_sum), x) = reduce_first(+, x)
reduce_first(::typeof(add_sum), x::SmallSigned) = Int(x)
reduce_first(::typeof(add_sum), x::SmallUnsigned) = UInt(x)
reduce_first(::typeof(mul_prod), x) = reduce_first(*, x)
reduce_first(::typeof(mul_prod), x::SmallSigned) = Int(x)
reduce_first(::typeof(mul_prod), x::SmallUnsigned) = UInt(x)

This function is nicely documented but doesn't appear in the manual. Should we just patch up vcat and hcat here for now? Or perhaps reduceing over 1-element collections without an init value should be just as much of an error as reduceing over empty ones.

@mbauman
Copy link
Member Author

mbauman commented Jan 14, 2020

Should we just patch up vcat and hcat here for now?

Ah, if that's the outcome, then this would be an exact duplicate of #31169 — but that issue doesn't really talk about the general behavior.

@tkf
Copy link
Member

tkf commented Jan 15, 2020

Or perhaps reduceing over 1-element collections without an init value should be just as much of an error as reduceing over empty ones.

I think the problem is that vcat over Number is not monoid; i.e., there is no element e such that vcat(1, e) == vcat(e, 1) == 1. Maybe the solution is to think in terms of different equality ≃(x, y) = vcat(x) == vcat(y). Then calling vcat in reduce_first does make sense.

@tkf
Copy link
Member

tkf commented Jan 15, 2020

Or just leave it as-is if #34395 is going to happen?

@nsajko nsajko added arrays [a, r, r, a, y, s] collections Data structures holding multiple items, e.g. sets fold sum, maximum, reduce, foldl, etc. labels Jul 15, 2024
@thchr
Copy link
Contributor

thchr commented Sep 24, 2024

For the record - and surely obvious to the previous posters, but leaving for context for anyone searching - it's not just vcat but really any type which is not a monoid over a considered operation. One example is sum over a type X where +(::X, ::X) does not return an X, e.g.:

struct XInt
   v :: Int
end
Base.:+(a::XInt, b::XInt) = a.v + b.v # returns an Int, not an XInt - so not type-closed under addition
Base.zero(::XInt) = XInt(0) # just including this to show that this definition does not change the outcome

vs = XInt.(1:5)
sum(vs[1:1]) # returns an XInt
sum(vs[1:2]) # returns an Int

This can be fixed by defining Base.reduce_first(::typeof(+), v::XInt) = v.v - but since Base.reduce_first is not exported or documented, it seems this is not really a long-term solution.
I'm keen to know if there is a better solution though (aside from ensuring that ones types are closed under addition, which is not always possible or desirable).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] collections Data structures holding multiple items, e.g. sets fold sum, maximum, reduce, foldl, etc.
Projects
None yet
Development

No branches or pull requests

4 participants