Mean of an empty collection #28777
same with btw |
I can fix it, but I am not sure what we want. Should empty collections always throw an error, or should they produce Examples of current inconsistencies (following what was noted by @JeffreySarnoff ):
CC @nalimilan |
I agree that reductions over Relevant: #5234 |
here are some more for completeness:
julia> y = Union{Missing, Float64}[missing, missing, missing]
julia> mean(skipmissing(y))
ERROR: ArgumentError: mean of empty collection undefined: Base.SkipMissing{Array{Union{Missing, Float64},1}}(Union{Missing, Float64}[missing, missing, missing])
julia> mean(Float64[])
NaN
julia> mean(Int64[])
NaN
julia> mean([])
ERROR: MethodError: no method matching zero(::Type{Any}) |
@gdkrmr: following what @nalimilan suggested currently we will fix I will keep this issue open for Julia 2.0 changes discussion and open a separate PR for the short term fix. |
In #29033 I have proposed the changes that are minimally breaking (because sometimes we will not throw an error now, but we did in the past) and consistent with the return value of |
After #29033, the mean of an empty tuple generates an inscrutable error:
julia> Statistics.mean(())
ERROR: MethodError: Base.reduce_empty(::typeof(Base.add_sum), ::Type{Union{}}) is ambiguous. Candidates:
reduce_empty(::typeof(Base.add_sum), ::Type{T}) where T<:Union{UInt16, UInt32, UInt8} in Base at reduce.jl:236
reduce_empty(::typeof(Base.add_sum), ::Type{T}) where T<:Union{Int16, Int32, Int8} in Base at reduce.jl:235
Possible fix, define
reduce_empty(::typeof(Base.add_sum), ::Type{Union{}})
The previous error was better:
julia> Statistics.mean(())
ERROR: ArgumentError: mean of empty collection undefined: () |
Good point, but I think it should be fixed in Base. See:
which is surprising
which is the same error. The core of the problem is
I think we should add |
Another one for the list:
|
quantile still has this bug.
julia> y = Union{Missing, Float64}[missing, missing, missing]
julia> quantile(y)
ERROR: MethodError: no method matching quantile(::Vector{Union{Missing, Float64}})
Closest candidates are:
quantile(::AbstractVector{V}, ::StatsBase.AbstractWeights{W, T} where T<:Real, ::AbstractVector{T} where T<:Real) where {V, W<:Real} at ~/.julia/packages/StatsBase/pJqvO/src/weights.jl:687
quantile(::AbstractVector, ::Any; sorted, alpha, beta) at /opt/julias/julia-1.7/share/julia/stdlib/v1.7/Statistics/src/Statistics.jl:1073
quantile(::Any, ::Any; sorted, alpha, beta) at /opt/julias/julia-1.7/share/julia/stdlib/v1.7/Statistics/src/Statistics.jl:1070
...
Stacktrace:
[1] top-level scope
@ REPL[33]:1 |
Hello.
|
@ignace-computing: probably not, but this is a corner case that is just not consistently handled at the moment. IMO the reasonable solution would be throwing an error for all empty collections. Otherwise you just get insidious bugs (though |
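The "throw for all empty collections" option mentioned above could be sketched as a thin wrapper. This is only an illustration: strict_mean is a hypothetical name, not part of Statistics, and it assumes that calling isempty on the input is acceptable (for a stateful iterator that check could consume an element).

```julia
using Statistics

# Hypothetical helper (not part of Statistics): reject every empty input up
# front, whatever its element type, instead of returning NaN for some inputs
# and throwing for others.
function strict_mean(itr)
    # isempty falls back on iterate(itr) === nothing, so it also works for
    # wrappers such as skipmissing(...).
    isempty(itr) && throw(ArgumentError("mean of empty collection undefined"))
    return mean(itr)
end

y = Union{Missing, Float64}[missing, missing, missing]

# All of these are empty, so each call is expected to raise ArgumentError.
for itr in (Float64[], Int64[], [], (), skipmissing(y))
    try
        strict_mean(itr)
    catch err
        println(typeof(itr), " => ", typeof(err))
    end
end
```

With such a wrapper the caller gets the same error type for arrays, tuples, and skipmissing iterators, at the cost of one extra emptiness check per call.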
It seems that this issue must have been discussed, but I could not find any reference to it, so I am opening it.
mean behaves inconsistently between array and collection implementations in the empty container case. This is especially relevant since missing was introduced into Base. Here is code that shows the problem:
Maybe a specialized implementation of mean should be added for the case when the eltype of the collection is known? This is the case for skipmissing, which knows that its eltype is Float64. For instance, sum handles this case correctly (there is a special mapreduce for this case with skipmissing). Probably more functions in Statistics should be reviewed, but first I would like to understand what the intention with mean is.
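The point about skipmissing knowing its eltype can be checked with a short sketch like the one below. It is not the snippet from the original report, only an illustration built from the vector quoted elsewhere in this thread. What mean returns or throws here depends on the Julia version, so the call is wrapped in try/catch and no particular result is assumed.

```julia
using Statistics

y = Union{Missing, Float64}[missing, missing, missing]
itr = skipmissing(y)

# The wrapper yields no elements, yet it still reports a concrete element type.
println("eltype:  ", eltype(itr))
println("isempty: ", isempty(itr))

# sum can use the known element type to pick a zero for the empty case.
println("sum:     ", sum(itr))

# mean's behaviour in this case is version dependent, so capture either outcome.
result = try
    mean(itr)
catch err
    err
end
println("mean:    ", result)
```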