-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add missing and Missing #24653
Add missing and Missing #24653
Conversation
base/operators.jl
Outdated
@@ -155,7 +155,7 @@ julia> "foo" ≠ "foo" | |||
false | |||
``` | |||
""" | |||
!=(x, y) = !(x == y)::Bool | |||
!=(x, y) = !(x == y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine there was a reason for the presence of this type assertion, so how problematic is it to remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@test promote_type(Union{Int, Missing}, Union{Int, Missing}) == Union{Int, Missing} | ||
@test promote_type(Union{Float64, Missing}, Union{String, Missing}) == Any | ||
@test promote_type(Union{Float64, Missing}, Union{Int, Missing}) == Union{Float64, Missing} | ||
@test_broken promote_type(Union{Void, Missing, Int}, Float64) == Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand whether this bug is in this PR or reflects a deeper Julia bug. See JuliaData/Missings.jl#47.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JeffBezanson , @vtjnash, it'd be great to have you take a look at this
base/missing.jl
Outdated
showerror(io::IO, ex::MissingException) = | ||
print(io, "MissingException: ", ex.msg) | ||
|
||
ismissing(::Any) = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it required here given it's already in essentials.jl
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I just realized that I had missed the warning.
end | ||
|
||
@testset "any & all" begin | ||
@test any([true, missing]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the result of any(x)
/all(x)
be still inferred as Bool
when x
cannot contain missing values? (IMO it would be a nice thing to have)
If so, maybe it could be tested here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I've added @inferred
everywhere it makes sense to test/reduce.jl rather than to this file, since it's more logical to test the behavior where type stability is expected. Also, benchmarks should should be pretty good at catching this kind of issue.
20a8f13
to
b207fc8
Compare
I don't think this is ready to be merged into Base. In my mind we should hold off putting something into Base until we have found a solution that has no known serious problems with important packages in the data sphere. That is not the case for In particular, even after half a year of folks thinking about this, there is no strategy how I know there is a lot of enthusiasm for the For those that aren't familiar with the details of the problem, take a look at JuliaData/Missings.jl#6. @nalimilan and @quinnj What is your thinking about the Query.jl situation? Aren't we heading straight for a complete split data ecosystem with this strategy? Is that good? |
@davidanthoff Can we keep that discussion at JuliaData/Missings.jl#6, where it has happened from the beginning? Mixing the technical review with more general design issues regarding Query isn't going to give a good result. |
b207fc8
to
96eb33a
Compare
@@ -112,7 +112,7 @@ function reducedim_init(f, op::typeof(*), A::AbstractArray, region) | |||
end | |||
function _reducedim_init(f, op, fv, fop, A, region) | |||
T = promote_union(eltype(A)) | |||
if applicable(zero, T) | |||
if T !== Any && applicable(zero, T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one isn't totally satisfying, but I had to add it due to the zero(::Type{Any}) = throw(MethodError(zero, (Any,)))
method, which is in turn needed to replace a StackOverflowError
with a more explicit error. I wonder whether applicable
should treat methods that throw unconditionally as if they didn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether applicable should treat methods that throw unconditionally as if they didn't exist.
That would be good, although probably just the ones which throw a MethodError
(or some other special error, e.g. NotImplementedError
).
Sure. @ViralBShah had asked me to raise this issue here. Where do you want discussion about the overall merits of moving this into Base? Over at https://github.com/JuliaData/Roadmap.jl/issues/3? |
Why not. Or file an issue in JuliaLang. |
@nanosoldier |
Man, Nanosoldier is being weird. I just restarted the server, so @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Nanosoldier results are mostly noise. Times appear to have improved (!) for a lot of benchmarks, which doesn't seem plausible, and to have regressed for three However, the @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
The new run is more credible. Apart from a few regressions which are different from the last run and which don't seem related, there are two regressions and one improvement for Finally, the benchmarks for array equality haven't appeared in any of the two runs, so I think we can safely conclude there is no regression either. Overall we should be good to go. |
Who would be best to review this? Let's get it done soon so this doesn't get too conflicted. |
|
||
promote_rule(::Type{Missing}, ::Type{T}) where {T} = Union{T, Missing} | ||
promote_rule(::Type{Union{S,Missing}}, ::Type{T}) where {T,S} = Union{promote_type(T, S), Missing} | ||
promote_rule(::Type{Any}, ::Type{T}) where {T} = Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anyone have any qualms about this definition? Should it perhaps live in promotion.jl
because of it's generalness?
base/missing.jl
Outdated
@eval Math.$(f)(::Missing) = missing | ||
end | ||
|
||
zero(::Type{Union{T, Missing}}) where {T} = zero(T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should look back at why this definition was needed; I think it was so zeroes(Union{Int, Missing}, dims...)
worked, but I wonder if @Sacha0 's work on removing zeroes
will avoid the need for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's needed so that things like zero(eltype(Array{Union{Int, Missing}}))
work, which happens frequently when a function takes this kind of array as an argument.
return false | ||
end | ||
end | ||
return true | ||
return anymissing ? missing : true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could use ifelse
here, similarly in reduce.jl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't make any difference for simple cases like this:
julia> using Missings
julia> f(x) = x ? missing : true
f (generic function with 1 method)
julia> g(x) = ifelse(x, missing, true)
g (generic function with 1 method)
julia> @code_llvm f(true)
; Function f
; Location: REPL[1]:1
define nonnull %jl_value_t addrspace(10)* @julia_f_63991(i8) {
top:
%1 = and i8 %0, 1
%2 = icmp eq i8 %1, 0
%. = select i1 %2, %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 139748733701392 to %jl_value_t*) to %jl_value_t addrspace(10)*), %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 139748736339184 to %jl_value_t*) to %jl_value_t addrspace(10)*)
ret %jl_value_t addrspace(10)* %.
}
julia> @code_llvm g(true)
; Function g
; Location: REPL[2]:1
define nonnull %jl_value_t addrspace(10)* @julia_g_63993(i8) {
top:
%1 = and i8 %0, 1
%2 = icmp eq i8 %1, 0
%3 = select i1 %2, %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 139748733701392 to %jl_value_t*) to %jl_value_t addrspace(10)*), %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 139748736339184 to %jl_value_t*) to %jl_value_t addrspace(10)*)
ret %jl_value_t addrspace(10)* %3
}
@test promote_type(Union{Int, Missing}, Union{Int, Missing}) == Union{Int, Missing} | ||
@test promote_type(Union{Float64, Missing}, Union{String, Missing}) == Any | ||
@test promote_type(Union{Float64, Missing}, Union{Int, Missing}) == Union{Float64, Missing} | ||
@test_broken promote_type(Union{Void, Missing, Int}, Float64) == Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JeffBezanson , @vtjnash, it'd be great to have you take a look at this
value::Int | ||
end | ||
Base.zero(::Type{Unit}) = Unit(0) | ||
Base.one(::Type{Unit}) = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was best to not rely on the stdlib Dates
to make these tests work. If others feel differently I can change this.
test/missing.jl
Outdated
@test_throws MethodError zero(Any) | ||
@test_throws MethodError zero(String) | ||
@test_throws MethodError zero(Union{String, Missing}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test would still be good to have AFAICT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will only apply to zero
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it throw the same error for one
and oneunit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one(String) == ""
because "foo" * "" == "foo"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I thought we had decided against it. Anyway, you can use Symbol
instead of String
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll also have to add a special-case for the new methods in the ambiguous
test (which explains the failures).
Add basic support with special methods for operators and standard math functions. Adapt ==(::AbstractArray, ::AbstractArray), all() and any() to support three-valued logic. This requires defining missing and Missing early in the bootstrap process, but other missing-related code is included relatively late to be able to add methods to Base functions defined in various places.
I've fixed the tests and added a section of the manual detailing how I'll merge in a few days barring any objections. Then I'll open another PR to discuss adding missing-related functions to the public API. |
Add basic support with special methods for operators and standard math functions. Adapt ==(::AbstractArray, ::AbstractArray), all() and any() to support three-valued logic. This requires defining missing and Missing early in the bootstrap process, but other missing-related code is included relatively late to be able to add methods to Base functions defined in various places. Add new manual section about missing values.
Add basic support with special methods for operators and standard math functions. Adapt
==(::AbstractArray, ::AbstractArray)
,all()
andany()
to support three-valued logic. This requires defining missing and Missing early in the bootstrap process, but other missing-related code is included relatively lateto be able to add methods to Base functions defined in various places.
Hopefully the compiler will be able to optimize out the three-valued logic-related code for types which cannot contain
missing
. Benchmarks at JuliaCI/BaseBenchmarks.jl#138 should help checking that it's the case.Subsequent PRs will be needed to add convenience functions like
nonmissing
/skipmissing
andcoalesce
(see JuliaData/Missings.jl#53).