Support for missing in Plots #1706

Open
mkborregaard opened this issue Aug 28, 2018 · 23 comments

@mkborregaard
Member

mkborregaard commented Aug 28, 2018

Plots doesn't yet support missing, which it should, since it's in Base.

The current approach taken in StatPlots is to copy the inputs, replacing missing with NaN for Vector{<:Number} (while converting to Float64), "" for String input and Symbol() for Symbol input, then propagating that. It should be possible to do so for the x, y and z arguments here

Plots.jl/src/series.jl

Lines 75 to 88 in 602dbdf

compute_x(x::Nothing, y::Nothing, z) = 1:size(z,1)
compute_x(x::Nothing, y, z) = 1:size(y,1)
compute_x(x::Function, y, z) = map(x, y)
compute_x(x, y, z) = copy(x)
# compute_y(x::Void, y::Function, z) = error()
compute_y(x::Nothing, y::Nothing, z) = 1:size(z,2)
compute_y(x, y::Function, z) = map(y, x)
compute_y(x, y, z) = copy(y)
compute_z(x, y, z::Function) = map(z, x, y)
compute_z(x, y, z::AbstractMatrix) = Surface(z)
compute_z(x, y, z::Nothing) = nothing
compute_z(x, y, z) = copy(z)
, as we already copy the input. It's more tricky for everything passed as keyword arguments, but the question is how often they'd contain missing.
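The copy-and-replace approach described above could look roughly like the sketch below. The name `replace_missing` is hypothetical and this is not the actual StatPlots or Plots code; it only illustrates the three substitutions (NaN for numbers, "" for strings, the empty Symbol for symbols). The extra method for all-missing vectors is an assumption added here to keep dispatch unambiguous.

```julia
# Hypothetical helper; not taken from StatPlots or Plots.

# Numbers: copy into a Float64 vector, with missing replaced by NaN.
replace_missing(v::AbstractVector{<:Union{Missing,Number}}) =
    Float64[ismissing(x) ? NaN : x for x in v]

# Strings: missing becomes the empty string.
replace_missing(v::AbstractVector{<:Union{Missing,AbstractString}}) =
    String[ismissing(x) ? "" : x for x in v]

# Symbols: missing becomes the empty symbol (written Symbol() in the issue text).
replace_missing(v::AbstractVector{<:Union{Missing,Symbol}}) =
    Symbol[ismissing(x) ? Symbol("") : x for x in v]

# A vector containing only missing matches all three methods above,
# so it gets its own method (assumed here to mean "all NaN").
replace_missing(v::AbstractVector{Missing}) = fill(NaN, length(v))

nums = replace_missing([1, missing, 3])   # Float64 vector with NaN in the middle
strs = replace_missing(["a", missing])
syms = replace_missing([:a, missing])
```

Because each method builds a fresh vector, the caller's data is never mutated, which matches the existing `copy` calls in `compute_x`, `compute_y` and `compute_z`.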

Personally I think a cleaner approach is to provide first-class missing support: wrap internal calls in Plots that may return missing in skipmissing where relevant, forward the missing-containing Vectors to the backends, and leave it to them to handle them. That makes sense since missing is defined in Base and is as such a first-class Julia citizen. But it's more work, and we don't know how well the backends handle missing.
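As a rough illustration of what wrapping an internal call in skipmissing means, here is a minimal sketch. `axis_limits` is a hypothetical name, not a function in Plots, and returning `(NaN, NaN)` for all-missing input is an assumption made for the example.

```julia
# Hypothetical internal helper; not part of Plots.
# Computes data limits while ignoring missing entries.
function axis_limits(v)
    # With no usable values (empty or all missing) there is nothing to measure.
    all(ismissing, v) && return (NaN, NaN)
    return extrema(skipmissing(v))
end

y = [0.3, missing, 1.2, 0.7]

total  = sum(y)           # missing: plain reductions propagate missing
limits = axis_limits(y)   # limits computed from the non-missing values only
```

The data vector itself stays untouched and still contains missing, so it could be forwarded to a backend as-is once the backend knows what to do with it.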

@piever @daschw @SimonDanisch @pfitzseb @jheinen

@stephancb

For the record, Float32 and NaN32 also work for PlotlyJS (but are then converted to double precision in the JSON, unfortunately).

@daschw
Member

daschw commented Aug 28, 2018

I agree with you @mkborregaard that passing on missing to the backends would be the cleaner solution. However, neither GR nor PyPlot can handle missing currently:

julia> using PyPlot

julia> y = [rand(9); missing]
10-element Array{Union{Missing, Float64},1}:
 0.10054844974750021 
 0.5619977453885072  
 0.9352063939642001  
 0.013072160627248364
 0.4219567494193299  
 0.633474807303148   
 0.6790561066713039  
 0.03835857053635672 
 0.596949167563924   
  missing            

julia> plot(y)
ERROR: PyError ($(Expr(:escape, :(ccall(#= /home/dani/.julia/packages/PyCall/akNFy/src/pyfncall.jl:44 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'TypeError'>                                    
TypeError("float() argument must be a string or a number, not 'PyCall.jlwrap'")
  File "/usr/lib/python3.7/site-packages/matplotlib/pyplot.py", line 3363, in plot
    ret = ax.plot(*args, **kwargs)
  File "/usr/lib/python3.7/site-packages/matplotlib/__init__.py", line 1867, in inner
    return func(ax, *args, **kwargs)
  File "/usr/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 1529, in plot
    self.add_line(line)
  File "/usr/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 1960, in add_line
    self._update_line_limits(line)
  File "/usr/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 1982, in _update_line_limits
    path = line.get_path()
  File "/usr/lib/python3.7/site-packages/matplotlib/lines.py", line 956, in get_path
    self.recache()
  File "/usr/lib/python3.7/site-packages/matplotlib/lines.py", line 657, in recache
    y = _to_unmasked_float_array(yconv).ravel()
  File "/usr/lib/python3.7/site-packages/matplotlib/cbook/__init__.py", line 2052, in _to_unmasked_float_array
    return np.asarray(x, float)
  File "/usr/lib/python3.7/site-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order)

Stacktrace:
 [1] pyerr_check at /home/dani/.julia/packages/PyCall/akNFy/src/exception.jl:60 [inlined]
 [2] pyerr_check at /home/dani/.julia/packages/PyCall/akNFy/src/exception.jl:64 [inlined]
 [3] macro expansion at /home/dani/.julia/packages/PyCall/akNFy/src/exception.jl:84 [inlined]
 [4] __pycall!(::PyCall.PyObject, ::Ptr{PyCall.PyObject_struct}, ::PyCall.PyObject, ::Ptr{Nothing}) at /home/dani/.julia/packages/PyCall/akNFy/src/pyfncall.jl:44
 [5] _pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{Array{Union{Missing, Float64},1}}, ::Int64, ::Ptr{Nothing}) at /home/dani/.julia/packages/PyCall/akNFy/src/pyfncall.jl:22
 [6] #pycall#88 at /home/dani/.julia/packages/PyCall/akNFy/src/pyfncall.jl:11 [inlined]
 [7] pycall at /home/dani/.julia/packages/PyCall/akNFy/src/pyfncall.jl:86 [inlined]
 [8] #plot#85(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Union{Missing, Float64},1}) at /home/dani/.julia/packages/PyPlot/tA0wP/src/PyPlot.jl:179
 [9] plot(::Array{Union{Missing, Float64},1}) at /home/dani/.julia/packages/PyPlot/tA0wP/src/PyPlot.jl:176
 [10] top-level scope at none:0
julia> using GR

julia> plot([rand(9); missing])
ERROR: expected Real or Complex
Stacktrace:
 [1] #plot_args#6(::Symbol, ::Function, ::Tuple{Array{Union{Missing, Float64},1}}) at /home/dani/.julia/packages/GR/fnyt8/src/jlgr.jl:1038
 [2] plot_args at /home/dani/.julia/packages/GR/fnyt8/src/jlgr.jl:964 [inlined]
 [3] #plot#19(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Union{Missing, Float64},1}) at /home/dani/.julia/packages/GR/fnyt8/src/jlgr.jl:1123
 [4] plot at /home/dani/.julia/packages/GR/fnyt8/src/jlgr.jl:1118 [inlined]
 [5] #plot#2 at /home/dani/.julia/packages/GR/fnyt8/src/GR.jl:2973 [inlined]
 [6] plot(::Array{Union{Missing, Float64},1}) at /home/dani/.julia/packages/GR/fnyt8/src/GR.jl:2973
 [7] top-level scope at none:0

We could still start a transition, by wrapping calls in skipmissing in all Plots internal code and converting missing in the Plots backend code until the respective backends support it.

@abudis

abudis commented Sep 7, 2018

I have just started learning Julia and this is something that it desperately needs in my opinion. Compare this to the simplicity of R:

aa <- c(1, 2, NA)
plot(aa)

@mkborregaard
Member Author

sure.

@mkborregaard
Member Author

There's a WIP PR here: #1731
But it doesn't quite work yet (two tests fail), and it's only for the input values, not the attributes. But it should mostly work.

@mkborregaard
Member Author

Merged. @abudis try ]add Plots#master and check out how it works for you now.

@mkborregaard
Member Author

@abudis just updating Plots should give you missing value support now, as we've tagged a release with the improvements. If you do use it a lot, it would be useful if you'd test it out and report back.

@abudis

abudis commented Sep 11, 2018

@mkborregaard sorry, I was offline for some time. Awesome! I'll test it asap! :D

@abudis

abudis commented Sep 11, 2018

@mkborregaard works perfectly! Thanks for such a quick response!

@mkborregaard
Member Author

Good to hear!

@mkborregaard
Member Author

I'll keep the issue open, to remind us that this could be done more elegantly in the future.

@mkborregaard mkborregaard reopened this Sep 11, 2018
@cstjean

cstjean commented Oct 10, 2018

NaNs don't mix well with DateTime inputs, unfortunately, so plot([DateTime(2017), DateTime(2018), missing], [1,2,3]) is still broken (#1465) and it causes vline issues (#1762), via

Plots.jl/src/recipes.jl

Lines 92 to 100 in bad5668

@recipe function f(::Type{Val{:vline}}, x, y, z)
    n = length(y)
    newx = vec(Float64[yi for i=1:3,yi=y])
    newy = repeat(Float64[-1, 1, NaN], n)
    x := newx
    y := newy
    seriestype := :straightline
    ()
end
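To make the role of NaN in that recipe concrete, the coordinate construction can be run on its own, outside of any recipe. The input values below are made up for illustration.

```julia
# Standalone copy of the coordinate construction from the :vline recipe,
# with made-up input positions.
y = [2.0, 5.0]                                # positions of two vertical lines
n = length(y)

# Each position is repeated three times: two anchor points plus one separator slot.
newx = vec(Float64[yi for i = 1:3, yi = y])

# Two anchor heights, then a NaN that breaks the path before the next line.
newy = repeat(Float64[-1, 1, NaN], n)
```

Both buffers are hard-coded to Float64, so every input position has to be convertible to a float. A DateTime has no NaN counterpart to place in such a buffer, which is presumably where the problems described above come from.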

@karajan9

karajan9 commented Dec 1, 2018

missing support works well for me, unless the data starts off with one:
plot([0, missing, 2]) works as expected while
plot([missing, 1, 2]) won't:

ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] macro expansion at /home/karajan/.julia/packages/Plots/rmogG/src/series.jl:185 [inlined]
 [2] apply_recipe(::Dict{Symbol,Any}, ::Type{Missing}, ::Missing) at /home/karajan/.julia/packages/RecipesBase/Uz5AO/src/RecipesBase.jl:275
 [3] _apply_type_recipe(::Dict{Symbol,Any}, ::Array{Missing,1}) at /home/karajan/.julia/packages/Plots/rmogG/src/series.jl:197
 [4] macro expansion at /home/karajan/.julia/packages/Plots/rmogG/src/series.jl:253 [inlined]
 [5] apply_recipe(::Dict{Symbol,Any}, ::Array{Missing,1}) at /home/karajan/.julia/packages/RecipesBase/Uz5AO/src/RecipesBase.jl:275
 [6] _process_userrecipes(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol,Any}, ::Tuple{Array{Missing,1}}) at /home/karajan/.julia/packages/Plots/rmogG/src/pipeline.jl:83
 [7] macro expansion at ./logging.jl:305 [inlined]
 [8] _plot!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol,Any}, ::Tuple{Array{Missing,1}})at /home/karajan/.julia/packages/Plots/rmogG/src/plot.jl:171
 [9] #plot#132(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Missing,1}) at /home/karajan/.julia/packages/Plots/rmogG/src/plot.jl:57
 [10] plot(::Array{Missing,1}) at /home/karajan/.julia/packages/Plots/rmogG/src/plot.jl:51
 [11] top-level scope at none:0

Any idea what's going on there?
I'm using Julia 1.0.2 and Plots 0.21.0 with GR 0.36.0
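For readers unfamiliar with this error message: it is what Julia raises whenever a `missing` ends up as the condition of an `if`, `?:`, `&&` or `||`. The sketch below reproduces the error in isolation and shows a few guards that always yield a Bool. It is not a diagnosis of what happens at series.jl:185, only an illustration of the general mechanism.

```julia
x = missing

# Comparisons with missing propagate: the result is missing, not false.
cmp = x > 0

# Using that result as a condition throws the TypeError from the report.
threw = try
    x > 0 ? "positive" : "not positive"
    false
catch err
    err isa TypeError
end

# Guards that always produce a Bool:
safe1 = !ismissing(x) && x > 0     # check for missing first
safe2 = coalesce(x > 0, false)     # treat an unknown comparison as false
safe3 = isequal(x, 0)              # isequal never returns missing
```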

@mkborregaard
Copy link
Member Author

No, that's strange.

@karajan9

karajan9 commented Dec 2, 2018

Well... let me know if I can do anything to help resolve the problem, I'm pretty lost myself.

@ndinsmore

I would like to propose that Nothing might be a more appropriate type to signify a break in the data. Looking at the documentation, Missing has a very specific meaning. To me Nothing is much closer to NaN and will not complicate the statistical interpretation of Missing. In a lot of my code I have been using the following recipe construct:

@recipe function f(coords::AbstractArray{Union{LLA{S},T}} where S<:Real where T<:Nothing;coord_projection=(lla->(lla.lon,lla.lat)))

@karajan9

> I would like to propose that Nothing might be a more appropriate type to signify a break in the data. Looking at the documentation Missing has a very specific meaning. To me Nothing is much more equivalent to NaN and will not complicate the statistical interpretation of Missing. [...]

Hm, what part of the documentation do you mean? In the Missing Values section it reads

> Julia provides support for representing missing values in the statistical sense, that is for situations where no value is available for a variable in an observation, but a valid value theoretically exists. Missing values are represented via the missing object [...]

In the FAQ of the documentation you have

> Some functions are used only for their side effects, and do not need to return a value. In these cases, the convention is to return the value nothing, which is just a singleton object of type Nothing. This is an ordinary type with no fields; there is nothing special about it except for this convention [...]

So from reading that it does look to me like missing would indeed be the correct usage here. (And I think missing was created exactly to solve this issue of nothing meaning multiple different things, including missing data.)

NaN on the other hand is different from that: the value is known, it's just not a number. You might know that it's 0 / 0 but well, what are you going to do with that? So just call it NaN. NaN is not even defined for non-float types, so you can't really use it as a reference for what you would want your missing/nothing to be like.
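The differences between the three values can be checked directly in Base Julia. This is only a small demonstration of the semantics discussed above, with made-up variable names.

```julia
a = 1 + missing        # missing propagates through arithmetic
b = 1 + NaN            # NaN is an ordinary floating-point value

# nothing does not propagate: arithmetic on it is a MethodError.
nothing_errors = try
    1 + nothing
    false
catch err
    err isa MethodError
end

ints   = [1, missing]  # element type is a Union of Missing and the integer type
floats = [1, NaN]      # the integer is promoted to a float to make room for NaN
```

The last two lines show the point about non-float types: an integer vector can hold missing and stay integer-valued, while NaN forces the whole vector to Float64.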

@ndinsmore

ndinsmore commented Jan 26, 2019

@karajan9, you will have to forgive me, I got slightly lost in the context of this discussion, i.e. general support for passing missing to the backends. The point I was intending to make is that in cases where NaN would typically be used to signify not its mathematical meaning but a break in the data, Nothing might be another good way to represent that. Is that worth proposing in a different issue?

@karajan9

Ah sorry, yes, I have completely misunderstood you 😄

@mkborregaard
Member Author

mkborregaard commented Jan 27, 2019

Plots uses NaN under the hood, and missing gets converted to NaN. I don't see any problems adding support for converting Nothing to NaN in addition to the existing conversion, as there is no other obvious representation. So I'd be up for merging a PR implementing this.
But let's keep further discussion of this in a new issue, and leave this issue to discuss the technical issues of NaN/Missing support.
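A conversion that treats nothing the same way as missing could be sketched per element as below. `to_plot_value` and `to_plot_values` are hypothetical names, and this is not how Plots implements the conversion internally.

```julia
# Hypothetical helpers; not the Plots implementation.
to_plot_value(x::Real)    = Float64(x)
to_plot_value(::Missing)  = NaN    # existing behaviour described above
to_plot_value(::Nothing)  = NaN    # the proposed addition

# Convert a whole vector into the Float64 form used under the hood.
to_plot_values(v) = Float64[to_plot_value(x) for x in v]

converted = to_plot_values([1, nothing, 3, missing])
```

Using one method per type keeps the two sentinel values separate at the input, even though both end up as NaN in the converted data.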

@astrojhgu

I confirmed this in my Julia 1.0 environment. If I place a missing at the first element of an array, the command Plots.plot([missing, 1., 2., 3.]) gives an error message.

> missing support works well for me, unless the data starts off with one:
> plot([0, missing, 2]) works as expected while
> plot([missing, 1, 2]) won't:
>
> ERROR: TypeError: non-boolean (Missing) used in boolean context
> Stacktrace:
>  [1] macro expansion at /home/karajan/.julia/packages/Plots/rmogG/src/series.jl:185 [inlined]
>  [2] apply_recipe(::Dict{Symbol,Any}, ::Type{Missing}, ::Missing) at /home/karajan/.julia/packages/RecipesBase/Uz5AO/src/RecipesBase.jl:275
>  [3] _apply_type_recipe(::Dict{Symbol,Any}, ::Array{Missing,1}) at /home/karajan/.julia/packages/Plots/rmogG/src/series.jl:197
>  [4] macro expansion at /home/karajan/.julia/packages/Plots/rmogG/src/series.jl:253 [inlined]
>  [5] apply_recipe(::Dict{Symbol,Any}, ::Array{Missing,1}) at /home/karajan/.julia/packages/RecipesBase/Uz5AO/src/RecipesBase.jl:275
>  [6] _process_userrecipes(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol,Any}, ::Tuple{Array{Missing,1}}) at /home/karajan/.julia/packages/Plots/rmogG/src/pipeline.jl:83
>  [7] macro expansion at ./logging.jl:305 [inlined]
>  [8] _plot!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol,Any}, ::Tuple{Array{Missing,1}})at /home/karajan/.julia/packages/Plots/rmogG/src/plot.jl:171
>  [9] #plot#132(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Missing,1}) at /home/karajan/.julia/packages/Plots/rmogG/src/plot.jl:57
>  [10] plot(::Array{Missing,1}) at /home/karajan/.julia/packages/Plots/rmogG/src/plot.jl:51
>  [11] top-level scope at none:0
>
> Any idea what's going on there?
> I'm using Julia 1.0.2 and Plots 0.21.0 with GR 0.36.0

@isentropic
Member

@BeastyBlacksmith I think this issue was taken care of recently with #2770

@BeastyBlacksmith
Member

That PR just touched the documentation. What exactly did it fix?
