Support non-Float64 number types #77

Closed
Keno opened this issue Sep 19, 2013 · 12 comments

Comments

Keno commented Sep 19, 2013

It seems to me that sometimes Gadfly automatically scales the plot (e.g. loglog or something like that), but the y-axis label still says f(x). When such an automatic transformation is applied and default labels are used, that should be fixed automatically.

Relatedly, scale transforms currently expect the number to have a conversion to Float64. It would be nice to get rid of that, as the rest of Gadfly is already blissfully ignorant of units, allowing things like

plot(h -> m*g*h, 0Meter, 12Meter)

using my SIUnits package. However, this doesn't work if a scale transform (even the identity transform) is being applied.

EDIT: Title updated to reflect real issue (see below).

Keno commented Sep 21, 2013

It occurred to me that you probably have no idea what I'm talking about, and I actually made a mistake in my analysis, so let me try to explain what I did. I wanted to plot the following:
[screenshot: 2013-09-20 at 11:44:40 pm]
The data is wrapped in a custom data type that keeps track of units. Note that the tick marks on the y axis are not linear. (I originally thought the function returned a Float64, but in fact it returned one wrapped in a type of unit 1, hence the confusion.) So I guess this issue is basically a request to be able to do basic plotting with arbitrary number types.

Now, another issue that came up in the course of testing this is that for small values the y axis is a little messed up:
[screenshot: 2013-09-20 at 11:46:36 pm]

If I multiply by 10^9 this is what it looks like:
[screenshot: 2013-09-20 at 11:46:43 pm]

Sorry for being so imprecise in the original issue, @dcjones. I was in a bit of a hurry trying to get the write-up finished that the graphs were for.

dcjones commented Sep 21, 2013

Ok, I see what you're getting at, thanks for clarifying.

It doesn't recognize your custom number type as a number and treats it as categorical data, hence the messed-up axis. Would it be sufficient to support custom number types by converting them to Float64 internally, or are you looking for something more sophisticated?

The rounding of small numbers to zero is clearly a bug. Should be easy to fix though.

Keno commented Sep 21, 2013

I would prefer not to convert everything to Float64, since I would like the axis labels to include units (i.e. displaying Units). Would it be possible to have some sort of <: Number check? If you do want to convert to Float64, I ask that you do it via the float64 method rather than convert(Float64, x), since I don't define an implicit convert method.

dcjones commented Sep 21, 2013

Now that I think of it, converting to Float64 is also a dead end if I want to be able to correctly plot e.g. BigFloat numbers. It will take some work to make the change the right way, but I'll see what I can do.

Keno commented Sep 21, 2013

Good point.

dcjones commented Oct 13, 2013

I've just pushed a bunch of changes to Compose and Gadfly that should make this possible. It should allow plotting any type as a number as long as a handful of functions are defined for that type.

Here's the example I was playing with.

using Gadfly

import Base: show, +, -, /, *, isless, one, zero, isfinite

# A type I'd like to be able to plot
immutable Percent
    value::Float64
end

# Functions necessary for plotting

+(a::Percent, b::Percent) = Percent(a.value + b.value)
-(a::Percent, b::Percent) = Percent(a.value - b.value)
-(a::Percent) = Percent(-a.value)
*(a::Percent, b::Float64) = Percent(a.value * b)
*(a::Float64, b::Percent) = Percent(a * b.value)

# Must return something that can be converted to Float64 with float64(a/b)
/(a::Percent, b::Percent) = a.value / b.value

isless(a::Percent, b::Percent) = isless(a.value, b.value)
one(::Type{Percent}) = Percent(0.01)
zero(::Type{Percent}) = Percent(0.0)
isfinite(a::Percent) = isfinite(a.value)
show(io::IO, p::Percent) = print(io, round(p.value, 4), "%")

y=[Percent(0.1), Percent(0.2), Percent(0.3)]
plot(x=collect(1:length(y)), y=y)

[image: Percent]

Let me know if this works for what you were trying to do, or needs some tweaking.

Keno commented Nov 3, 2013

Sorry for not getting back to you on this earlier. This works great now. Here's a fun example:
http://nbviewer.ipython.org/7294889

The "just works" on that one was pretty great!

Keno commented Nov 13, 2013

I have found one problem with this, but I'm not quite sure what the solution is. Note the printing of 0.3. What do you do in Gadfly to get around that?

[screenshot: 2013-11-12 at 10:59:57 pm]

Keno commented Nov 13, 2013

Also while we're on the above plot example, is there any way to specify that more concisely (e.g. y=["vo1","vo2","in","in2"]). Also, is there any way I can choose the line color on a per-layer basis? I'd like the first and the third layer to have the same color if possible (and equivalently the second and fourth). Or should I just somehow rearrange the dataframe to make this work (I guess, I could consolidate all the columns into one, adding a column keeping track of where it came from and then basing the color attribute of that, but that seems a little cumbersome)?

dcjones commented Nov 13, 2013

For regular floating point numbers, there is a function (in format.jl) that tries to print numbers uniformly and at a reasonable precision. I'm not sure how to extend that to other types though, so they just get "string" called on them.

The only solution now is to import Gadfly.formatter and define your own formatter(values::T...; fmt=nothing) function for type T that returns a function f(x::T) that returns a string. (The idea is that formatter "trains" a function for formatting values). Not so pretty...
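
A minimal sketch of the "training" idea, for illustration only. It is written in current Julia syntax (struct rather than immutable) and does not depend on Gadfly: the formatter defined here is a local stand-in with the shape described above, not Gadfly.formatter itself, and the precision heuristic (at most 6 decimal places) is an assumption. To hook into Gadfly as described, the method would be added to the imported Gadfly.formatter instead.

```julia
# Stand-in number type, mirroring the Percent example earlier in the thread.
struct Percent
    value::Float64
end

# Local stand-in for Gadfly.formatter (an assumption, see the note above).
# It looks at all values once and returns a closure that formats every
# value with the same number of decimal places.
function formatter(values::Percent...; fmt=nothing)
    # Smallest number of decimal places (capped at 6) that reproduces
    # every value exactly; floating-point noise hits the cap.
    places = 0
    while places < 6 && any(v -> round(v.value; digits=places) != v.value, values)
        places += 1
    end
    return (x::Percent) -> string(round(x.value; digits=places), "%")
end

data = [Percent(0.1), Percent(0.2), Percent(0.1 + 0.2)]
f = formatter(data...)
labels = map(f, data)
```

With this sketch, a value such as Percent(0.1 + 0.2) is labeled with the rounded value rather than with every digit of the underlying Float64.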

> Also while we're on the above plot example, is there any way to specify that more concisely (e.g. y=["vo1","vo2","in","in2"]).

Not really. I was thinking of introducing something like that, but probably with tuples to avoid ambiguity with plotting arrays directly.

> I guess, I could consolidate all the columns into one, adding a column keeping track of where it came from and then basing the color attribute of that, but that seems a little cumbersome

That's basically how you have to do it now. I'm seeing the need to make that more convenient. I could imagine this being written like:

plot(datadm,
     x="in",
     y=("vol1", "vol2", "in", "in2"),
     color=(["A"], ["B"], ["C"], ["D"]),
     Geom.line)

Then letting plot handle all the concatenating and such.
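
For comparison, the manual consolidation being discussed (stack the y columns into one and add a column recording where each value came from) could be sketched by hand roughly as follows. This uses plain Julia collections in current syntax rather than a DataFrame; the to_long helper and the sample values are hypothetical, made up for illustration.

```julia
# Hypothetical wide data: one x column ("in") and two y columns.
wide = Dict(
    "in"  => [1.0, 2.0, 3.0],
    "vo1" => [0.1, 0.2, 0.3],
    "vo2" => [0.4, 0.5, 0.6],
)

# Stack the chosen y columns into one long y vector, repeating x for each
# column and recording the source column name alongside every value.
function to_long(wide, xcol, ycols)
    x = Float64[]
    y = Float64[]
    series = String[]
    for name in ycols
        append!(x, wide[xcol])
        append!(y, wide[name])
        append!(series, fill(name, length(wide[name])))
    end
    return x, y, series
end

x, y, series = to_long(wide, "in", ["vo1", "vo2"])
```

The three vectors can then be passed as aesthetics, e.g. plot(x=x, y=y, color=series, Geom.line), so that each source column gets its own color.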

dcjones commented Nov 13, 2013

I was going to say that transforming the data frame to be suitable for plot is made easier with the melt function in DataFrames, but I just tried that and it went into an infinite loop.

dcjones commented Jun 21, 2014

I think the main thing remaining here was making formatting of non-float types easier. I've forked that into a new issue (#329) and am closing this one.
