Multiple Axes #373

Open
dcjones opened this issue Jul 25, 2014 · 12 comments
Comments

@dcjones
Collaborator

dcjones commented Jul 25, 2014

I'm going to create an issue for this and leave it open, since this comes up periodically, and I want to have a conspicuous response.

I don't like plots with multiple axes. They're almost always terrible. Seriously, just do a google image search for “multiple axes”. You'll see some of the worst, most incomprehensible plots ever drawn. Stuff like this:

[image: multiple-y-axes-1]

That said, there are a handful of arguably legitimate uses, for example labeling a temperature scale in Fahrenheit and Celsius. So I'm not opposed to adding this feature, but there are about a thousand more important things I need to do to make Gadfly great, and for me, this is very near the bottom of the list.

If someone wants to implement this, I can describe how to do it and review their PR, but I'm probably not going to do it myself within the next few years.

@mzalam

mzalam commented Jul 25, 2014

I understand. btw where did you find this horrendous plot? What was your google keyword?

@timholy
Collaborator

timholy commented Jul 25, 2014

That is quite a masterpiece. Brightened my morning. Thanks for sharing!

@tomasaschan
Contributor

What would be the right way to implement this in Gadfly? I'm probably not the most capable person to do this, but since I'd like to be able to use the feature, I guess I'm up ;)

I have a very limited understanding (read: only slightly more than your cat) of how the inner workings of Gadfly are structured, so I'd probably require quite a lot of guidance to help with this. Thus, feel free to ignore my offer and wait for someone more capable...

(And yes, that really was a hideous plot. I promise not to produce anything remotely similar if/when this feature lands!)

@dcjones
Collaborator Author

dcjones commented Sep 29, 2014

Probably the right way to do it is to allow per-layer scales and coordinates. Currently scales are applied only at the plot level, so while layers can have their own statistics and geometry, they must share one scale. Conceptually it's simple, but it may involve rearranging a lot of stuff.

My other concern is per-layer scales can easily lead to garbage plots (like the one I posted). It's not Gadfly's job to prevent bad plots per se, but I'd like to find a way to make sure this isn't done accidentally. E.g. I could imagine someone innocently doing this

plot(layer(..., Scale.x_continuous), layer(..., Scale.x_continuous))

and being surprised by the result.

@dcjones dcjones added the easy label Dec 23, 2014
@dcjones
Collaborator Author

dcjones commented Dec 23, 2014

I was thinking about this more lately. I've seen some plots with multiple axes that I actually quite like. Consider this one from "Capital in the Twenty-First Century" (forgive the shitty cell-phone photo).

[image: multiple-axis]

This works because the two axes are just different ways of measuring the same thing, and there is a linear relationship between the two. It doesn't change the interpretation of the plot geometry. It also improves the plot, making it intuitive to both Americans and Europeans.

In contrast, the multiple axes plots I hate are the ones in which the plot geometry no longer has a single interpretation. Here's something fairly typical.

[image: multiple-axes-2]

I think of plotting as converting data into a language native to human brains: visual patterns. These plots are bad, or at least sub-optimal, because they introduce a lot of visual patterns that are meaningless. The "defective" line crosses the "cost" and "output" lines several times. That looks interesting, but has absolutely no meaning, since they measure different things and their relative positions on the y-axis are arbitrary. By contrast, the fact that France's minimum wage surpassed the US's in '84 looks interesting and is interesting.

There's this great bit in Howard Wainer's introduction to the 2010 edition of "Semiology of Graphics" (which is sort of a precursor/inspiration to "The Grammar of Graphics").

Thirty years ago I was enthusiastic and optimistic about the future of graphical use, for, I thought, software will be built that has sensible default options, so that when the software is set on maximal stupidity (ours not its), a reasonable graph would result. The software would force you to wring its metaphorical neck if you wanted to produce some horrible pseudo 3-D multicolored pie chart. Alas, I couldn't have been more wrong. Instead of making wise, evidence-based, choices, default options seem to have been selected by the same folks who deny the holocaust, global warming and evolution. I could not have imagined back then that the revolution in data gathering, analysis and display that was taking place in the last decades of the 20th century would have resulted in the complexity of the modern world being conveyed in bullet-points augmented by PowerPoint and Excel graphics.

The right thing to do here is neither to ban multiple-axes plots nor to begrudgingly implement them, but to articulate the dichotomy between meaningful multiple-axes plots and garbage ones, then make the former easy to draw and the latter possible only with ample neck wringing. So, here's a proposal:

Plots will always have a native unit. Additional axes can be added, but only by specifying a conversion from the original units. So the first example in Gadfly would be drawn like:

eur_to_usd = 1.32

plot(x=year, y=minimum_wage_in_euros, color=country,
     Geom.point, Geom.line, Guide.y_axis(euros -> eur_to_usd * euros))

This is pretty easy to implement. It will also still allow (what I think are) garbage multiple-axes plots, but forces the user to endure the psychological trauma of passing a meaningless conversion function to Guide.y_axis. Good multiple-axes plots have a very meaningful conversion function: like the one converting 2010 euros to dollars.
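
The proposal above amounts to deriving the second axis from the first instead of giving it a scale of its own. A minimal sketch of that idea in plain Julia, independent of Gadfly, might look like the following. `secondary_labels` is a hypothetical helper (it is not part of Gadfly), and the tick positions are placeholders chosen only for illustration.

```julia
# Hypothetical helper (not a Gadfly function): label a secondary axis by
# pushing the primary axis's tick positions through a conversion function.
# The ticks stay where they are; only the printed labels change.
function secondary_labels(primary_ticks, convert_unit; digits=2)
    return [string(round(convert_unit(t); digits=digits)) for t in primary_ticks]
end

eur_to_usd = 1.32                        # illustrative rate from the example above
euro_ticks = [0.0, 2.0, 4.0, 6.0, 8.0]   # placeholder tick positions in the native unit

# Labels for the secondary (dollar) axis, one per native (euro) tick.
usd_labels = secondary_labels(euro_ticks, euros -> eur_to_usd * euros)
```

Because the secondary labels are computed from the primary ticks, the plot geometry keeps a single interpretation, which is the property the proposal is trying to protect.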

TLDR: On second thought, I should add multiple axes plots.

@tomasaschan
Contributor

Your API example lacks one important aspect of these plots: the raw data for the minimum wage plot would likely be available in € for the French wages and in $ for the American ones. It would therefore be a lot easier to do this if one also specified which data set goes on which axis, as well as a conversion factor between them. Maybe something like

eur_to_usd = 1.32

plot(min_wages, x=:year, y1=:france, y2=:usa,
     Geom.point, Geom.line, Guide.y_relation(y2 = y1 -> eur_to_usd * y1))

Now it's well-defined which axis (left or right, y1 or y2) corresponds to which data, which also makes it clearer how the function mapping one to the other should be specified. (If we switch out the US data for England, would the order be preserved? If so, why? If not, the anonymous function euros->usd is just black magic...)
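
To make the direction of the mapping concrete, here is a small plain-Julia sketch, independent of Gadfly. The function names `y1_to_y2` and `y2_to_y1` are hypothetical, and the numbers are placeholders, not real wage data. It assumes the relation is invertible, which holds for a constant exchange rate.

```julia
# Hypothetical illustration, not a Gadfly API: two series recorded in
# different units, plus the relation that ties the right axis to the left.
eur_to_usd = 1.32                  # illustrative rate from the thread

y1_to_y2(y1) = eur_to_usd * y1     # left axis (EUR) -> right axis (USD)
y2_to_y1(y2) = y2 / eur_to_usd     # inverse, needed to place USD data

# Placeholder values, not real wage data.
france_eur = [5.0, 6.0, 7.0]
usa_usd    = [6.6, 6.6, 7.92]

# Both series expressed in the left axis's unit, so they share one geometry.
france_on_left = france_eur
usa_on_left    = y2_to_y1.(usa_usd)
```

Naming which series belongs to which axis is what makes the inverse well-defined: data given in the right-hand unit has to be mapped back through `y2_to_y1` before it can be drawn against the left axis.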

@st33v3

st33v3 commented Sep 4, 2015

I generally agree that multiple scales (axes) are visually problematic. I would be against adding scales to layers, and would propose a different approach. In some cases it may still be useful to present two "categories" on one axis, but just two, because a plot has two vertical and two horizontal edges.

For example, consider data from stress testing a web application. One dataset corresponds to the number of virtual users (VU) visiting the application and another to the application's response time. The shape (or envelope) of the response times usually follows the number of virtual users, so it is useful to display the VU count together with the response times. Obviously, VU count and time have different scales and units.

For that case I would recommend introducing the notion of secondary (minor) axes (both x and y) that have their own Scale (Scale.x2_continuous, ...) and aesthetics, e.g.

plot(dataset, x="Month",  y2="Approve", Scale.x2_continuous)

Labels and guides would be configured separately for the x and x2 axes. I believe that this approach is more Gadfly-like.

@iq2luc

iq2luc commented Dec 19, 2018

Sometimes multiple Y axes (well... two) are important and very useful. For example, looking at the attached plot one can easily tell whether the load is capacitive or inductive (the source data is from a real capture; in this case the load is capacitive because the current leads the voltage).
[image: vi-phase1]

@ph-pi

ph-pi commented Aug 29, 2019

Double axes (and even more) are often used with time series. iq2luc gives a pretty good example.
Financial time series also produce a lot of indicators that are zero-based or significant when crossing zero.

At the moment, I use vstack, but it is not easy to compare precisely 2 or more plots.

I hope this feature will be available soon...

@dehann

dehann commented May 3, 2020

and I want to have conspicuous response

Hi, just ran into this too and would also like to be able to do dual axis plots. Perhaps the comment at least helps build support to add this...

It's not Gadfly's job to prevent bad plots

For the case we are looking at now, it is sensible to have "non-scalable" values plotted on the same x-axis but on left and right y-axes, for example a population time series together with environmental temperature.

@Mattriks Mattriks mentioned this issue May 9, 2020
5 tasks
@henriquebecker91

I would like to give another example, as a case against the limited notion that multiple axes are only "good" if there is a linear relationship between the two (or more) y-measurements. I would like to create a plot that has both the CPU clock and temperature. The relationship between them is not guaranteed to be linear, even when they are not capped at their maximum values, and surely the most interesting thing to observe is whether, once the temperature reaches a plateau, the clock stays stable or starts to degrade because of thermal throttling.
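
For two unrelated quantities such as clock and temperature, there is no unit conversion to pass in, so a dual-axis plot has to rely on some chosen mapping between the two ranges. One possible choice, sketched below in plain Julia and independent of Gadfly, is an affine map from one series' range onto the other's. `range_map` is a hypothetical helper, the readings are placeholders rather than measured data, and the sketch assumes neither series is constant.

```julia
# Hypothetical helper, not a Gadfly function: build the affine map that sends
# the range of `src` onto the range of `dst`, together with its inverse.
function range_map(src, dst)
    s_lo, s_hi = extrema(src)
    d_lo, d_hi = extrema(dst)
    scale = (d_hi - d_lo) / (s_hi - s_lo)   # assumes src is not constant
    forward(v) = d_lo + (v - s_lo) * scale
    inverse(v) = s_lo + (v - d_lo) / scale
    return forward, inverse
end

# Placeholder readings, not measured data.
clock_mhz = [800.0, 2400.0, 3600.0, 3200.0]
temp_c    = [40.0, 65.0, 90.0, 90.0]

to_clock, to_temp = range_map(temp_c, clock_mhz)

# Temperature drawn against the left (clock) axis.
temp_on_clock_axis = to_clock.(temp_c)

# Left-axis tick positions relabeled in degrees C for the right axis.
right_axis_labels = to_temp.([800.0, 2200.0, 3600.0])
```

The mapping here is fixed by the data ranges rather than by any physical relationship, so where the two curves cross carries no meaning; what remains readable is the shape of each curve over time, such as a temperature plateau alongside a falling clock.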

@BASS10

BASS10 commented Feb 9, 2022

Wow, I'm surprised that this isn't implemented. I'm even more surprised by the hostile attitude many people seem to have towards this. I can make an ugly, misleading plot with one axis just as easily as I can make a nice one with 2 axes. That's an hour of my life I'll never get back. Later, Gadfly.

10 participants