Stacked Area Charts #2960

Merged (14 commits) on Sep 7, 2018

alexcjohnson
Collaborator

We held out a long time on this one, but stacked area charts are finally coming to plotly.js.

The API is as discussed in #1217:

  • Provide matching stackgroup attributes to some scatter traces and they become a stack.
  • There are no plot-wide stacking attributes; stack-wide attributes are in the trace definitions, and we'll take a value for each attribute from the first trace in the stack that contains that attribute, visible or not (so the stack doesn't fall apart if you hide the first trace). This is different from, and more powerful than, how we describe bar stacking/grouping - and should be reviewed with an eye toward eventually using a similar framework for bars.
  • The data for all traces in the stack are sorted by position, and gaps in each trace are filled in with either zeros or interpolated values.
  • The one item from #1217 (add _real_ stacked area charts [feature request]) that I did not include here is stackgaps: 'interrupt'. That's going to require some finicky drawing code, particularly if we want to support arbitrary line.shape, so I'll leave it for later. But 'infer zero' (default) and 'interpolate' are included here.
  • Another open item is to improve hover info. What I did here matches stacked bars, but both of them, particularly if you normalize the results, would benefit from more options - normalized vs raw data, (sub)totals.
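
As a rough illustration of the API described above, here is a minimal sketch rather than the plotly.js implementation. Only the attribute names stackgroup and stackgaps and the values 'infer zero' and 'interpolate' come from this PR; the helpers stackAttr and fillGaps, the sample data, and the handling of positions outside a trace's own range are assumptions made for the example.

```javascript
// Three scatter traces share a stackgroup, so they form one stack.
// The first is hidden but still supplies the stack-wide stackgaps value.
const data = [
  {type: 'scatter', x: [1, 2, 3], y: [9, 9, 9], stackgroup: 'one',
   stackgaps: 'interpolate', visible: 'legendonly'},
  {type: 'scatter', x: [1, 2, 3], y: [2, 1, 3], stackgroup: 'one'},
  {type: 'scatter', x: [1, 3], y: [1, 2], stackgroup: 'one'} // gap at x = 2
];

// Hypothetical helper: a stack-wide attribute takes its value from the
// first trace in the stack that sets it, whether or not it is visible.
function stackAttr(traces, group, attr, dflt) {
  for (const trace of traces) {
    if (trace.stackgroup === group && trace[attr] !== undefined) return trace[attr];
  }
  return dflt;
}

// Hypothetical helper: give one trace a value at every shared position in
// xs, filling gaps with zeros or with a linear interpolation.
function fillGaps(trace, xs, stackgaps) {
  const pts = trace.x.map((x, i) => [x, trace.y[i]]).sort((a, b) => a[0] - b[0]);
  return xs.map((x) => {
    const hit = pts.find((p) => p[0] === x);
    if (hit) return hit[1];
    if (stackgaps !== 'interpolate') return 0; // 'infer zero'
    const before = pts.filter((p) => p[0] < x).pop();
    const after = pts.find((p) => p[0] > x);
    // Assumption of this sketch: outside the trace's own range, use 0.
    if (!before || !after) return 0;
    const t = (x - before[0]) / (after[0] - before[0]);
    return before[1] + t * (after[1] - before[1]);
  });
}

const gaps = stackAttr(data, 'one', 'stackgaps', 'infer zero');
const xs = [1, 2, 3];
// Stack only the two visible traces: the second on top of the first.
const stackedTop = fillGaps(data[2], xs, gaps).map((y, i) => y + data[1].y[i]);
```

With 'interpolate' the gap at x = 2 is filled from its neighbours; with 'infer zero' the same point would drop to the top of the trace below it.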

In order to make it work well in various edge cases I made a number of preparatory changes:

  • Lib.sort c87ccb3 wraps the built-in Array.sort with a check for whether the array is already perfectly sorted (or perfectly reversed). For arrays of length 1e5+ that can be a 10x or better speedup when the array is already sorted, and it should carry very little penalty for unsorted arrays. For stacked area I expect the data will already be sorted the vast majority of the time, which is why I implemented this now. This is the only place I used it, but I bet there are other places it would be useful as well.
  • Some edge case improvements in autorange 1f4898c - I changed a few baseline images (and one mock), I hope you'll agree these were actually incorrect before.
  • Better ordering of hover labels when traces have matching data (such as in a bar or area stack when one trace is zero) - try to preserve the stacking order 2fde3dc
  • Continue lines off the edge toward invalid log values 68b489d - I think I hadn't done this before (for scatter) out of caution: lest we draw something misleading, I opted to just not draw the line at all. But particularly with fills, and even more so with stacked fills, that gets confusing and misleading, as the fills would just connect across the missing point(s). I opted to draw these lines straight toward the edge if one dimension goes invalid (since in principle they're going infinitely far away), or at a slope of 1 on a log/log plot if both dimensions go invalid simultaneously. Note that there are cases here where a separate point will move across these lines if you flip between linear and log axes, but that was already possible with finite data; this is just an extreme case of the same. (Note: the axes_range_type baseline change belongs in this commit, but I put it in the autorange commit instead.)
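
The pre-check idea behind the Lib.sort item can be sketched as follows. This is an illustration of the technique only; the name sortWithCheck is hypothetical and the code is not claimed to match commit c87ccb3.

```javascript
// Sort in place, but first scan for input that is already sorted or
// perfectly reversed, so the built-in sort runs only when needed.
function sortWithCheck(array, sortFn) {
  let notOrdered = false;
  let notReversed = false;
  for (let i = 1; i < array.length; i++) {
    const pairOrder = sortFn(array[i], array[i - 1]);
    if (pairOrder < 0) notOrdered = true;        // this pair is descending
    else if (pairOrder > 0) notReversed = true;  // this pair is ascending
    // Both kinds of pair seen: give up and use the built-in sort.
    if (notOrdered && notReversed) return array.sort(sortFn);
  }
  // No descending pair: already sorted. Otherwise perfectly reversed.
  return notOrdered ? array.reverse() : array;
}

const ascending = (a, b) => a - b;
const alreadySorted = sortWithCheck([1, 2, 3, 4], ascending);
const wasReversed = sortWithCheck([4, 3, 2, 1], ascending);
const wasShuffled = sortWithCheck([2, 4, 1, 3], ascending);
```

For sorted or reversed input the cost is a single linear pass of comparisons; unsorted input pays at most that pass before falling through to Array.sort.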

cc @etpinard @antoinerg @nicolaskruchten

@@ -75,7 +75,6 @@
{
"x": [1.5],
"y": [1.25],
"fill": "tonexty",
Collaborator Author

Oh this change I think is actually required due to a change in the stacked area commit be38e93#diff-33c02cd37e7a4c951059a3c93221ac4eR175 - we were accidentally treating a length-1 trace as filling to itself (since its start and end points are the same!) but we shouldn't do that... therefore this trace, since it's the first on its subplot, should interpret 'tonexty' as 'tozeroy'.

var subplotAndType = trace.xaxis + trace.yaxis + trace.type;
var firstScatter = fullLayout._firstScatter;
if(!firstScatter[subplotAndType]) firstScatter[subplotAndType] = trace.uid;
}
Collaborator Author

In fact, scatter_fill_corner_cases top subplots were also prevented from filling to zero with 'tonexty' because only one subplot could have the "first scatter" trace on it. This commit 🔪 gd.firstscatter and replaces it with one trace (uid) per subplot, attached to fullLayout. The stacked area mocks 🔒 this.

Contributor

Nice. This is probably the most underrated piece in this PR. I always found that gd.firstscatter less-than-ideal. This is a welcome improvement.
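
The per-subplot bookkeeping discussed in this thread can be sketched roughly like this. It is a simplified illustration, not the plotly.js code: resolveFills is a hypothetical function, and only the subplotAndType key and the 'tonexty' to 'tozeroy' fallback are taken from the comments above.

```javascript
// Record the first scatter trace (by uid) on each subplot, and treat
// 'tonexty' on that trace as 'tozeroy' since it has no earlier trace
// on its subplot to fill to.
function resolveFills(traces) {
  const firstScatter = {}; // one entry per subplot + trace type
  return traces.map((trace) => {
    const subplotAndType = trace.xaxis + trace.yaxis + trace.type;
    if (!firstScatter[subplotAndType]) firstScatter[subplotAndType] = trace.uid;
    const isFirst = firstScatter[subplotAndType] === trace.uid;
    return isFirst && trace.fill === 'tonexty' ? 'tozeroy' : trace.fill;
  });
}

const fills = resolveFills([
  {uid: 'a', type: 'scatter', xaxis: 'x', yaxis: 'y', fill: 'tonexty'},
  {uid: 'b', type: 'scatter', xaxis: 'x', yaxis: 'y', fill: 'tonexty'},
  {uid: 'c', type: 'scatter', xaxis: 'x2', yaxis: 'y2', fill: 'tonexty'}
]);
```

Here traces 'a' and 'c' are each first on their own subplot, so both fall back to filling to zero, which a single plot-wide "first scatter" marker could not express.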

if(cd.length !== serieslen) {
// TODO: verify this never happens and remove
throw new Error('length mismatch!');
}
Collaborator Author

Oops, missed this one... well, when I test ^^ I'll have plenty of confidence to remove it 😄

Collaborator Author

🔪 in 8547cf8

// if we're stacking, "infer zero" gap mode gets markers in the
// gap points - because we've inferred a zero there - but other
// modes (currently "interpolate", later "interrupt" hopefully)
// we don't draw generated markers
Collaborator Author

@etpinard @nicolaskruchten do you agree with this choice? It only applies to points we generate in one trace to match the positions from another trace - those are the "gap points"

Contributor

> do you agree with this choice?

+1 for me.

Contributor

Yup.

alexcjohnson added the labels "bug something broken", "feature something new", and "status: reviewable" on Sep 1, 2018
@etpinard left a comment
Contributor

Great PR!

I hope implementing those per-trace stack* attributes wasn't too much of a headache. The two stackgaps modes are looking great. 📈

Most of my comments are simply comments, with the exception of:

  • I don't think we need that alwaysSupplyDefaults trace module category
  • mutating gd.calcdata[i][j].i isn't great.
  • is that hacky fill default logic really necessary?
  • what do you think about adding a 'stack' flag to scatter mode?

@nicolaskruchten
Contributor

So from a high-level API standpoint, why do we want stackgroup again? Is it just so as to match a potential future equivalent for bar? Because as a standalone API it's kind of ungainly, and I can't imagine a use-case for having some areas stacked and some not in the same plot... ?

@alexcjohnson
Collaborator Author

> So from a high-level API standpoint, why do we want stackgroup again? Is it just so as to match a potential future equivalent for bar? Because as a standalone API it's kind of ungainly, and I can't imagine a use-case for having some areas stacked and some not in the same plot... ?

It's unusual for sure, but I wouldn't want to rule it out. What if you have one stack series for data and another for fits? Then one stack would need its fill removed, since they're overlapping. Or prediction/extrapolation: these might not overlap, but you might still want different styling for corresponding items in each stack. Or two back-to-back stacks, like those plots that have male on one side and female on the other, with the axis in the middle (we could manage that one with an analog of barmode: 'relative', or perhaps even better, two axes with a constraint so you don't need to flip your data... but you see the point).

> What do you think about adding a 'stack' flag to scatter mode to make it easier to toggle stacked areas on and off?

mode is otherwise all about how to draw the series, not where to draw it... and the one bit of how that stacking impacts (fill) isn't even part of mode.

But, perhaps both of these concerns could be assuaged by making a boolean stack attribute, then giving stackgroup a default value but only coercing it when stack is true? (and for completeness, if you provide only a stackgroup let stack default to true). That way the usual behavior would be to just use the boolean but the full flexibility would still be available (if perhaps buried in the UI)

@nicolaskruchten
Contributor

OK, I'll buy the "back to back stacks" argument :)

Could we make sure the documentation clearly explains whether or not stack normalization applies across or within subplots please? I don't know the answer but I'd like to and I think we should canonicalize it in the docs!

@nicolaskruchten
Contributor

I think we can live without an extra stack attribute personally :)

@alexcjohnson
Collaborator Author

> Could we make sure the documentation clearly explains whether or not stack normalization applies across or within subplots please? I don't know the answer but I'd like to and I think we should canonicalize it in the docs!

Within subplots (and within stack groups, if there are multiple on one subplot) -> 00d7d22
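
To make that scoping concrete, here is a small sketch of normalizing per subplot and per stack group. It is an assumption-laden illustration, not the code from 00d7d22: normalizeStacks is a hypothetical function, it normalizes to fractions of the total, it assumes all traces in a group already share the same positions, and it maps a zero total to 0.

```javascript
// Normalize each trace's values to a fraction of its stack's total, where
// a stack is identified by subplot (xaxis + yaxis) plus stackgroup.
function normalizeStacks(traces) {
  const keyOf = (t) => t.xaxis + t.yaxis + ':' + t.stackgroup;
  const totals = {};
  for (const t of traces) {
    const key = keyOf(t);
    if (!totals[key]) totals[key] = t.y.map(() => 0);
    t.y.forEach((v, i) => { totals[key][i] += v; });
  }
  return traces.map((t) => t.y.map((v, i) => {
    const total = totals[keyOf(t)][i];
    return total === 0 ? 0 : v / total; // zero total: assumption, emit 0
  }));
}

const normalized = normalizeStacks([
  {xaxis: 'x', yaxis: 'y', stackgroup: 'a', y: [1, 1]},
  {xaxis: 'x', yaxis: 'y', stackgroup: 'a', y: [3, 1]},
  // Same group name on another subplot: normalized on its own.
  {xaxis: 'x2', yaxis: 'y2', stackgroup: 'a', y: [5, 5]}
]);
```

The third trace reuses the group name 'a' but sits on a different subplot, so it normalizes against itself rather than against the first two.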

etpinard added this to the v1.41.0 milestone on Sep 6, 2018
@etpinard
Contributor

etpinard commented Sep 6, 2018

Down to 2️⃣ unresolved comments:

@etpinard
Contributor

etpinard commented Sep 7, 2018

Nicely done 💃
