add _real_ stacked area charts [feature request] #1217
Comments
What are the timescales for this requested feature? This is one reason why I've turned to Highcharts, which does this perfectly. |
Been monitoring this one for a long time as well, hoping to see some progress. All the workarounds I've seen are way too buggy to use in any production environment. |
Here's a rundown of examples that show a reasonably well-specified way of dealing with misaligned |
/cc @alexcjohnson Excel: Google Sheets: Two different approaches to the drawing of lines where there is no data... i.e. in the second chart: Google Sheets draws a red line but Excel does not. Probably worth thinking carefully about the tradeoffs there, and ditto about what to show in hovers. |
I don't think there's actually a difference between the two: Excel isn't drawing lines at all, only fills, so there's nothing to omit, but presumably you can turn lines on and then I bet they would be drawn the same way as Google's. As @nicolaskruchten alludes to, the question of what to do with mismatched Google sheets (and Excel, at least by default, I haven't looked in detail) has a simpler data model than we do: every series shares the same Seems to me when we stack area charts we can internally fill in missing x values across all the stacked traces, and then there are perhaps three ways you might want to interpret gaps:
So if the second and third cases are covered by I guess I can imagine cases where that would be the "most correct" way to display the data, though it might be more complexity than users really want. On the other hand making a new setting for this would allow us to avoid turning |
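To make the "fill in missing x values across all the stacked traces" idea concrete, here is a minimal sketch, not the plotly.js implementation. `alignAndStack` is a hypothetical helper. The `stackgaps` values 'infer zero' and 'interpolate' are borrowed from the API proposed later in this thread. Numeric x values, unique x values within each trace, and the fall-back to zero outside a trace's own x range are assumptions made only for this example.

```javascript
// Hypothetical helper, NOT plotly.js internals: align stacked traces on the
// union of their x values, fill the holes, then accumulate the stack.
function alignAndStack(traces) {
  // Sorted union of every x value that appears in any trace.
  const xs = [...new Set(traces.flatMap(t => t.x))].sort((a, b) => a - b);

  let running = xs.map(() => 0);
  return traces.map(trace => {
    // This trace's own points, sorted by x (assumed unique within the trace).
    const pts = trace.x.map((x, i) => [x, trace.y[i]]).sort((a, b) => a[0] - b[0]);
    const known = new Map(pts);

    const filled = xs.map(x => {
      if (known.has(x)) return known.get(x);
      if (trace.stackgaps === 'interpolate') {
        const left = [...pts].reverse().find(p => p[0] < x);
        const right = pts.find(p => p[0] > x);
        if (left && right) {
          // Linear interpolation between the two neighbouring known points.
          const t = (x - left[0]) / (right[0] - left[0]);
          return left[1] + t * (right[1] - left[1]);
        }
      }
      // 'infer zero', or (assumption) nothing to interpolate between.
      return 0;
    });

    running = running.map((base, i) => base + filled[i]);
    return { x: xs, y: filled, stackedY: running };
  });
}

const stacked = alignAndStack([
  { x: [1, 2, 3], y: [1, 1, 1], stackgaps: 'infer zero' },
  { x: [1, 3], y: [2, 4], stackgaps: 'interpolate' } // no point at x = 2
]);
console.log(stacked[1].y);        // [2, 3, 4]
console.log(stacked[1].stackedY); // [3, 4, 5]
```

With 'infer zero' on the second trace instead, its filled y would be [2, 0, 4] and the top of the stack would dip to 1 at x = 2, which is the visual difference being debated here.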
I can't think of any other scenario. Thanks for writing those down in detail 👌
To my eyes, adding a new attribute for this is a no-brainer 👍 Booleans-plus-a-string aren't great, but also any enumerated attribute that has one or multiple values that can have no effect depending on other things in that trace should be avoided when possible. I was going to suggest
That said, could this new attribute help us alleviate some current less-than-ideal fill problems? There are many open issues about this: #1132, #1867, #113, #1205 and possibly others.
As an aside, this problem of mismatching x in stacked area charts appears very plotly-specific. Both MATLAB and mpl assume the same independent coordinates for all their stacked area "y" arrays. |
Writing down some questions I have about stacked area hover:
So, perhaps we could add new |
mmm, there's some interaction between this issue and some of those - particularly #1132 and #1205 - but I think those are pretty much all implementation issues, not problems of specifying the desired behavior.
True, that's because both of those create all stacked lines in a single function call, so conceptually as a single object. We could in principle do the same, treating the entire stack as a single trace, but we can do better than that. I've certainly encountered plenty of situations like the population example I described, where I wanted to add up data that didn't come with matching x values, and I'd have loved it if this just worked ™️
I think the
😍 |
I have a concern about |
I'm a big fan of this. Adding flags
I agree 100% here. Moreover, we could add |
right - an "orphan point" we've called that in the past - it doesn't make a line segment either, so if you don't show markers you won't see anything. |
Yeah issues with orphan points go way back -> https://github.com/plotly/streambed/issues/2577 |
OK so what happens when you have a stack of areas with an orphan point in the middle (and you can't reorder because, say, they all have misaligned orphaned points)? |
If you're not filling gaps (with zeros or interpolations), then anything above a gap gets discarded - that's what I meant by "probably we'd want all gaps to propagate upward to the top of the stack." |
Wow, that seems... draconian. So much so that I'm not sure anyone would really want to use it? |
This wouldn't be the default for gaps introduced by the stacking process - the default would match gsheets and excel and fill with zeros, which if I'm interpreting your party/province plot right is probably what you'd want to have there, right? Missing items are not unknown data, they're cases of zero count. But I can certainly imagine doing an analysis and not wanting to make any assumptions about missing data, especially if that missing data is explicit in the data as an x with no/invalid y. And really the only way to do that is to throw out the unstackable data. |
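As a rough illustration of "gaps propagate upward to the top of the stack", here is a small sketch. `stackWithGaps` is a hypothetical helper, and it assumes the traces have already been aligned on the same x values, with `null` marking a missing y. It is not how plotly.js is implemented.

```javascript
// Hypothetical sketch: once any trace has a gap (null) at some x, every
// cumulative value at or above it in the stack is discarded at that x.
function stackWithGaps(ys) {
  let running = new Array(ys[0].length).fill(0);
  return ys.map(y => {
    // map() builds a fresh array each time, so earlier results are not mutated.
    running = running.map((base, i) =>
      base === null || y[i] === null ? null : base + y[i]);
    return running;
  });
}

const tops = stackWithGaps([
  [1, null, 3], // bottom trace, missing at the middle x
  [1, 1, 1],
  [2, 2, 2]
]);
console.log(tops); // [[1, null, 3], [2, null, 4], [4, null, 6]]
```

The middle column stays `null` all the way up, which is the "throw out the unstackable data" behavior described above, as opposed to the zero-filling default.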
I understand where you're going with this; it certainly makes sense from an SQL-like null-propagation point of view. One worry I have around both interpolate and gaps is that we support neither in our stacked-bar implementation as far as I know. One other salient point of comparison between stacked bars and stacked areas is the handling of negative values. Google Sheets/Excel basically handles this by overlapping/"folding" the area downwards, which is sort of how our Google sheets: Excel: |
Final note for the weekend: it would be nice to have the equivalent of |
I'd be OK with Other than that, what could the API look like? Is this a new trace type? Would this be a |
👍 though in attribute values we've tended to include spaces between words, so it could be
Absolutely - the goal here is just to make sure the API will support all the options we anticipate, but we can start with the default behavior.
Not great... you won't see them at all unless markers are displayed. Another option we could consider, that would be better for orphan points and perhaps alleviate my concern about the Highcharts behavior making it look like the neighbors are weird rather than the missing point itself: draw the fill halfway to the missing point before breaking it (probably following the same path that would be taken by
API
tldr this is what I'm proposing:
data: [
{
type: 'scatter',
x: [...], y: [...],
stackgroup: '1', // this (any non-empty value) is what enables stacking
orientation: 'h', // like horizontal stacked bars - along with stackgroup this sets default fill attr
groupnorm: 'percent',
stackgaps: 'interpolate'
},
{
x: [...], y: [...],
stackgroup: '1',
orientation: 'h',
// groupnorm here would be ignored unless omitted above
stackgaps: 'infer zero'
}
],
layout: {
// could specify groupnorm, stackgaps here instead if uniform
}
There would be advantages to making the whole stack into a single trace, with an array similar to
But I still think we're better off leaving this as a collection of
Re: a
Re: a
The latter is arguably more correct, as there's no ambiguity: there's exactly one place to specify one value. But it seems heavy and potentially confusing to users. Actually it becomes even more complicated with the planned extension for bars - if we have grouped stacks, you might want to normalize so each stack reaches 100%, or you might want to normalize so the sum of all stacks in each group is 100%. The former seems like the more natural (and more common) case, the latter could perhaps be another
One more thought: do we want to allow stacking horizontally, not just vertically? Perhaps an
More data issues
Two more related questions about
{
x: [2,0,1,2,4,3,2], // out of order and 2 shows up 3x
y: [1,1,2,3,2,3,2], // values for x=2 are 1, 3, 2
stackgroup: '1'
}, {
x: [0,1,2,3,4], // no duplication, all in order, but we draw extra values at x=2
y: [2,3,3,2,2], // because otherwise what y do we stack onto?
stackgroup: '1'
}
you'd get something like: |
I love the half-area rendering for Re duplicates and ordering, I would want to just stick with whatever we currently do for filling, which can lead to some crazy results, but at least we're not introducing a whole new way of filling... Re the With that done, we could look at tackling these limitations in a unified bar+scatter way. It seems to me like we want to introduce the notion of "sub stacks", especially when |
I don't think we can get away from doing something new here. If you have unordered data below and ordered above, you'll be stacking as though the values below were ordered, creating entirely new strange behavior. Anyway, when you're just filling but not stacking there are legitimate reasons to have unordered data, but it seems to me that when you're stacking you've made a strong statement that If you have duplicates below and unique values above, which one do you add onto? One way to arrive at the solution I gave above is to imagine the duplicate points start out at slightly different |
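For reference, a sketch of what "sort before stacking" could look like on the out-of-order trace from the example above. `sortTraceByX` is a hypothetical helper, not a plotly.js function, and it only reorders points; how duplicate x values are then stacked onto is left open, as in the discussion.

```javascript
// Hypothetical helper: reorder a trace's points by ascending x.
// Array.prototype.sort is stable (ES2019+), so points sharing an x value
// keep their original relative order.
function sortTraceByX(trace) {
  const order = trace.x
    .map((_, i) => i)
    .sort((i, j) => trace.x[i] - trace.x[j]);
  return {
    ...trace,
    x: order.map(i => trace.x[i]),
    y: order.map(i => trace.y[i])
  };
}

const sorted = sortTraceByX({
  x: [2, 0, 1, 2, 4, 3, 2],
  y: [1, 1, 2, 3, 2, 3, 2],
  stackgroup: '1'
});
console.log(sorted.x); // [0, 1, 2, 2, 2, 3, 4]
console.log(sorted.y); // [1, 2, 1, 3, 2, 3, 2]
```

The three points at x = 2 come out in their original order (y = 1, 3, 2), matching the comment in the example trace.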
BTW the gradient on part of the orange fill in @nicolaskruchten 's comment above seems to be a Chrome + Mac Retina screen rendering bug - fiddling around with similar multiply-self-crossing paths I can get all manner of related errors on my laptop's main (retina) screen, but they all look fine when I put the window on my second monitor (non-retina) or in FF or Safari on the retina screen. I'm going to ignore it and hope Chrome fixes it. |
Throwing in my cents in decreasing order of importance:
data = [{
type: 'bar',
// ...
bargroup: '0',
// these below would apply to all traces of this bargroup
bargap: 0.1,
bargroupgap: 0.05,
barnorm: 'percent'
}, {
type: 'bar',
// ...
bargroup: '0',
// would not coerce 'bargap', 'bargroupgap', ...
// if not first trace in bargroup,
// Plotly.validate would pick this up!
}]
which only adds one new attribute,
{
stackgroup: '1',
// only coerce if stackgroup is set
stack: {
orientation: 'v',
groupnorm: 0.1,
gaps: '...'
}
}
|
Yes, indeed my proposal was primarily motivated by symmetry. If we could implement I'm also fine with the sorting/not just reusing the existing fill behaviour. |
That's a pretty strong statement! But I think I can get behind it. Thinking through the details of individual use cases there are still a number of decisions to make, but I think we can work it out. That said...
My concern about this is its impact on reordering traces - which I suspect is fairly common in exactly these cases where traces within a group interact with each other. If I have a lot of stacked items there may be different ways I want to organize them, and if this resulted in moving the first trace out of its spot, I'd need to also move the group attributes to the new first trace. What if we just take the first value we find for these attributes, looking at every trace in the group, and apply that value to all of them in
I feel like symmetry with bar - when the function is the same which I think it is here - is worth a good deal, not just in terms of simplifying the editor as folks toggle between bar and area, but from a straight plotly.js user perspective as well, not having to learn more attribute names. Would it suffice to include in its description "applies only to stacked area traces"?
Again I feel like the function is the same so we should use a name that works for both bar and scatter. Right now we have
Plotly.newPlot(gd,[{
x: [1,1,1,2,2,3], type: 'histogram', histnorm: 'probability'
},{
// eg 2 results in 50/50 because 2 is one third of the samples in each trace
x: [1,1,1,1,2,2,2,2,3,3,3,3], type: 'histogram', histnorm: 'probability'
}],{
barmode: 'stack', barnorm: 'percent'
})
So if
TBH I can't really figure out a stacking algorithm that would make sense without sorting, except I guess for the very top trace, so I think sorting needs to be baked in. |
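The arithmetic behind the `histnorm: 'probability'` plus `barnorm: 'percent'` example above can be checked by hand. The sketch below does not call plotly.js. `probability` and `percentOfStack` are hypothetical helpers that assume normalization is applied per trace first and then per stack, as the "50/50" comment in the example implies.

```javascript
// Hand-rolled arithmetic for illustration only; no plotly.js calls.

// Fraction of a trace's samples that fall in each bin (like histnorm: 'probability').
function probability(samples, bins) {
  return bins.map(b => samples.filter(s => s === b).length / samples.length);
}

// Share of each stack taken by each trace, in percent (like barnorm: 'percent').
// heights[k][i] is trace k's bar height in bin i; result[i][k] is its share.
function percentOfStack(heights) {
  return heights[0].map((_, i) => {
    const total = heights.reduce((sum, h) => sum + h[i], 0);
    return heights.map(h => 100 * (h[i] / total));
  });
}

const bins = [1, 2, 3];
const first = probability([1, 1, 1, 2, 2, 3], bins);                    // 1/2, 1/3, 1/6
const second = probability([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3], bins); // 1/3 each
const shares = percentOfStack([first, second]);

console.log(shares[1]); // [50, 50]: the value 2 is one third of the samples in each trace
console.log(shares[0]); // roughly [60, 40]
```

So the two normalizations compose rather than conflict: each trace is normalized on its own, and the stack at each x is then rescaled to 100%.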
Yeah, I'm aware 💪
Very good points here! You're absolutely right, taking the first value we find (as opposed to the value of the first trace in the group) is what we want to do. Moreover, perhaps these "group" attributes could be coerced even when
data = [{
// ...
bargroup: '1',
bargap: 0.1
}, {
bargroup: '1'
}, {
bargroup: '1'
}]
and toggling
These are valid points. You're right that trying to reuse the same attribute names across trace types probably decreases the learning curve for users. As for answering when the attributes are valid, we should encourage users to look up the descriptions on https://plot.ly/javascript/reference/ and use
Perhaps to make the "applies only to stacked area traces" part more obvious in the attribute descriptions, we could add |
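A minimal sketch of the "first value we find" rule agreed on above, so that reordering traces doesn't require moving group-level attributes around. `resolveGroupAttr` is a hypothetical helper and its parameters are assumptions for illustration. It is not plotly.js's coercion code.

```javascript
// Hypothetical sketch: for each group, take the first explicitly set value of
// a group-level attribute, looking at every trace in the group in order, and
// apply it to all traces of that group.
function resolveGroupAttr(traces, groupKey, attr, dflt) {
  const firstFound = new Map();
  for (const trace of traces) {
    const group = trace[groupKey];
    if (group === undefined || group === '') continue; // not in any group
    if (!firstFound.has(group) && trace[attr] !== undefined) {
      firstFound.set(group, trace[attr]);
    }
  }
  return traces.map(trace => {
    const group = trace[groupKey];
    if (group === undefined || group === '') return trace;
    const value = firstFound.has(group) ? firstFound.get(group) : dflt;
    return { ...trace, [attr]: value };
  });
}

const full = resolveGroupAttr(
  [
    { bargroup: '1' },              // first trace does not set bargap
    { bargroup: '1', bargap: 0.1 }, // first value found
    { bargroup: '1', bargap: 0.3 }  // ignored: a value was already found
  ],
  'bargroup', 'bargap', 0.2
);
console.log(full.map(t => t.bargap)); // [0.1, 0.1, 0.1]
```

Moving the first trace elsewhere in the array leaves the result unchanged here, which is the reordering concern raised earlier.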
Closed by #2960 |
I'm filing this issue as a gathering point for the feature request of real stacked area charts.
The current solution to create stacked area charts is to plot cumulative variables, which has multiple drawbacks:
text: '...'.
A real solution would be to have an argument like layout = {linemode: 'stack'}, same as there is for bar charts.
I'm aware that @etpinard stated some time ago:
Therefore I hope opening this issue won't be seen as annoying harassment! Since it is an often requested feature, I think it is important to have a place for users like me to gather all relevant information and potential progress on this.
Some more information:
plotly.js: Stacked Area Functionality #344
plotly for R: Example code of stacked fill area plotly.R#686 and add _real_ stacked area charts [feature request] plotly.R#810
ggplot2 and then converting it to plotly using ggplotly doesn't work as intended either (the plot doesn't get rescaled correctly if traces in between are unselected; there will just be space left).
(JS): https://codepen.io/etpinard/pen/yOgdOb