d3.stack is designed to work with non-tidy data where each row corresponds to a “group” (the set of observations for all layers, e.g., year) with properties for each “layer” a.k.a. series (e.g., format) recording the observed value (e.g., revenue).
| Year | 8 - Track | Cassette | Cassette Single |
|------|-----------|----------|-----------------|
| 1973 | 2699600000 | 419600000 | 0 |
| 1974 | 2730600000 | 433600000 | 0 |
In the tidy format, in contrast, rows correspond to observations and columns correspond to variables. (This is less efficient as the layer names are repeated, but oh well.)
| Year | Format | Revenue |
|------|--------|---------|
| 1973 | 8 - Track | 2699600000 |
| 1973 | Cassette | 419600000 |
| 1973 | Cassette Single | 0 |
| 1974 | 8 - Track | 2730600000 |
| 1974 | Cassette | 433600000 |
| 1974 | Cassette Single | 0 |
It’s possible to use tidy data with d3.stack, but it’s a little convoluted.
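As a rough illustration of the reshaping involved, here is a minimal sketch in plain JavaScript that pivots the tidy rows back into one object per year. The names `tidy`, `pivot`, `wide`, and `layerKeys` are made up for this example, and the d3.stack call appears only in a comment so the snippet runs without d3 loaded.

```js
// Tidy rows: one observation per row (values from the table above).
const tidy = [
  {year: 1973, format: "8 - Track", revenue: 2699600000},
  {year: 1973, format: "Cassette", revenue: 419600000},
  {year: 1973, format: "Cassette Single", revenue: 0},
  {year: 1974, format: "8 - Track", revenue: 2730600000},
  {year: 1974, format: "Cassette", revenue: 433600000},
  {year: 1974, format: "Cassette Single", revenue: 0}
];

// Pivot into one object per group (year) with one property per layer (format).
function pivot(rows) {
  const groups = new Map();
  for (const {year, format, revenue} of rows) {
    if (!groups.has(year)) groups.set(year, {year});
    groups.get(year)[format] = revenue;
  }
  return Array.from(groups.values());
}

const wide = pivot(tidy);

// Distinct layer names, in order of first appearance.
const layerKeys = Array.from(new Set(tidy.map(d => d.format)));

// With d3 loaded, the pivoted rows fit the existing API:
//   const series = d3.stack().keys(layerKeys)(wide);
```

The pivot and the separate pass to collect layer names are the convoluted part: both exist only to undo the tidy layout before stacking.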
It’d be nice if it were more convenient to give d3.stack tidy data, say by specifying a key accessor and a value accessor.
Here the key accessor would return a two-part key: the layer key and the group key. And the value accessor wouldn’t need to know the current keys. (Because the data is tidy, the value accessor is the same for all observations.)
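To make the idea concrete, here is a minimal sketch of how such an interface might behave. `stackTidy` is a hypothetical standalone function written for this illustration, not an existing d3 API, and its signature and output shape are assumptions. It also assumes dense data whose rows list the layers in the same order for every group.

```js
// Hypothetical helper, not part of d3: stacks tidy rows using a two-part key.
// key(d) returns [layerKey, groupKey]; value(d) returns the number to stack.
function stackTidy(data, key, value) {
  const layers = new Map();   // layerKey -> array of stacked points
  const baseline = new Map(); // groupKey -> running total so far
  for (const d of data) {
    const [layerKey, groupKey] = key(d);
    const y0 = baseline.get(groupKey) || 0;
    const y1 = y0 + value(d);
    baseline.set(groupKey, y1);
    if (!layers.has(layerKey)) layers.set(layerKey, []);
    layers.get(layerKey).push({group: groupKey, y0, y1, data: d});
  }
  return layers;
}

const rows = [
  {year: 1973, format: "8 - Track", revenue: 2699600000},
  {year: 1973, format: "Cassette", revenue: 419600000},
  {year: 1974, format: "8 - Track", revenue: 2730600000},
  {year: 1974, format: "Cassette", revenue: 433600000}
];

// The same two accessors serve every observation.
const stacked = stackTidy(rows, d => [d.format, d.year], d => d.revenue);
// stacked.get("Cassette")[0] has y0 = 2699600000 and y1 = 3119200000
```

Because each point's baseline is simply the running total for its group, this sketch depends on row order, which is one reason a real implementation would need explicit ordering.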
An implication of the proposed design is that the data can be sparse: some layers may be missing observations for some groups (and equivalently vice versa). That’s not possible with the current design because the layer keys (stack.keys) and group keys (data) are specified as separate arrays, but it should be easy enough for d3.stack to compute the union of layer keys and the union of group keys to fill in the missing data. d3.stack probably will also need some facility for ordering the group keys, as the order may not be consistent across layers.
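The union-and-fill step could look roughly like the sketch below. `densify` and its parameters are hypothetical names invented for this example, the sample drops the 1974 Cassette row on purpose to make the data sparse, and filling gaps with zero revenue is an assumption rather than something d3.stack does today.

```js
// Hypothetical helper: fills gaps in sparse tidy data before stacking.
// layerOf and groupOf extract the two key parts; makeRow builds a filler row;
// compareGroups orders the group keys.
function densify(data, layerOf, groupOf, makeRow, compareGroups) {
  // Union of layer keys and union of group keys across all observations.
  const layerKeys = Array.from(new Set(data.map(layerOf)));
  const groupKeys = Array.from(new Set(data.map(groupOf))).sort(compareGroups);

  // Index existing rows by their two-part key.
  const seen = new Map();
  for (const d of data) seen.set(JSON.stringify([layerOf(d), groupOf(d)]), d);

  // Emit every (group, layer) combination, filling in the missing ones.
  const dense = [];
  for (const g of groupKeys) {
    for (const l of layerKeys) {
      const k = JSON.stringify([l, g]);
      dense.push(seen.has(k) ? seen.get(k) : makeRow(l, g));
    }
  }
  return dense;
}

// Sparse and out of order: no 1974 Cassette row, and 1974 comes first.
const sparse = [
  {year: 1974, format: "8 - Track", revenue: 2730600000},
  {year: 1973, format: "8 - Track", revenue: 2699600000},
  {year: 1973, format: "Cassette", revenue: 419600000}
];

const dense = densify(
  sparse,
  d => d.format,
  d => d.year,
  (format, year) => ({year, format, revenue: 0}), // assumed filler value
  (a, b) => a - b                                 // numeric group order
);
```

The explicit comparator covers the ordering concern: without it, groups would appear in whatever order the rows happen to arrive.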
I imagine it’ll be difficult to make this backwards-compatible, but maybe it’s possible, or maybe it could be under a new name such as d3.stackTidy.