Stat for aligning lines before stacking #4889

Merged on Jul 7, 2022 (14 commits)

Conversation

thomasp85
Member

Fix #4850

This is a first stab at implementing what has been discussed in #4850.

library(ggplot2)

df <- tibble::tribble(
    ~g, ~x, ~y,
    "a", 1, 2,
    "a", 3, 5,
    "a", 5, 1,
    "b", 2, 3,
    "b", 4, 6,
    "b", 6, 7
)
ggplot(df, aes(x, y, fill = g)) + 
  geom_area()
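
As a rough illustration of what aligning before stacking can mean, the base R sketch below interpolates both groups from the example onto the union of their x values and then sums them. It is an assumption about the general idea, not the code in this PR; `align_group()` is a hypothetical helper, and treating a group as 0 outside its own x range is only one of the options discussed in #4850.

```r
# Example data from above, split by group
a <- data.frame(x = c(1, 3, 5), y = c(2, 5, 1))
b <- data.frame(x = c(2, 4, 6), y = c(3, 6, 7))

# Hypothetical helper: linearly interpolate one group onto shared x values.
# Outside the group's own x range the value is set to 0 (an assumption).
align_group <- function(d, xout) {
  y <- stats::approx(d$x, d$y, xout = xout, yleft = 0, yright = 0)$y
  data.frame(x = xout, y = y)
}

shared_x <- sort(union(a$x, b$x))
a_aligned <- align_group(a, shared_x)
b_aligned <- align_group(b, shared_x)

# With both groups on the same x values, stacking is a plain sum
stacked <- data.frame(x = shared_x, total = a_aligned$y + b_aligned$y)
stacked
```

Without this step the two groups have no x values in common, so there is nothing to add together point by point.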

Questions remaining to be answered:

  1. How should it behave when a group climbs on top of another? The code suggestion in Helper stat for geom_area() #4850 lets the data rise from 0 outside of its range, but I feel that is wrong. However, the solution here is also not quite right...

  2. What is the best name for the stat? I'm not married to "align", but I also think it is "good enough"... If anyone has a better name, I'm open to it.

@hadley @clauswilke @yutannihilation

@thomasp85
Member Author

OK, I think I have the best solution for (1). We add a very small amount (currently 0.1% of the range) to either side of each group and set the value there to zero. For all intents and purposes, this makes the data rise suddenly when it climbs on top of another group.

df <- tibble::tribble(
    ~g, ~x, ~y,
    "a", 1, 2,
    "a", 3, 5,
    "a", 5, 1,
    "b", 2, 3,
    "b", 4, 6,
    "b", 6, 7
)
ggplot(df, aes(x, y, fill = g)) + geom_area()
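
The padding described above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the implementation in this PR: `pad_with_zeros()` is a hypothetical helper, "the range" is taken to mean the full x range across all groups, and the input is assumed to be sorted by x.

```r
# Hypothetical helper: add a zero-height point just outside each end of a
# group's series. `frac = 0.001` corresponds to the 0.1% mentioned above;
# `full_range` is assumed to be the x range over all groups.
pad_with_zeros <- function(x, y, full_range, frac = 0.001) {
  adjust <- diff(full_range) * frac
  data.frame(
    x = c(min(x) - adjust, x, max(x) + adjust),
    y = c(0, y, 0)
  )
}

# Group "b" from the example; the full x range of the data is 1 to 6
b_padded <- pad_with_zeros(x = c(2, 4, 6), y = c(3, 6, 7), full_range = c(1, 6))
b_padded
```

Because the added points sit so close to the real end points, the group appears to rise almost vertically from zero instead of ramping up from a neighbouring x value.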

@thomasp85
Member Author

hmm... however, this interacts badly with outline.type = "upper", since it makes the outline start from 0 rather than from where the data starts...

@yutannihilation
Member

Another question. What should happen when there's a cliff in the middle of the data?

devtools::load_all("~/GitHub/ggplot2/")
#> ℹ Loading ggplot2

df <- tibble::tribble(
    ~g, ~x, ~y,
    "a", 1, 2,
    "a", 3, 5,
    "a", 5, 1,
    "b", 2, 3,
    "b", 4, 3,
    "b", 4, 6,
    "b", 6, 7
)


ggplot(df, aes(x, y, fill = g)) + 
  geom_area() +
  geom_point(data = \(x) dplyr::filter(x, g == "b")) +
  geom_line(data = \(x) dplyr::filter(x, g == "b"))
#> Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
#> collapsing to unique 'x' values

Created on 2022-06-24 by the reprex package (v2.0.1)
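
The warning in the reprex is what `stats::approx()` emits when it is given duplicated x values without an explicit `ties` argument. The sketch below reproduces it for group "b" in isolation; that the stat interpolates with `approx()` is an assumption based on the `regularize.values()` call named in the warning.

```r
# Group "b" from the reprex: two different y values at x = 4 (the cliff)
x <- c(2, 4, 4, 6)
y <- c(3, 3, 6, 7)

# Capture the warning message instead of printing it
msg <- tryCatch(
  {
    stats::approx(x, y, xout = 4)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
msg

# With the default `ties = mean`, the duplicated x values are collapsed,
# so the cliff is replaced by the average of its two y values
at_cliff <- suppressWarnings(stats::approx(x, y, xout = 4)$y)
at_cliff
```

The collapse to a single averaged value is why a vertical step in the data cannot survive plain interpolation and needs separate handling.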

@thomasp85
Member Author

good catch - we could potentially nudge it in the same way as we do at the ends

@thomasp85
Member Author

just need to make sure we nudge it in the right direction so the slope matches the evolution of the data

@hadley
Member

hadley commented Jun 24, 2022

@thomasp85 could you just add the fudge factor to either side of every point?

@thomasp85
Member Author

@hadley I'm not sure I understand? As opposed to what I'm currently doing? Or to fix @yutannihilation's issue?

@yutannihilation
Member

I feel the result below isn't right, but is this what you meant in this comment...?

just need to make sure we nudge it in the right direction so the slope matches the evolution of the data

devtools::load_all("~/GitHub/ggplot2/")
#> ℹ Loading ggplot2

df <- tibble::tribble(
    ~g, ~x, ~y,
    "a", 1, 2,
    "a", 3, 5,
    "a", 5, 1,
    "b", 2, 3,
    "b", 4, 5,
    "b", 4, 3,
    "b", 6, 7
)

pal <- scales::hue_pal()(2)
names(pal) <- c("a", "b")

l <- list(
  coord_cartesian(xlim = c(1, 6), ylim = c(0, 9)),
  scale_fill_manual(values = pal)
)

p <- ggplot(df, aes(x, y, fill = g)) + 
  geom_area() +
  l

p_single <- lapply(c("a", "b"), function(g) {
  df_single <- dplyr::filter(df, g == {{ g }})
  
  ggplot(df_single, aes(x, y, fill = g)) + 
    geom_area() +
    geom_point() +
    geom_line() +
    l
})

patchwork::wrap_plots(a = p_single[[1]], b = p_single[[2]], both = p, ncol = 1)

Created on 2022-06-30 by the reprex package (v2.0.1)

@thomasp85
Member Author

@yutannihilation yeah, I was probably trying to be too clever, sorting the points in the cliff so it matched the surrounding slope - but I agree it probably should be kept in the same order as the data

@thomasp85
Member Author

fixed in the latest commit

@yutannihilation
Member

Thanks, looks good!

Let me confirm one last corner case: do we support areas that cross zero? It seems to work if we interpolate the data points at y = 0, but I'm not sure it always works. Besides, in general, I have no idea what the correct behavior is here.

devtools::load_all("~/GitHub/ggplot2/")
#> ℹ Loading ggplot2

df <- tibble::tribble(
    ~g, ~x, ~y,
    "a", 1, 1,
    "a", 2, 4,
    #"a", 2.5, 0,
    "a", 3, -4,
    "a", 8, 0,
    "b", 2, 4,
    #"b", 4, 0,
    "b", 6, -4
)

pal <- scales::hue_pal()(2)
names(pal) <- c("a", "b")

l <- list(
  coord_cartesian(xlim = c(1, 8), ylim = c(-10, 10)),
  scale_fill_manual(values = pal)
)

p <- ggplot(df, aes(x, y, fill = g)) + 
  geom_area(alpha = 0.5) +
  l

p_single <- lapply(c("a", "b"), function(g) {
  df_single <- dplyr::filter(df, g == {{ g }})
  
  ggplot(df_single, aes(x, y, fill = g)) + 
    geom_area() +
    geom_point() +
    geom_line() +
    l
})

patchwork::wrap_plots(p_single[[1]], p_single[[2]], p, ncol = 1)

Created on 2022-07-01 by the reprex package (v2.0.1)
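
The commented-out rows in the data above are the points where each group's line crosses y = 0. A minimal sketch of how such points could be found by linear interpolation is shown below; `zero_crossings()` is a hypothetical helper for illustration and not part of ggplot2.

```r
# Hypothetical helper: x positions where the piecewise-linear line through
# (x, y) changes sign. Segments that merely touch zero at an existing data
# point are skipped, since that point is already in the data.
zero_crossings <- function(x, y) {
  n <- length(x)
  i <- which(y[-n] * y[-1] < 0)  # segments whose end points have opposite signs
  x[i] - y[i] * (x[i + 1] - x[i]) / (y[i + 1] - y[i])
}

cross_a <- zero_crossings(c(1, 2, 3, 8), c(1, 4, -4, 0))  # group "a": 2.5
cross_b <- zero_crossings(c(2, 6), c(4, -4))              # group "b": 4
```

Adding these x values as extra rows with y = 0 gives each group an explicit point on the axis, which is what uncommenting the two rows in the reprex does by hand.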

@thomasp85
Member Author

@yutannihilation thanks for coming up with all these edge cases. The last one should work now, though it is surely not a beautiful result :-)

df <- tibble::tribble(
    ~g, ~x, ~y,
    "a", 1, 1,
    "a", 2, 4,
    #"a", 2.5, 0,
    "a", 3, -4,
    "a", 8, 0,
    "b", 2, 4,
    #"b", 4, 0,
    "b", 6, -4
)

pal <- scales::hue_pal()(2)
names(pal) <- c("a", "b")

l <- list(
    coord_cartesian(xlim = c(1, 8), ylim = c(-10, 10)),
    scale_fill_manual(values = pal)
)

p <- ggplot(df, aes(x, y, fill = g)) + 
    geom_area(alpha = 0.5) +
    l

p_single <- lapply(c("a", "b"), function(g) {
    df_single <- dplyr::filter(df, g == {{ g }})
    
    ggplot(df_single, aes(x, y, fill = g)) + 
        geom_area() +
        geom_point() +
        geom_line() +
        l
})

patchwork::wrap_plots(p_single[[1]], p_single[[2]], p, ncol = 1)

Member

@yutannihilation yutannihilation left a comment

Looks perfect! As these cases are very tricky, I think it's a good idea to add one or two snapshot tests for them. Other than that, I have no comments.

@thomasp85
Member Author

yeah, I agree - your examples provide a nice foundation for some visual tests

@thomasp85 thomasp85 merged commit 7484bd7 into main Jul 7, 2022
This was referenced Jul 7, 2022
@thomasp85 thomasp85 deleted the issue-4850-area-helper-stat branch July 26, 2022 19:37
Successfully merging this pull request may close these issues.

Helper stat for geom_area()
3 participants