Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annotation breaks with facets #3305

Closed
clauswilke opened this issue May 6, 2019 · 12 comments
Closed

Annotation breaks with facets #3305

clauswilke opened this issue May 6, 2019 · 12 comments
Milestone

Comments

@clauswilke
Copy link
Member

clauswilke commented May 6, 2019

As discussed here: #3303 (comment)

Annotations break when there are multiple facets and a non-position aesthetic/parameter is provided as a vector:

library(ggplot2)
data <- data.frame(x = c(1, 2), y = c(1, 2), facet = c("a", "b"))
p <- ggplot(data, aes(x, y)) + geom_point() + facet_grid(vars(facet))
p + annotate("text", x = 1.5, y = 1.5, label = c("label1", "label2"))
#> Error: Aesthetics must be either length 1 or the same as the data (4): label

Created on 2019-05-06 by the reprex package (v0.2.1)

I note that this problem does not occur when a position aesthetic is provided as a vector:

library(ggplot2)
data <- data.frame(x = c(1, 2), y = c(1, 2), facet = c("a", "b"))
p <- ggplot(data, aes(x, y)) + geom_point() + facet_grid(vars(facet))
p + annotate("text", x = c(1, 2), y = 1.5, label = "label")

Created on 2019-05-06 by the reprex package (v0.2.1)

As an additional hint to what may be happening, I note that the data frame that is internally being generated is twice as long as the input vector, as we can tell from the error message:

...same as the data (4)...

If we try to provide a longer label vector, we get an even longer data frame:

library(ggplot2)
data <- data.frame(x = c(1, 2), y = c(1, 2), facet = c("a", "b"))
p <- ggplot(data, aes(x, y)) + geom_point() + facet_grid(vars(facet))
p + annotate("text", x = 1.5, y = 1.5, label = c("label1", "label2", "label3", "label4"))
#> Error: Aesthetics must be either length 1 or the same as the data (8): label

Created on 2019-05-06 by the reprex package (v0.2.1)

I don't immediately know what the correct behavior here should be. Since we're not in any way specifying facets in the annotation, how should ggplot2 know that one label should be placed onto each facet, rather than both labels on each facet?

@clauswilke clauswilke added this to the ggplot2 3.2.0 milestone May 6, 2019
@clauswilke
Copy link
Member Author

Some additional thoughts: annotate() only transfers position values into the data frame it creates:

data <- new_data_frame(position, n = max(lengths))

Maybe it should poll the geom about the aesthetics it understands and move all those into the newly generated data frame. For reference, this is how layer() figures out the aesthetics of a geom:

aes_params <- params[intersect(names(params), geom$aesthetics())]

However, if we did this, the result would be that both labels would be placed onto each facet. Maybe that's why only position values are transferred? Because plotting multiple labels on top of the same position would be pointless? If that's the case, the correct fix may be a better error message.

@clauswilke
Copy link
Member Author

What I wrote in the previous post is wrong. If we treat, for example, color values as aesthetics, then they will be converted by the currently active color scale, and this will give unexpected results. I think the current code is reasonable, and where it falls short people just have to generate their own appropriate data frame and use the relevant geoms directly. In any case, there is no principled argument I can see why the result obtained under 3.1.1 (#3303 (comment)) is correct.

@paleolimbot
Copy link
Member

Yes, the 3.1.1 result is very odd...interesting that it was used twice! I will look into this tomorrow morning, but I think we need to create the data together so they have the same number of values in each vector, then split them up to position/non-position.

@clauswilke
Copy link
Member Author

My point was that I don't think we need to modify our code at all. We can just inform the maintainers of the two relevant packages that they need to modify their code, because they're relying on undefined behavior.

@paleolimbot
Copy link
Member

You're right...also ... could contain params rather than aesthetics. Failing with

#> Error: Aesthetics must be either length 1 or the same as the data (4):

seems confusing though, since the user doesn't pass anything that is length 4 to cause the error.

@clauswilke
Copy link
Member Author

@yutannihilation Do you have an idea where the data frame gets duplicated (so we end up with 4 rows even though we provide 2 values)? I think you recently worked on faceting, so I'm wondering whether have an idea where this is happening and if anything can be done to fix the behavior or improve the error message.

@paleolimbot
Copy link
Member

I had some time this morning to look into this, I hope this is ok. The long and short is that there isn't anything incorrect happening, but any layer with a set aesthetic with length > 1 just doesn't work with facets. It didn't always error before, but it wasn't correct either.

library(ggplot2)
data <- data.frame(x = c(1, 2), y = c(1, 2), facet = c("a", "b"))
p <- ggplot(data, aes(x, y)) + geom_point() + facet_grid(vars(facet))
p + annotate("text", x = 1.5, y = 1.5, label = c("label1", "label2"))
#> Error: Aesthetics must be either length 1 or the same as the data (4): label

The data are supposed to get repeated because there's two facets no facet variable:

ggplot2/R/facet-wrap.r

Lines 186 to 197 in 92d2777

if (length(missing_facets) > 0) {
to_add <- unique(layout[missing_facets])
data_rep <- rep.int(1:nrow(data), nrow(to_add))
facet_rep <- rep(1:nrow(to_add), each = nrow(data))
data <- plyr::unrowname(data[data_rep, , drop = FALSE])
facet_vals <- plyr::unrowname(cbind(
facet_vals[data_rep, , drop = FALSE],
to_add[facet_rep, , drop = FALSE]))
}

ggplot2/R/facet-grid-.r

Lines 266 to 276 in 92d2777

if (length(missing_facets) > 0) {
to_add <- unique(layout[missing_facets])
data_rep <- rep.int(1:nrow(data), nrow(to_add))
facet_rep <- rep(1:nrow(to_add), each = nrow(data))
data <- plyr::unrowname(data[data_rep, , drop = FALSE])
facet_vals <- plyr::unrowname(cbind(
facet_vals[data_rep, , drop = FALSE],
to_add[facet_rep, , drop = FALSE]))
}

The error is being caused because the "set" aesthetics are a different length (2) than the data (which is now 4).

check_aesthetics(params[aes_params], nrow(data))

This changed because of the new_data_frame() call

data <- new_data_frame(position, n = max(lengths))

which (correctly) creates a nrow 2 data frame. Previously this was a 1-row data frame that was getting repeated by the Facet$map_data(), and at the very end the non-position aesthetics were getting tacked on. If the non-position aesthetics were the right length, there were no errors.

This means any Layer that has a "set" aesthetic with length > 1 won't work with facets. A more valid example where this fails might be this, which draws a label in two places (and one would expect it to do so on each facet). This code errors in 3.1.1 as well.

library(ggplot2)
data <- data.frame(x = c(1, 1.5, 2), y = c(1, 1.5, 2), facet = c("a", "b", "c"))
p <- ggplot(data, aes(x, y)) + geom_point() + facet_wrap(vars(facet))
p + annotate(
  "text", 
  x = c(2, 1), 
  y = c(2, 1),
  label = c("label1", "label2")
)
#> Error: Aesthetics must be either length 1 or the same as the data (6): label

The problem isn't unique to annotations:

library(ggplot2)
data <- data.frame(x = c(1, 1.5, 2), y = c(1, 1.5, 2), facet = c("a", "b", "c"))
p <- ggplot(data, aes(x, y)) + geom_point() + facet_wrap(vars(facet))
p + geom_text(
  aes(x = x, y = y),
  data = data.frame(
    x = c(2, 1), 
    y = c(2, 1)
  ),
  label = c("label1", "label2")
)
#> Error: Aesthetics must be either length 1 or the same as the data (6): label

Fixing this wouldn't be easy for either annotations or layers in general.

@clauswilke
Copy link
Member Author

I still feel the second example is misusing the API. If we want different labels then the correct way to do this is to integrate them into the data frame and then the problem disappears. I suggest leaving our code as is and contacting the authors of the relevant packages to fix their code.

@clauswilke
Copy link
Member Author

Thanks!

@Benambridge
Copy link

Just chipping in to say that this broke my code which generated 270 plots! Needless to say, I will be rolling back to 2.2.1!

@lock
Copy link

lock bot commented Jan 1, 2020

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jan 1, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants