summarize(across(where)) fails when using a predicate and acts unexpectedly with predicate negation #271

Closed
roni-fultheim opened this issue Jul 13, 2021 · 2 comments · Fixed by #369

Comments

@roni-fultheim

I have been trying to summarize across numeric columns using dtplyr with summarise(across(where(is.numeric))) and got the error "can't rename variables in this context". It took me quite a while to figure out that the cause is that a predicate cannot be used within summarise() in dtplyr.
Is there any substitute for is.numeric other than naming all the columns?
Thank you

The following code 1) shows the error when using a predicate and 2) shows that the negation does not fail but applies the given function to all variables (numeric and non-numeric):

library(ggplot2)
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# fails
diamonds %>% lazy_dt() %>% group_by(color) %>% summarise(across(where(is.numeric), first))

# applies function to all variables
diamonds %>% lazy_dt() %>% group_by(color) %>% summarise(across(!where(is.numeric), first))

# unlike the equivalent statement using a data frame which only returns the relevant variables
diamonds %>% group_by(color) %>% summarise(across(!where(is.numeric), first))
@roni-fultheim roni-fultheim changed the title summarize(across()) fails when using a predicate and acts unexpectedly with predicate negation summarize(across(where)) fails when using a predicate and acts unexpectedly with predicate negation Jul 13, 2021
@mgirlich
Collaborator

Unfortunately, these predicates do not work because the column type is (in general) not known. So there is no way around explicitly naming the columns to transform. Other tidyselect helpers may help you there, for example:

diamonds %>% 
  group_by(color) %>% 
  summarise(across(c(carat, depth:z), mean))

iris %>% 
  group_by(Species) %>% 
  summarise(across(matches("^Sepal|Petal"), mean))

The error message is indeed not very helpful. Once r-lib/tidyselect#226 is implemented, a more helpful error message can be added.
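
One way to avoid typing the names by hand, sketched here under the assumption that the source data frame is still available in memory: work out the numeric column names on the plain data frame, where the types are known, and pass them to across() with all_of(). The variable names num_cols and result are only illustrative, and this has not been checked against every dtplyr version.

```r
library(ggplot2)  # only for the diamonds data set
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# Determine the numeric columns on the in-memory data frame,
# before it is wrapped in lazy_dt().
num_cols <- names(diamonds)[vapply(diamonds, is.numeric, logical(1))]

# Select those columns by name instead of by predicate.
result <- diamonds %>%
  lazy_dt() %>%
  group_by(color) %>%
  summarise(across(all_of(num_cols), mean)) %>%
  as_tibble()
```

This only helps when the columns of interest already exist in the source data. Columns created in earlier lazy steps would still have to be named explicitly.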

@cb12991

cb12991 commented Sep 16, 2021

I've run into the same issue as well. It took me a while to discover why across(where(is.character)) was not working with a lazy_dt. It's not the end of the world, as you can use the other tidyselect helpers mentioned above; I'm just wondering whether this is something that will be fixed in the future?
