rowwise mutate across with c_across computes incorrectly for some but not all the columns (worked in dplyr 1.0.3 but not in 1.0.4) #5770
Expected result for tbl2:
I just tried dplyr 1.0.4.9000 and it still appears to be broken. And I tried dplyr 1.0.3 and it worked fine there too.
I suspect the problem is this: for
🤔 this looks like a bug in the instrumentation of
library(palmerpenguins); library(tidyverse)
(tbl1 <- penguins %>%
group_by(species, island, sex) %>%
summarize(n = n()) %>%
pivot_wider(names_from = sex, values_from = n))
#> `summarise()` has grouped output by 'species', 'island'. You can override using the `.groups` argument.
#> # A tibble: 5 x 5
#> # Groups: species, island [5]
#> species island female male `NA`
#> <fct> <fct> <int> <int> <int>
#> 1 Adelie Biscoe 22 22 NA
#> 2 Adelie Dream 27 28 1
#> 3 Adelie Torgersen 24 23 5
#> 4 Chinstrap Dream 34 34 NA
#> 5 Gentoo Biscoe 58 61 5
tbl1 %>%
ungroup() %>%
rowwise(species, island) %>%
mutate(across(everything(), ~ . / sum(c_across(everything()), na.rm = T)))
#> # A tibble: 5 x 5
#> # Rowwise: species, island
#> species island female male `NA`
#> <fct> <fct> <dbl> <dbl> <dbl>
#> 1 Adelie Biscoe 0.5 0.978 NA
#> 2 Adelie Dream 0.482 0.950 0.411
#> 3 Adelie Torgersen 0.462 0.808 0.797
#> 4 Chinstrap Dream 0.5 0.986 NA
#> 5 Gentoo Biscoe 0.468 0.918 0.783
tbl1 %>%
ungroup() %>%
rowwise(species, island) %>%
mutate(+across(everything(), ~ . / sum(c_across(everything()), na.rm = T)))
#> # A tibble: 5 x 5
#> # Rowwise: species, island
#> species island female male `NA`
#> <fct> <fct> <dbl> <dbl> <dbl>
#> 1 Adelie Biscoe 0.5 0.5 NA
#> 2 Adelie Dream 0.482 0.5 0.0179
#> 3 Adelie Torgersen 0.462 0.442 0.0962
#> 4 Chinstrap Dream 0.5 0.5 NA
#> 5 Gentoo Biscoe 0.468 0.492 0.0403
Created on 2021-02-18 by the reprex package (v0.3.0)
I'll have a look, another approach would be to instead take the result from
library(palmerpenguins); library(tidyverse)
(tbl1 <- penguins %>%
group_by(species, island, sex) %>%
summarize(n = n()) %>%
pivot_wider(names_from = sex, values_from = n))
#> `summarise()` has grouped output by 'species', 'island'. You can override using the `.groups` argument.
#> # A tibble: 5 x 5
#> # Groups: species, island [5]
#> species island female male `NA`
#> <fct> <fct> <int> <int> <int>
#> 1 Adelie Biscoe 22 22 NA
#> 2 Adelie Dream 27 28 1
#> 3 Adelie Torgersen 24 23 5
#> 4 Chinstrap Dream 34 34 NA
#> 5 Gentoo Biscoe 58 61 5
scale <- function(x) {
x / sum(x, na.rm = TRUE)
}
tbl1 %>%
ungroup() %>%
rowwise(species, island) %>%
mutate(
scale(across(everything()))
)
#> # A tibble: 5 x 5
#> # Rowwise: species, island
#> species island female male `NA`
#> <fct> <fct> <dbl> <dbl> <dbl>
#> 1 Adelie Biscoe 0.5 0.5 NA
#> 2 Adelie Dream 0.482 0.5 0.0179
#> 3 Adelie Torgersen 0.462 0.442 0.0962
#> 4 Chinstrap Dream 0.5 0.5 NA
#> 5 Gentoo Biscoe 0.468 0.492 0.0403
Created on 2021-02-18 by the reprex package (v0.3.0)
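For anyone who needs row proportions without relying on how `across()` and `c_across()` interact, here is a minimal sketch of one possible workaround. It is not taken from the thread: the counts are copied by hand from `tbl1`, and `count_cols` and `row_total` are hypothetical helper names. The idea is to store each row's total in its own column first, so the denominator is never one of the columns being overwritten.

```r
library(dplyr, warn.conflicts = FALSE)

# Counts copied by hand from tbl1 in the reprex above.
counts <- tibble(
  species = c("Adelie", "Adelie", "Adelie", "Chinstrap", "Gentoo"),
  island  = c("Biscoe", "Dream", "Torgersen", "Dream", "Biscoe"),
  female  = c(22L, 27L, 24L, 34L, 58L),
  male    = c(22L, 28L, 23L, 34L, 61L),
  `NA`    = c(NA, 1L, 5L, NA, 5L)
)

# Hypothetical helper: names of the columns that hold counts.
count_cols <- c("female", "male", "NA")

props <- counts %>%
  # Compute the denominator once, before any count column is modified.
  mutate(row_total = rowSums(across(all_of(count_cols)), na.rm = TRUE)) %>%
  # Divide every count column by the stored total.
  mutate(across(all_of(count_cols), ~ .x / row_total)) %>%
  # Drop the helper column again.
  select(-row_total)

props
```

Because `row_total` is not among the columns that `across()` rewrites, the result should not depend on the order in which the count columns are updated.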
Actually, there seems to be a confusion between
library(dplyr, warn.conflicts = FALSE)
df <- tibble(x = 2, y = 4, z = 8)
df %>% mutate_all(~ .x / y)
#> # A tibble: 1 x 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 0.5 1 8
df %>% mutate(across(everything(), ~ .x / y))
#> # A tibble: 1 x 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 0.5 1 8
# these do:
df %>% mutate(x = x / y, y = y / y, z = z / y)
#> # A tibble: 1 x 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 0.5 1 8
df %>% mutate(+across(everything(), ~ .x / y))
#> # A tibble: 1 x 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 0.5 1 2
# but this does:
df %>% mutate(data.frame(x = x / y, y = y / y, z = z / y))
#> # A tibble: 1 x 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 0.5 1 2
Created on 2021-02-18 by the reprex package (v0.3.0)
I believe that the last results are correct, and that the instrumented
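The two sets of results above (`z = 8` versus `z = 2`) can be reproduced in base R, which may make the difference easier to see. This is only an illustrative sketch, not how dplyr is implemented: `seq_res` and `sim_res` are hypothetical names, and the loop merely mimics "each column is updated in turn, and later columns see the already-updated `y`".

```r
df <- data.frame(x = 2, y = 4, z = 8)

# Sequential evaluation: each column is overwritten in turn,
# so by the time z is processed, y has already become 1.
seq_res <- df
for (nm in names(seq_res)) {
  seq_res[[nm]] <- seq_res[[nm]] / seq_res[["y"]]
}

# Simultaneous evaluation: every column is divided by the ORIGINAL y.
sim_res <- as.data.frame(lapply(df, function(col) col / df[["y"]]))

seq_res  # x = 0.5, y = 1, z = 8
sim_res  # x = 0.5, y = 1, z = 2
```

The sequential version matches the `z = 8` outputs in the reprex, and the simultaneous version matches the `z = 2` outputs.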
Thanks for addressing and fixing this!
I was trying to compute proportions using rowwise, mutate, across, and c_across, but I'm only getting expected results in the female column and not in the male or NA columns.
Created on 2021-02-18 by the reprex package (v1.0.0)