I've noticed that the summarise function used within a dtplyr pipeline ignores the .groups argument and instead creates a new ".groups" column. When the .groups argument is left out, the resulting tibble doesn't retain any grouping, whereas plain dplyr keeps the result grouped by the first grouping column.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(dtplyr)
#> Warning: package 'dtplyr' was built under R version 4.0.5

# Data
data <- tibble(group1 = rep(1:2, each = 3), group2 = rep(3:4, each = 3), value = 1:6)

# With regular dplyr
data %>%
  group_by(group1, group2) %>%
  summarise(value_mean = mean(value), .groups = "drop")
#> # A tibble: 2 x 3
#>   group1 group2 value_mean
#>    <int>  <int>      <dbl>
#> 1      1      3          2
#> 2      2      4          5

# With dtplyr, .groups argument is ignored and turned into a new column
data %>%
  lazy_dt() %>%
  group_by(group1, group2) %>%
  summarise(value_mean = mean(value), .groups = "drop") %>%
  as_tibble()
#> # A tibble: 2 x 4
#>   group1 group2 value_mean .groups
#>    <int>  <int>      <dbl> <chr>
#> 1      1      3          2 drop
#> 2      2      4          5 drop

# With dtplyr without the .groups argument, grouping is still removed
data %>%
  lazy_dt() %>%
  group_by(group1, group2) %>%
  summarise(value_mean = mean(value)) %>%
  as_tibble()
#> # A tibble: 2 x 3
#>   group1 group2 value_mean
#>    <int>  <int>      <dbl>
#> 1      1      3          2
#> 2      2      4          5

# Whereas the same pipeline with just dplyr would preserve grouping
data %>%
  group_by(group1, group2) %>%
  summarise(value_mean = mean(value))
#> `summarise()` has grouped output by 'group1'. You can override using the `.groups` argument.
#> # A tibble: 2 x 3
#> # Groups:   group1 [2]
#>   group1 group2 value_mean
#>    <int>  <int>      <dbl>
#> 1      1      3          2
#> 2      2      4          5
I added support for the .groups argument in this PR.
Note that as_tibble() drops the grouping. This was a breaking change in tibble 2.0, see News. To keep the grouping you have to use collect().
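To make the difference between as_tibble() and collect() concrete, here is a minimal sketch that reuses the data from the reprex. It groups a lazy_dt() by a single column and skips summarise() entirely, so it should not depend on whether the .groups fix is installed. The variable names (grouped, vars_as_tibble, vars_collect) are made up for illustration, and the behaviour is what I would expect from the note above rather than something checked against every dtplyr version.

```r
library(dplyr)
library(dtplyr)

# Same toy data as in the reprex above
data <- tibble(group1 = rep(1:2, each = 3), group2 = rep(3:4, each = 3), value = 1:6)

# A grouped lazy data.table; nothing is evaluated yet
grouped <- data %>%
  lazy_dt() %>%
  group_by(group1)

# as_tibble() is expected to return an ungrouped tibble
vars_as_tibble <- grouped %>% as_tibble() %>% group_vars()

# collect() is expected to return a tibble that keeps the grouping
vars_collect <- grouped %>% collect() %>% group_vars()

print(vars_as_tibble)
print(vars_collect)
```

If the grouping matters downstream, ending the pipeline with collect() rather than as_tibble() looks like the safer choice.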
Created on 2021-05-18 by the reprex package (v2.0.0)