summarise ignoring .groups argument #245

Closed
sbashevkin opened this issue May 19, 2021 · 1 comment · Fixed by #265

@sbashevkin

I've noticed that summarise(), when used within a dtplyr pipeline, ignores the .groups argument and instead creates a new ".groups" column. When the .groups argument is left out, the resulting tibble doesn't retain any grouping, whereas plain dplyr keeps the result grouped by the first grouping column.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(dtplyr)
#> Warning: package 'dtplyr' was built under R version 4.0.5

# Data
data<-tibble(group1=rep(1:2, each=3), group2=rep(3:4, each=3), value=1:6)

# With regular dplyr
data%>%
    group_by(group1, group2)%>%
    summarise(value_mean=mean(value), .groups="drop")
#> # A tibble: 2 x 3
#>   group1 group2 value_mean
#>    <int>  <int>      <dbl>
#> 1      1      3          2
#> 2      2      4          5

# With dtplyr, .groups argument is ignored and turned into a new column
data%>%
    lazy_dt()%>%
    group_by(group1, group2)%>%
    summarise(value_mean=mean(value), .groups="drop")%>%
    as_tibble()
#> # A tibble: 2 x 4
#>   group1 group2 value_mean .groups
#>    <int>  <int>      <dbl> <chr>  
#> 1      1      3          2 drop   
#> 2      2      4          5 drop

# With dtplyr without the .groups argument, grouping is still removed
data%>%
    lazy_dt()%>%
    group_by(group1, group2)%>%
    summarise(value_mean=mean(value))%>%
    as_tibble()
#> # A tibble: 2 x 3
#>   group1 group2 value_mean
#>    <int>  <int>      <dbl>
#> 1      1      3          2
#> 2      2      4          5

# Whereas the same pipeline with just dplyr would preserve grouping
data%>%
    group_by(group1, group2)%>%
    summarise(value_mean=mean(value))
#> `summarise()` has grouped output by 'group1'. You can override using the `.groups` argument.
#> # A tibble: 2 x 3
#> # Groups:   group1 [2]
#>   group1 group2 value_mean
#>    <int>  <int>      <dbl>
#> 1      1      3          2
#> 2      2      4          5

Created on 2021-05-18 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.3 (2020-10-10)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/Los_Angeles         
#>  date     2021-05-18                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source        
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
#>  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.3)
#>  cli           2.5.0   2021-04-26 [1] CRAN (R 4.0.5)
#>  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.3)
#>  data.table    1.14.0  2021-02-21 [1] CRAN (R 4.0.5)
#>  DBI           1.1.1   2021-01-15 [1] CRAN (R 4.0.3)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  dplyr       * 1.0.6   2021-05-05 [1] CRAN (R 4.0.3)
#>  dtplyr      * 1.1.0   2021-02-20 [1] CRAN (R 4.0.5)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.0.5)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi         0.4.2   2021-01-15 [1] CRAN (R 4.0.3)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.0.5)
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.0.3)
#>  knitr         1.33    2021-04-24 [1] CRAN (R 4.0.5)
#>  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.4)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  pillar        1.6.0   2021-04-13 [1] CRAN (R 4.0.5)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  ps            1.6.0   2021-02-28 [1] CRAN (R 4.0.5)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  reprex        2.0.0   2021-04-02 [1] CRAN (R 4.0.5)
#>  rlang         0.4.11  2021-04-30 [1] CRAN (R 4.0.5)
#>  rmarkdown     2.8     2021-05-07 [1] CRAN (R 4.0.5)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.0.3)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  stringi       1.6.1   2021-05-10 [1] CRAN (R 4.0.3)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler        1.4.1   2021-03-30 [1] CRAN (R 4.0.5)
#>  tibble        3.1.1   2021-04-18 [1] CRAN (R 4.0.5)
#>  tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.0.5)
#>  utf8          1.2.1   2021-03-12 [1] CRAN (R 4.0.5)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.0.5)
#>  withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.5)
#>  xfun          0.22    2021-03-11 [1] CRAN (R 4.0.5)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] C:/Users/sbashevkin/Documents/R/R-4.0.3/library
@mgirlich
Collaborator

mgirlich commented Jul 2, 2021

I added support for the .groups argument in this PR.
Note that as_tibble() drops the grouping. This was a breaking change in tibble 2.0 (see the tibble NEWS). To keep the grouping, use collect() instead.
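
For anyone comparing the two, here is a minimal sketch of the difference between as_tibble() and collect() on a lazy data table. It assumes a dtplyr release that already includes the .groups fix and reuses the data from the reprex above; the object names (lazy_summary, via_as_tibble, via_collect) are only illustrative.

```r
library(dplyr)
library(dtplyr)

# Same toy data as in the original report
data <- tibble(group1 = rep(1:2, each = 3), group2 = rep(3:4, each = 3), value = 1:6)

# Build the lazy pipeline once; "drop_last" keeps group1 as a grouping variable
lazy_summary <- data %>%
  lazy_dt() %>%
  group_by(group1, group2) %>%
  summarise(value_mean = mean(value), .groups = "drop_last")

# as_tibble() materialises the result but discards the grouping
via_as_tibble <- as_tibble(lazy_summary)

# collect() materialises the result and re-applies the remaining groups
via_collect <- collect(lazy_summary)

group_vars(via_as_tibble) # expected: no grouping variables
group_vars(via_collect)   # expected: "group1"
```

On an older dtplyr where .groups is still turned into a column, leaving the argument out and checking group_vars() after collect() should show the same contrast, though I have not verified that across versions.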
