Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support using by instead of keyby in grouped operations #136

Merged
merged 11 commits into from
Dec 24, 2019

Conversation

psanker
Copy link
Contributor

@psanker psanker commented Dec 8, 2019

Addresses #85

This amends how keyby is added to the translated calls by creating a separate structure for by/keyby and a linking function to link the object to the call. It's a bit more flexible than checking if there are groups and then adding them. Users are able to access by= by setting key = FALSE in group_by():

lazy_dt(mtcars, "mt") %>%
  group_by(cyl, key = FALSE) %>%
  summarize(mean_mpg = mean(mpg)) %>%
  show_query()

# mt[, .(mean_mpg = mean(mpg)), by = .(cyl)]

I'm still not certain about the term key for the parameter name since it could be confused with key_by. Maybe it could be something like use_by to be explicit?

Originally the first parameter was "step", but since the first parameter
should be a call, it didn't make sense to have it named "step".
Removed commented code and updated translation vignette

Fix silly typo

Update NEWS.md with key parameter

Renamed link_by_struct parameter for clarity

Originally the first parameter was "step", but since the first parameter
should be a call, it didn't make sense to have it named "step".

Renamed 'with' to 'using'

with had a somewhat clear meaning, but using makes the most sense
("using by" or "using keyby").
Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — this looks like a great start!

NEWS.md Outdated Show resolved Hide resolved
R/step-group.R Outdated Show resolved Hide resolved
∆ `group_by` parameter `key` is now `arrange` to follow dplyr naming
conventions.
∆ Internally, `keyed` is now `keyby` to be explicit about
which data.table functionality it is associated
∆ The `by_struct` system is now just a function that modifies a call.
Currently accepts three variables (call, groups, and keyby), but
probably could be reduced to the call and current step.
Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a few more smaller comments

R/step-group.R Outdated Show resolved Hide resolved
vignettes/translation.Rmd Outdated Show resolved Hide resolved
R/step-group.R Outdated Show resolved Hide resolved
psanker and others added 2 commits December 23, 2019 17:42
∆ The `keyby` parameter in the step objects is now `arrange` for
consistency
∆ `add_grouping_parameter` renamed to `add_grouping_param` and now has
two arguments: a call and a step
∆ Updated translation vignette
@hadley hadley merged commit d533983 into tidyverse:master Dec 24, 2019
@hadley
Copy link
Member

hadley commented Dec 24, 2019

Thanks!

@GitHunter0
Copy link

@psanker , thank you, this was a good issue to solve.

The only problem left is that dplyr::group_by() does not have the parameter key, which makes dtplyr and dplyr code not perfectly interchangeable...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants