Expression of column names in by: clarify the function of list and parentheses #2391

Open
Henrik-P opened this issue Sep 28, 2017 · 2 comments


Henrik-P commented Sep 28, 2017

I'm slightly confused by the use of list (or not) in by.

From ?data.table:

by accepts a list() of expressions of column names

This might be nitpicking, but as long as you don't name your expression, the list doesn't seem to be needed. An unnamed expression in by works without list:

d <- data.table(x = 1:4, y = 2:5)
d[ , sum(y), by = x %% 2]
#    x V1
# 1: 1  6
# 2: 0  8

Furthermore, wrapping a named expression in parentheses instead of list works:

d[ , sum(y), by = (grp = x %% 2)]
#    grp V1
# 1:   1  6
# 2:   0  8

I haven't found anything in the docs on the use of ( in by. The description in Reference semantics, "e) Multiple columns and :=", seems like something else.


While being confused by the parentheses anyway, I just tried them in j:

d[ , sum_y = sum(y), by = (grp = x %% 2)]
# Error, fine.

# wrap j in parentheses -> no error, albeit naming of result fails
d[ , (sum_y = sum(y)), by = (grp = x %% 2)]
#    grp V1
# 1:   1  6
# 2:   0  8
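
For comparison, here is a minimal sketch of the documented list form, using the `.()` alias for `list()` in both j and by. It assumes the same `d` as above; the name `res` is introduced only for illustration. With the list form the names given in j and by are kept in the result.

```r
library(data.table)

d <- data.table(x = 1:4, y = 2:5)

# Name the grouping expression and the aggregate via list(), here spelled .()
res <- d[ , .(sum_y = sum(y)), by = .(grp = x %% 2)]

# Both names carry through to the result columns
names(res)
```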

At this point, I suppose I should provide some constructive comments on the docs, but first I want to check that I haven't overlooked something obvious here. Most likely I have.

Thanks for your great work!


eantonya commented Sep 28, 2017

There is nothing special about the (). Parentheses in R are a function (and so is the = assignment operator) that returns a value, so it's as if you wrote d[, somefunction(y), by = someotherfunction(x)]. When you write (a = blah), the parentheses return the result of the assignment, i.e. the value of a. It's equivalent to writing

"("("="(a, blah))

to give you the expanded functional form.
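
The equivalence described above can be checked in plain R, outside data.table. This is a small sketch only: `blah`, `b`, `res_infix` and `res_prefix` are placeholder names chosen for illustration, and backticks are used in place of the quoted function names.

```r
# `blah` stands in for any value, as in the comment above
blah <- c(10, 20, 30)

# Infix form: assign to a, and let the parentheses return the assigned value
res_infix <- (a = blah)

# Prefix ("functional") form of the same expression, assigning to b
res_prefix <- `(`(`=`(b, blah))

# Both forms return the assigned value, and both perform the assignment
identical(res_infix, res_prefix)
identical(a, b)
```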


MichaelChirico commented Sep 28, 2017 via email
