by= ignored if there is no j expression #3263

Closed
st-pasha opened this issue Jan 9, 2019 · 8 comments

@st-pasha
Contributor

st-pasha commented Jan 9, 2019

> DT = data.table(A=c(1,2,1,2,1,2), B=c(1,2,1,1,2,2))
> DT[, , by=A+B]  # result ignores the by= clause
   A B
1: 1 1
2: 2 2
3: 1 1
4: 2 1
5: 1 2
6: 2 2
> DT[, .(A, B), by=A+B]  # this is correct, DT[,,by=A+B] expected to be same as this
   A + B A B
1:     2 1 1
2:     2 1 1
3:     4 2 2
4:     4 2 2
5:     3 2 1
6:     3 1 2

st-pasha added the bug label Jan 9, 2019

@franknarf1
Contributor

Related #1269 (comment) and later comments

@MichaelChirico
Member

Not sure I follow your reasoning for your second example... Is that for convenience? It's easy to imagine I have some complicated expression by = f(x1, .., xn) and don't really care about the inputs to f.

@st-pasha
Contributor Author

One could argue whether DT[,,B] makes sense or not.
If it doesn't, then an error should be thrown.
If it does, then the result should be the DT reordered in such a way that the values in column B are grouped together. And we should also decide whether B ought to be moved to the front or kept in its place, or both.

However, I don't think silently ignoring the by argument is the right thing to do...
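
For reference, the reordering described above can already be requested explicitly by passing .SD as j. This is a minimal sketch on the DT from the original report, not a proposal for what an empty j should mean; the name res is only for illustration.

```r
library(data.table)

# Same table as in the original report
DT = data.table(A=c(1,2,1,2,1,2), B=c(1,2,1,1,2,2))

# .SD is the subset of data for each group, so this returns every row,
# grouped by B (groups in order of first appearance), with the
# grouping column B moved to the front.
res = DT[, .SD, by=B]
print(res)
```

Here the rows with B == 1 come first and the rows with B == 2 follow, which is one possible reading of "grouped together"; whether B should move to the front is exactly the open question raised above.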

@MichaelChirico
Member

MichaelChirico commented Jan 10, 2019 via email

@st-pasha
Contributor Author

But the second example already does the right thing: DT[, .(A,B), by=A+B] means "group the data.table by the expression A+B, then select columns A and B". And that's exactly what it does, also adding the group-by column A+B at the front.

@mattdowle
Member

Please use dev not release when reporting bugs.

> DT = data.table(A=c(1,2,1,2,1,2), B=c(1,2,1,1,2,2))
> DT[, , by=A+B] 
...
Warning message:
In `[.data.table`(DT, , , by = A + B) :
  i and j are both missing so ignoring the other arguments
>

@MichaelChirico
Member

@st-pasha sorry, I see the confusion. I was referring to your comment:

this is correct, DT[,,by=A+B] expected to be same as this

@mattdowle
Member

Closing because the warning is there and that was the focus of this issue.
Can follow up in #3262 perhaps.


4 participants