g[ error with by=rleid and j=.(.I[1L], v1[1L]) #1683

Closed
franknarf1 opened this issue Apr 28, 2016 · 3 comments

Comments

@franknarf1
Contributor

Ran into this on SO:

DT = setDT(structure(list(Name = c("John", "John", "John", "John", "John", 
"John", "John", "John", "Tom", "Tom", "Tom", "Tom", "Tom", "Tom", 
"Tom"), Level = c(1L, 1L, 2L, 2L, 3L, 4L, 4L, 7L, 1L, 2L, 2L, 
3L, 4L, 4L, 7L), Date = structure(c(16801L, 16810L, 16817L, 16818L, 
16822L, 16826L, 16827L, 16829L, 16810L, 16817L, 16818L, 16822L, 
16826L, 16827L, 16829L), class = c("IDate", "Date"))), .Names = c("Name", 
"Level", "Date"), class = "data.frame", row.names = c(NA, -15L
)))

#1
DT[, .(.I[1L], Date[1L]), by=.(Name, rleid(Level))]
# Error in `g[`(.I, 1L) : grpn [15] != length(x) [0] in ghead

#2
DT[, {.(.I[1L], Date[1L])}, by=.(Name, rleid(Level))]
# Error in `g[`(.I, 1L) : grpn [15] != length(x) [0] in ghead

#3
DT[, {V1 = .I[1L]; V2 = Date[1L]; .(V1, V2)}, by=.(Name, rleid(Level))]
# (expected output)

I tested on an earlier version of 1.9.7 and the second approach worked. After upgrading, only the third one works, so I guess the error was introduced by a recent change.

This is the first time I've seen data.table:::`g[` or any function named like that.

@MichaelChirico
Member

MichaelChirico commented Apr 28, 2016

Definitely coming from GForce, see:

DT[, .(.I[1L], Date[1L]), by=.(Name, rleid(Level)), verbose = TRUE]

Detected that j uses these columns: Date
Finding groups (bysameorder=FALSE) ... done in 0secs. bysameorder=TRUE and o__ is length 0
lapply optimization is on, j unchanged as 'list(.I[1L], Date[1L])'
GForce optimized j to 'list(`g[`(.I, 1L), `g[`(Date, 1L))'

No error with DT[ , .(Date[1L]), by = .(Name, rleid(Level))], nor with DT[ , .(.I[1L]), by = .(Name, rleid(Level))].

From examining the verbose output, it seems like GForce is turned on when j = .(Date[1L]), but not when j = .(.I[1L]).

Going a bit further, the line in [.data.table that causes the error is ans = eval(jsub, thisEnv), where jsub is substitute(list(`g[`(.I, 1L), `g[`(Date, 1L))). But thisEnv doesn't have the variable .I in it, probably for the same reason GForce isn't activated when we just use .I by itself.

Question is, why isn't GForce activated for DT[ , .I[1L], by = V1], but is for DT[ , .(.I[1L], Date[1L]), by = V1]?
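
Since the error only appears once j has been rewritten by GForce, one possible stopgap (a sketch, not something proposed in this thread) is to lower the datatable.optimize option to 1 for the affected call, which keeps the lapply optimization but skips GForce. The small table below is made up for illustration; it is not the data from the report.

```r
library(data.table)

# Hypothetical toy data, smaller than the table in the original report
DT = data.table(
  Name  = rep(c("John", "Tom"), each = 4L),
  Level = c(1L, 1L, 2L, 2L, 1L, 2L, 2L, 3L),
  Date  = as.IDate("2016-01-01") + 0:7
)

# Level 1 leaves GForce (level 2) switched off, so .I[1L] and Date[1L]
# are evaluated group by group instead of being rewritten to `g[`
old = options(datatable.optimize = 1L)
res = DT[, .(.I[1L], Date[1L]), by = .(Name, rleid(Level))]
options(old)  # restore the previous optimization level

print(res)
```

Adding verbose = TRUE to the call should confirm whether the "GForce optimized j to ..." line is still printed.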

arunsrinivasan added a commit that referenced this issue May 3, 2016
Closes #1683, .I[1L] is optimised for GForce.
arunsrinivasan added this to the v1.9.8 milestone May 3, 2016
@arunsrinivasan
Member

Thanks @MichaelChirico for the PR.

@MichaelChirico
Member

thanks @franknarf1! would never have noticed .I isn't working with GForce
