Having arg #4412
Codecov Report
@@            Coverage Diff             @@
##           master    #4412      +/-   ##
==========================================
- Coverage   99.61%   99.49%   -0.12%
==========================================
  Files          72       72
  Lines       13917    14047     +130
==========================================
+ Hits        13863    13976     +113
- Misses         54       71      +17
Looks promising, but I would wait for the API discussions to be settled before implementing.
R/data.table.R
Outdated
if (is.atomic(e) || exists(as.character(e), env)) {
  ans = e
} else {
  ans = eval(e, env)
Please add comments giving an example of `e` for both cases.
@jangorecki The |
Too early to say it addresses #788. Very much WIP. I am hoping for general feedback from @jangorecki before continuing this path.
Per the initial FR, this includes a new `having` argument that requires each group to return a logical vector of length one. Right now only `gForce` functions and primitive functions are allowed. I can work on PRs for `any`, `all`, `which.max`, and `which.min` gforce funs, which would be helpful for this.
New `vecseq_having`
The subsetting workhorse is `vecseq_having`. The new function returns an integer vector with additional attributes if `retGrps` is true.
New recursive parser
Current `GForce` optimizations go one deep. That is, `mean(x)` will be optimized while `mean(x == 2L)` would not be. To account for this, a new function examines each part of an expression to determine whether it is a `gfun`, `is.primitive`, or a `name`, and whether that name exists inside or outside of the environment. This allows `mean(x == 2L)` to be optimized, as well as `mean(x) > 3 & .N > 5L`.
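The recursive check described here can be sketched in base R. This is a rough illustration under stated assumptions, not the code in this PR: `can_optimize`, its `cols` argument, and the `gfuns` allow-list are hypothetical names, and the allow-list is not data.table's actual set of GForce functions.

```r
# Illustrative allow-list only; NOT data.table's real list of GForce functions.
gfuns <- c("mean", "sum", "min", "max", "median")

# Hypothetical helper: TRUE when every function used in `e` is either in the
# allow-list or a primitive, and every symbol is a column, .N, or a variable
# visible from `env`.
can_optimize <- function(e, cols, env = parent.frame()) {
  force(env)
  if (is.atomic(e)) return(TRUE)                 # constants such as 2L or 3
  if (is.name(e)) {
    nm <- as.character(e)
    return(nm %in% cols || nm == ".N" || exists(nm, envir = env))
  }
  if (is.call(e)) {
    fn <- e[[1L]]
    if (!is.name(fn)) return(FALSE)              # e.g. an anonymous function call
    fname <- as.character(fn)
    ok_fun <- fname %in% gfuns ||
      (exists(fname, mode = "function") &&
         is.primitive(get(fname, mode = "function")))
    # Recurse into each argument so nested calls are checked too.
    args_ok <- vapply(as.list(e)[-1L], can_optimize, logical(1L),
                      cols = cols, env = env)
    return(ok_fun && all(args_ok))
  }
  FALSE
}

can_optimize(quote(mean(x == 2L)), cols = "x")           # TRUE
can_optimize(quote(mean(x) > 3 & .N > 5L), cols = "x")   # TRUE
can_optimize(quote(paste(x) == "a"), cols = "x")         # FALSE: paste is a closure
```

Both expressions from the description pass because `==`, `>`, and `&` are primitives, while a call to an ordinary closure such as `paste` is rejected. Edge cases such as empty arguments are not handled in this sketch.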
Performance
To do:
- `irows` subset
- `by` cols for correct order and any ad hoc columns
- `.SD` in `j`
- `rleid(x) < 5`, which would evaluate to a logical vector with length equal to the number of rows in the `data.table`.
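The length-one requirement on `having`, and the problem raised by a per-row expression like the last item, can be illustrated with a small base R sketch. It is only an approximation of the intended semantics: `having_rows` is a hypothetical helper, it works on a plain data frame, and it is unrelated to how `vecseq_having` is implemented.

```r
# Hypothetical helper: row numbers of every group for which `cond` evaluates
# to a single TRUE. `.N` is supplied as the group size.
having_rows <- function(df, by, cond, enclos = parent.frame()) {
  force(enclos)
  idx <- split(seq_len(nrow(df)), df[[by]])      # row numbers per group
  keep <- vapply(idx, function(i) {
    env <- as.list(df[i, , drop = FALSE])        # the group's columns
    env$.N <- length(i)
    res <- eval(cond, env, enclos)
    if (!is.logical(res) || length(res) != 1L)
      stop("having must return a logical vector of length one per group")
    isTRUE(res)
  }, logical(1L))
  sort(as.integer(unlist(idx[keep], use.names = FALSE)))
}

df <- data.frame(g = c("a", "a", "a", "b", "b"), x = c(4, 5, 6, 1, 2))

# Group "a" has mean 5 and three rows, group "b" has mean 1.5 and two rows.
having_rows(df, "g", quote(mean(x) > 3 & .N > 2L))   # rows 1, 2, 3

# A per-row condition yields one value per row, so it is rejected here.
tryCatch(having_rows(df, "g", quote(x < 5)),
         error = function(e) conditionMessage(e))
```

Whether a per-row condition should raise an error, as in this sketch, or be handled some other way is a design question the to-do item leaves open.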