[Request] lagging of lists with shift() #1595
Comments
MRE with expected output please. Hard to follow. But I'm not sure what's unclear in the documentation. The [...] Perhaps your use case is more relevant with list-of-lists? It's not a common usage for [...] Since I don't understand what / why this is needed, I'm reluctant to add support for list-of-list types. An MRE would help. Feel free to reopen after.
@arunsrinivasan OP shows their desired output on SO (not sure why they didn't link it): http://stackoverflow.com/a/36041367/1191259 If I understand correctly, the short version is:
Personally, I don't really need it. I try not to do anything fancy with list columns.
Thanks @franknarf1. MRE is:
require(data.table)
dt = data.table(x=1:2, y=list(3:4, 5:6))
dt[, z := shift(y)] # OP expects dt[, z := list(list(NA, 3:4))]
Is that right?
Yep, that's my understanding.
How about:
Seems to work fine with groups as well.
dt=data.table(x=c(1,1,2), y=list(1:2,3:4,5:6))
dt[, z := .(y[shift(.I)]), by=x]
#    x   y    z
# 1: 1 1,2 NULL
# 2: 1 3,4  1,2
# 3: 2 5,6 NULL
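One caveat on the idiom above: `.I` holds row numbers in the whole table, while `y` inside a grouped `j` is only that group's slice. As far as I can tell, `y[shift(.I)]` therefore lines up only for the first group, or for groups with a single row, as in the example. Below is a minimal sketch that uses within-group positions via `seq_len(.N)` instead. The four-row table is made up for illustration and is not from the thread.

```r
library(data.table)

# Hypothetical data: the second group has two rows, so global row
# numbers (.I) and within-group positions differ there.
dt <- data.table(x = c(1, 1, 2, 2), y = list(1:2, 3:4, 5:6, 7:8))

# shift(seq_len(.N)) gives c(NA, 1, ..., .N - 1) within each group.
# Indexing a list with NA yields a NULL element, which marks
# "no previous row" for the first row of each group.
dt[, z := .(y[shift(seq_len(.N))]), by = x]

dt$z
```

The `.()` wrapper matters here: it tells `:=` that the right-hand side is a single list column rather than a list of columns.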
Yeah, that's a good idiom, I think. I think the documentation is fine; probably no need to tag this FR with it.
I think the doc is quite clear. And this is not an intended use case for [...]
Added list-of-list support.
Actually... does this work?
edit: this seems to work then
On Oct 1, 2016 11:07 AM, "statquant" notifications@github.com wrote:
The way shift() works on lists, it can't be used to create lagged columns of type list the same way that it can create lagged columns of other types. Example: from SO
It would be useful if shift() treated lists the same. The current behavior is technically documented (?shift says how it works on lists), but the docs describe that behavior outside the context of data.table, and the implication for within-data.table behavior isn't clear until you stumble into it. In the context of data.table, where it functions as a really nice way to lag/lead columns by group, shift's behavior with lists seems inconsistent with how it works on other types.
General use case: Being able to compare a set to itself over time
Example specific use case: Say I have a table of all of the people who have come to my birthday each year, one row per name per year. If I wanted to see how many people came this year that didn't come last year, and how many came last year that didn't come this year, and the people who come every year, and all the people who have ever come, I'd be able to do it all at once with something like this pseudocode:
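The OP's pseudocode did not survive in this copy of the thread. The following is not a reconstruction of it, only a rough sketch of the use case described, written with the index-based lag idiom discussed in the comments so it does not depend on how `shift()` treats lists. The table `guests`, its columns, and the names in it are all made up for illustration.

```r
library(data.table)

# Hypothetical attendance data: one row per name per year.
guests <- data.table(
  year = c(2014, 2014, 2015, 2015, 2016, 2016),
  name = c("Ann", "Bob", "Bob", "Cy", "Bob", "Dee")
)

# One row per year, with that year's attendees held in a list column.
sets <- guests[, .(attendees = list(name)), keyby = year]

# Lag the list column by one year; the NA index gives NULL for the first year.
sets[, previous := .(attendees[shift(seq_len(.N))])]

# Came this year but not last year (everyone counts as new in the first year).
sets[, new_this_year := lengths(Map(setdiff, attendees, previous))]
# Came last year but not this year.
sets[, dropped := lengths(Map(setdiff, previous, attendees))]

# People who came every year, and everyone who ever came.
every_year <- Reduce(intersect, sets$attendees)
ever_came  <- Reduce(union, sets$attendees)
```

With a lagged list column in place, each year-over-year set comparison becomes a single `Map()` call over the current and previous columns.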