uniqueN could support list input #1224
Comments
@jangorecki this seems to do something different.
dt = data.table(x=c(1,1,2,3), y=c(2,2,3,4))
nrow(unique(dt)) # [1] 3
length(unique(as.list(dt))) # [1] 2
It seems not to check row-wise, but rather whether any element of the list is identical to another. |
I used
library(data.table)
d1 <- data.table(l = list(list(letters[1:2]),list(Sys.time()),list(1:10),list(letters[1:2])))
d1[,length(unique(l))]
d1[,uniqueN(l)] |
Okay. I guess the idea here is that a list is a vector as well; that's fine.
d1 <- data.table(a = c(1:3,1), l = list(list(letters[1:2]),list(Sys.time()),list(1:10),list(letters[1:2])))
nrow(unique(as.data.frame(d1))) # [1] 3
We'd want that to work if someone did:
d1[, uniqueN(.SD)]
This wouldn't work at the moment. Adding a check as to whether any column is of |
Maybe for single column case we can already use the fix you've provided, and take care of |
This should not affect
data.table(a = list(1:2,4))[,is.data.frame(.SD)]
# [1] TRUE
data.table(a = list(1:2,4))[,uniqueN(a)]
# [1] 2
data.table(a = list(1:2,4))[,uniqueN(.SD)]
#Error in forderv(x, by = by, retGrp = TRUE) :
# First column being ordered is type 'list', not yet supported
data.table(aa = 1:2, a = list(1:2,4))[,uniqueN(.SD)]
# [1] 2
It can still throw an error in the specific case of a list column as the first column, but this is related to #1229. |
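The list-column case above can be sketched in base R only. This is a minimal illustration, not data.table's implementation: `has_list_col` is a hypothetical helper name, and the example assumes a plain data.frame rather than `.SD`. It shows that list columns can be detected up front and that base R's `unique()` still de-duplicates rows when one is present.

```r
# Hypothetical helper (not part of data.table): TRUE if any column is a list
has_list_col <- function(df) any(vapply(df, is.list, logical(1)))

df <- data.frame(aa = 1:2)
df$a <- list(1:2, 4)   # attach a list column to a plain data.frame

has_list_col(df)       # TRUE
# base R compares whole rows, list column included
nrow(unique(df))       # 2
```

A check like this could decide whether to take an ordering-based path or fall back to the base R route, at the cost of base R's slower row comparison.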
uniqueN supports any types, solves #1224
This fix returns wrong result for |
@arunsrinivasan I'm not sure I follow. Are you saying that a list should be handled the same as a data.table? I don't think that is proper behavior for a list, while it is for a data.table. |
Nope. I'm comfortable with |
I will quote from the source code; it is quite on point.
# simple straightforward helper function to get the number
# of groups in a vector or data.table. Here by data.table,
# we really mean `.SD` - used in a grouping operation |
That was done before your update to lists. It just wasn't updated. What's your point? |
Will revisit later if the issue of lists comes up. I'll revert this functionality to atomics / data.frames for now. |
@arunsrinivasan yes, just doing it now. |
A use case for this is in the syntax:
Could be handled with The latter in particular pokes at: I was also confused by the documentation, which has:
I was surprised to see a test for |
And not just lists but any type which is not supported by uniqueN could simply redirect to length(unique(.)).
Looks pretty easy to do, I will try to make a PR for that.
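As a rough sketch of the proposed redirect, the fallback can be written with base R alone. `count_unique` is a hypothetical name used only for illustration; it is not the function that was eventually merged, and it assumes the input is either a vector or a list.

```r
# Hypothetical helper (not data.table's uniqueN): count distinct elements
# using the base R fallback named in the issue, length(unique(.)).
count_unique <- function(x) {
  # unique() compares elements one by one, for atomic vectors and lists alike
  length(unique(x))
}

count_unique(c(1, 1, 2, 3))                           # 3
count_unique(list(letters[1:2], 1:10, letters[1:2]))  # 2
```

Note that on a list this counts distinct elements, not distinct rows, which is the distinction the comments in this thread discuss.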