[R-Forge #1841] Implement a fast droplevels.data.table method. #647
Comments
The code in the above post is no longer reproducible. Regarding a data.table method: I've made one based on the droplevels.data.frame method. It is not a fast one, as the
library(data.table)

# Drop unused factor levels from every factor column of a data.table.
# except:   numeric indices of columns to leave untouched.
# in.place: if TRUE, modify x by reference instead of working on a copy.
droplevels.data.table <- function(x, except = NULL, in.place = FALSE, ...){
  stopifnot(length(x) > 0L, is.logical(in.place))
  ix = vapply(x, is.factor, NA)
  if(!is.null(except)){
    stopifnot(is.numeric(except), except <= length(x))
    ix[except] = FALSE
  }
  if(!any(ix)) return(x)
  if(!in.place) x = copy(x)
  for(nx in names(ix)[ix]){
    # factor() rebuilds the levels from the values actually present
    set(x, i = NULL, j = nx, value = factor(x[[nx]]))
  }
  return(x)
}
Update: Confirmed by @arunsrinivasan that it can be improved with internal fast
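The function above boils down to two steps: pick out the factor columns, then re-encode each one with factor(). A minimal base-R sketch of just those two steps, using a plain list in place of a data.table (the `cols` object and its values are made up for illustration and do not come from the thread):

```r
# Hypothetical columns: `name` carries an unused level "a".
cols <- list(
  name  = factor(c("b", "c"), levels = c("a", "b", "c")),
  value = 2:3
)

# Step 1: flag the factor columns (same idiom as in the function above).
is_fac <- vapply(cols, is.factor, NA)

# Step 2: re-encode only those columns; factor() keeps the levels that occur.
cols[is_fac] <- lapply(cols[is_fac], factor)

levels(cols$name)   # "b" "c"
cols$value          # untouched: 2 3
```

This copies the affected columns, which is why the function in the comment uses set() to assign by reference when in.place = TRUE.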
I think we could cut this even further by not using
@ben-schwen amazing! That could be proposed to R Core, right? With only slight modifications to use
@ColeMiller1 we could, but I'm not convinced it has a good chance to really make it into
Submitted by: Steve Lianoglou; Assigned to: Nobody; R-Forge link
The default droplevels.data.frame function is invoked when droplevels is called on a data.table. This results in broken output from droplevels, e.g. as reported by pchalasani on the mailing list:
d <- data.table(name = c('a','b','c'), value = 1:3)
dt <- data.table(d)
setkey(dt,'name')
dt1 <- subset(dt,name != 'a') # or dt1 <- dt[ name != 'a' ]
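The example above is cut off before the droplevels call, so the exact failure is not shown here. As a rough illustration of the underlying symptom only, this base-R sketch uses a plain data.frame with an explicit factor column (an assumption; it is not the data.table from the report) to show that a subset keeps the level of the removed row:

```r
# Assumed stand-in for the reported data: `name` is made a factor explicitly.
df <- data.frame(name = factor(c("a", "b", "c")), value = 1:3)

# Drop the row with name "a"; the level "a" is still attached to the column.
df1 <- df[df$name != "a", ]

levels(df1$name)          # "a" "b" "c"
table(df1$name)[["a"]]    # 0: the stale level shows up with a zero count
```

A working droplevels method is expected to leave only "b" and "c" as levels of `name`.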