error using function setkey #3496
I can reproduce this with data.table 1.12.0 on both Windows 10 and macOS 10.13.6. After upgrading data.table to 1.12.2 on my macOS system, I can confirm that all examples below give the same result. A somewhat simpler example:
This also happens when you assign the key directly with
As you can see the order of the 1) not copying a dataframe column
2) creating an arbitrary new column in the dataframe
3) creating a new column by sampling an existing column and pasting some extra characters to it
4) creating a new column by just sampling an existing column
5) copying a column the data.table way
Although this bug seems to occur in a (very) specific use case, this might be critical for me because I use
Confirming @jaapwalhout's observations (with data.table_1.12.3).
options(datatable.verbose = TRUE)
## KO
d <- data.frame(x = c(9, 1), y = c(9, 1))
d$x2 <- d$x
d
# x y x2
# 1 9 9 9
# 2 1 1 1
setDT(d, key = "x")[]
# forder took 0 sec
# reorder took 0 sec
# x y x2
# 1: 9 1 9
# 2: 1 9 1
## KO
d <- data.frame(x = c("9", "1"), y = c(9, 1), stringsAsFactors = FALSE)
d$x2 <- d$x
d
# x y x2
# 1 9 9 9
# 2 1 1 1
setDT(d, key = "x")[]
# forder took 0 sec
# reorder took 0 sec
# x y x2
# 1: 9 1 9
# 2: 1 9 1
## OK (with warning)
d <- data.frame(x = c("9", "1"), y = c(9, 1))
setDT(d)
d$x2 <- d$x
# Assigning to all 2 rows
# RHS for item 1 has been duplicated because NAMED is 2, but then is being plonked. length(values)==2; length(cols)==1)
setkey(d, y, verbose = TRUE)[]
# forder took 0 sec
# reorder took 0 sec
# x y x2
# 1: 1 1 1
# 2: 9 9 9
IIUC, when a column or list element is duplicated without modification, both names point to the same memory address, and setting the key will then "mess things up".
# KO
l <- list(x = c(9, 1), y = c(9, 1))
l[["z"]] <- l[["x"]]
l
setDT(l, key = "x")[]
address(l$x) == address(l$z)
# TRUE
# OK
l <- list(x = c(9, 1), y = c(9, 1))
l[["z"]] <- copy(l[["x"]])
l
setDT(l, key = "x")[]
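A minimal sketch of a workaround built on the observation above, assuming the problem comes only from list elements that share a memory address: deep-copy any element whose address repeats an earlier one before calling setDT(). The names `addrs` and `shared` are illustrative, and this is not necessarily how data.table itself handles the case.

```r
library(data.table)

l <- list(x = c(9, 1), y = c(9, 1))
l[["z"]] <- l[["x"]]   # z points at the same vector as x, not a copy

# Memory address of every element; a repeated address means shared storage.
addrs <- vapply(l, address, character(1))
shared <- duplicated(addrs)

# Deep-copy each element that shares storage with an earlier one,
# so that every column gets its own memory before the key is set.
l[shared] <- lapply(l[shared], copy)

setDT(l, key = "x")[]
```

The same check could be applied to a data.frame before setDT(), since a data.frame is a list of column vectors.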
I know how to avoid this error, but I do not know why it happens when I run the code this way.
Thanks!
Alex