Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data.table issues with reference counting #4093

Closed
mattdowle opened this issue Dec 7, 2019 · 1 comment · Fixed by #4095
Closed

data.table issues with reference counting #4093

mattdowle opened this issue Dec 7, 2019 · 1 comment · Fixed by #4095
Milestone

Comments

@mattdowle
Copy link
Member

mattdowle commented Dec 7, 2019

Hi Matt,

I spent an entire day tracking down an issue that turns out to be a
problem in the data.table internals. Here is a simple reproducible
example:

library(data.table)
d <- data.table(x = c(1, 2), y = c(3, 4), z <- c(5, 6))
d[, x := NULL]
d[, y := NULL]
## .Internal(inspect(d, 1))
## @3234b40 19 VECSXP g0c7 [OBJ,REF(13),ATT] (len=1, tl=1027)
##   @354cde8 14 REALSXP g0c2 [] (len=2, tl=0) 5,6
## ...                         ^
## ...                         |_ zero refcount

After deletion of two leading columns the remaining column has
reference count zero. The reason is that in assign.c you use memmove
on the DATAPTR to shift remaining columns, which does not adjust
reference counts, and then use SET_VECTOR_ELT to replace a shifted
column by R_NilValue, which decrements the reference count the moved
column. You need to re-think this to work with reference counting.
Only modifying vector content with SET_VECTOR ELT is the safe and only
supported approach. You may need to make similar adjustments in other
places.

Best,
Luke

@mattdowle mattdowle added this to the 1.12.7 milestone Dec 7, 2019
@mattdowle
Copy link
Member Author

mattdowle commented Dec 7, 2019

Sadly, although good progress was made on #3301, it wasn't finished (assign.c was known and on the list not yet ticked). Otherwise that would have caught this. When this is complete then, #3301 should be finished and that should catch any remaining via no longer using DATAPTR and not using USE_RINTERNALS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant