segfault unlisting a nested data.frame #2936

Closed
PavoDive opened this issue Jun 16, 2018 · 4 comments · Fixed by #3770

Comments

@PavoDive

I was trying to provide a data.table solution to this SO question: https://stackoverflow.com/questions/50881925/r-expand-nested-dataframe-into-parent but instead caused a segfault that forced me to exit R altogether.

Reproducible data and actions:

id <- c(1551, 1033, 1061, 1262, 1032, 1896, 1080, 1099, 1679, 1690)
fname <- list("Jack","Yogesh","Steven","Richard","Thomas","Craig","David","Aman","Frank","Robert")
mname <- list("B",NULL,"J","I","E","A","R",NULL,"J","E")
 
sub <- as.data.frame(cbind(fname, mname))
master <- as.data.frame(id)
master$personalInfo <- sub
setDT(master)

And the action that caused the segfault:

master[, unlist(personalInfo), by = id]

# Output of sessionInfo()

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.7.0
LAPACK: /usr/lib/lapack/liblapack.so.3.7.0

locale:
 [1] LC_CTYPE=es_CO.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=es_CO.UTF-8        LC_COLLATE=es_CO.UTF-8    
 [5] LC_MONETARY=es_CO.UTF-8    LC_MESSAGES=es_CO.UTF-8   
 [7] LC_PAPER=es_CO.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=es_CO.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.7.4   data.table_1.11.2

loaded via a namespace (and not attached):
[1] compiler_3.4.4 magrittr_1.5   tools_3.4.4    Rcpp_0.12.16   stringi_1.2.2 
[6] stringr_1.3.1 

@jangorecki jangorecki added this to the 1.11.6 milestone Jun 16, 2018
@jangorecki
Member

jangorecki commented Jun 16, 2018

Thanks for the report; reproduced on 1.11.5.

@PavoDive
Author

Thanks, @jangorecki. I really don't know whether it's a valid expectation for a data.table to work with nested data frames. When I try to view the data.table, it gives a gentler error:

> master
Error in FUN(X[[i]], ...) : 
  Invalid column: it has dimensions. Can't format it. If it's the result of data.table(table()), use as.data.table(table()) instead.

Perhaps it would be better if it complained at the time the object is converted to a data.table, i.e. in setDT()?

@MichaelChirico
Copy link
Member

I've been thinking we should fix that "Invalid column: it has dimensions" error...

It's not very helpful: it doesn't name the offending column, and I have never encountered it as a "result of data.table(table())". Usually it's because a data.frame had a data.frame as a column. We could offer better alternatives for the most common cases...
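
As a rough illustration of the kind of check discussed here, the following base R sketch reports which columns carry dimensions (for example a data.frame or matrix stored as a column). The helper name `find_dim_cols` is hypothetical and is not part of data.table; this only shows one way such a check could name the offending column, not how data.table implements it.

```r
# Hypothetical helper (not a data.table function): return the names of
# columns that have a non-NULL dim(), e.g. a data.frame or matrix column.
find_dim_cols <- function(df) {
  has_dim <- vapply(df, function(col) !is.null(dim(col)), logical(1))
  names(df)[has_dim]
}

# Small stand-in for the reporter's data: a data.frame nested as a column.
master <- data.frame(id = c(1551, 1033, 1061))
master$personalInfo <- data.frame(fname = c("Jack", "Yogesh", "Steven"))

bad <- find_dim_cols(master)
if (length(bad) > 0) {
  # Name the columns explicitly, which the existing error message does not do.
  message("Columns with dimensions: ", paste(bad, collapse = ", "))
}
```

Running a check like this before calling setDT() would at least tell the user which column needs to be flattened.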

@mattdowle mattdowle modified the milestones: 1.11.6, 1.12.0 Sep 20, 2018
@mattdowle mattdowle modified the milestones: 1.12.0, 1.12.2 Jan 11, 2019
@jangorecki jangorecki modified the milestones: 1.12.2, 1.12.4 Jan 24, 2019
@MichaelChirico
Member

MichaelChirico commented Aug 24, 2019

Not a segfault anymore... #3770 would have given the error at setDT. Will tag this to close with that PR
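
For anyone who hits this with similar data, one possible workaround is to flatten the nested data.frame into ordinary columns before converting to a data.table. The sketch below uses base R only and a shortened copy of the reporter's data; `to_chr` is a hypothetical helper introduced for illustration, and mapping NULL to NA is an assumption about the desired result.

```r
# Shortened version of the data from the original report.
id <- c(1551, 1033, 1061)
fname <- list("Jack", "Yogesh", "Steven")
mname <- list("B", NULL, "J")

sub <- as.data.frame(cbind(fname, mname))  # two list columns
master <- data.frame(id = id)
master$personalInfo <- sub                 # data.frame nested as a column

# Hypothetical helper: turn a list column into a character vector,
# replacing NULL entries with NA (an assumption about what is wanted).
to_chr <- function(x) {
  vapply(x, function(v) if (is.null(v)) NA_character_ else as.character(v),
         character(1))
}

# Build a flat data.frame with one plain column per nested field.
flat <- data.frame(
  id = master$id,
  fname = to_chr(master$personalInfo$fname),
  mname = to_chr(master$personalInfo$mname),
  stringsAsFactors = FALSE
)
```

After this step `flat` has no columns with dimensions, so it should be safe to pass to setDT().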
