Bug report - RSession Hangs #1470
Comments
I cannot reproduce the hang on Ubuntu with R 3.2.3 and data.table 1.9.7.
Hm, the performance hit seems to be due to optimisation, and it seems to be related to having a lot of columns:
options(datatable.optimize=0L) # without optimisation
system.time(DT[,.SD,by=st])
# user system elapsed
# 0.481 0.012 0.502
options(datatable.optimize=Inf) # with optimisation
system.time(DT[,.SD,by=st])
# user system elapsed
# 53.125 8.002 61.784
Can't reproduce the session hang.
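The comparison above changes a global option and leaves it changed. A minimal sketch of scoping the setting to a single query follows. `with_dt_optimize` is a hypothetical helper, not part of data.table, and the table here is a much smaller stand-in for the one in this thread, so the timings will not match the ones reported.

```r
library(data.table)

# Hypothetical helper (not a data.table function): evaluate `expr` with
# datatable.optimize set to `level`, then restore the previous value.
with_dt_optimize <- function(level, expr) {
  old <- options(datatable.optimize = level)
  on.exit(options(old), add = TRUE)
  force(expr)  # the promise is evaluated here, after the option is set
}

# Small stand-in table: 100 rows, 200 value columns (assumed sizes).
set.seed(1)
DT <- data.table(
  st = rep(state.name, 2),
  matrix(sample(100L * 200L), nrow = 100L, ncol = 200L)
)

res_off <- with_dt_optimize(0L, DT[, .SD, by = st])   # without optimisation
res_on  <- with_dt_optimize(Inf, DT[, .SD, by = st])  # with optimisation
```

Both calls should return the same rows and columns; only the internal code path differs. Wrapping each call in system.time() gives the same comparison as above without touching the session-wide setting.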
@Jorges1000 is this still a problem in the latest releases?
not sure this line with
Looks like it might be
> mt = rep(rownames(mtcars)[1:25],20)
> st = rep(state.name,10)
> DT = data.table(mt=mt, st=st, matrix(sample(1:(30000L*500),30000*500,replace=T),
nrow=500,ncol=30000), key='mt')
> options(datatable.optimize=0L)
> system.time(DT[,.SD,by=st])
user system elapsed
0.512 0.012 0.367
> options(datatable.optimize=Inf)
> system.time(DT[,.SD,by=st])
user system elapsed
25.083 3.157 28.107
> Rprof()
> system.time(DT[,.SD,by=st])
user system elapsed
24.321 2.708 26.897
> Rprof(NULL)
> summaryRprof()
$by.self
               self.time self.pct total.time total.pct
"[.data.table"     13.88    51.26      27.02     99.78
"dotN"             13.12    48.45      13.12     48.45
"gc"                0.06     0.22       0.06      0.22
"c"                 0.02     0.07       0.02      0.07
@MichaelChirico commented here, that call to
Rsession hangs
When a data.table with a large number of columns is queried using .SD, the query first takes much longer than just creating the DT (from about a minute to nearly 10 minutes). Then, after a while, R starts running in the background for a long period of time (5-10 minutes) even when no command has been issued. Activity Monitor shows the rsession process at 100% CPU, and RStudio is unresponsive. Note that the R library is in a custom folder, and this happens more often if many queries are run on DT. I tried turning off auto indexing with options(datatable.auto.index=FALSE), to no avail.
Using the latest versions of RStudio (0.99.489), R (3.2.3), and data.table (1.9.6) under OS X 10.9.5 (Mavericks) on x86_64-apple-darwin13.4.0 (64-bit).
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] data.table_1.9.6 microbenchmark_1.4-2.1
loaded via a namespace (and not attached): Rcpp_0.12.2 digest_0.6.8 MASS_7.3-45 chron_2.3-47 grid_3.2.3 plyr_1.8.3 gtable_0.1.2 magrittr_1.5 scales_0.3.0 ggplot2_1.0.1 stringi_1.0-1 reshape2_1.4.1 proto_0.3-10 tools_3.2.3 stringr_1.0.0 munsell_0.4.2 colorspace_1.2-6
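To check whether the slowdown described in this report grows with the number of columns, a scaled-down reproduction may help. The sketch below is only an illustration: `time_sd_by` is a hypothetical helper, and the column counts are arbitrary small values chosen so that it finishes quickly, well below the 30000 columns used elsewhere in this thread.

```r
library(data.table)

# Hypothetical helper: build a table with `n_cols` integer columns and
# return the elapsed seconds for DT[, .SD, by = st].
time_sd_by <- function(n_cols, n_rows = 500L) {
  st <- rep(state.name, length.out = n_rows)
  m <- matrix(sample.int(1000L, n_rows * n_cols, replace = TRUE),
              nrow = n_rows, ncol = n_cols)
  DT <- data.table(st = st, m)
  unname(system.time(DT[, .SD, by = st])["elapsed"])
}

# Arbitrary, small column counts (assumption); raise them to approach
# the sizes in the original report.
col_counts <- c(100L, 500L, 2000L)
timings <- data.frame(
  n_cols = col_counts,
  elapsed = vapply(col_counts, time_sd_by, numeric(1))
)
print(timings)
```

Running it once with options(datatable.optimize=0L) and once with options(datatable.optimize=Inf) shows whether the gap between the two settings widens as columns are added on a given data.table version.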