Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core dumping R if using a serialized 'big.matrix' object #100

Open
HenrikBengtsson opened this issue Aug 15, 2020 · 10 comments
Open

Core dumping R if using a serialized 'big.matrix' object #100

HenrikBengtsson opened this issue Aug 15, 2020 · 10 comments

Comments

@HenrikBengtsson
Copy link

HenrikBengtsson commented Aug 15, 2020

Hi, I discovered that bigmemory core dumps (=crashes/terminates) parallel workers if attempting to use 'big.memory' objects. This appears to be because there is an assumption that the object is always used in the same R process that it was created in, which does not work because of the external pointer. Here is a minimal reproducible example:

In one R session, do:

X <- bigmemory::as.big.matrix(matrix(1:2, nrow = 2L))
saveRDS(X, "X.rds")
quit("no")

In another R session, do:

> library(bigmemory)
> X <- readRDS("X.rds")
> X[1,1]

 *** caught segfault ***
address 0x10, cause 'memory not mapped'

Traceback:
 1: CGetNrow(x@address)
 2: nrow(x)
 3: nrow(x)
 4: CCleanIndices(as.double(i), as.double(nrow(x)))
 5: GetElements.bm(x, i, j)
 6: .local(x, i, j, ..., drop)
 7: X[1, 1]
 8: X[1, 1]

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 1
R is aborting now ...
Segmentation fault (core dumped)
> library(bigmemory)
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /home/hb/software/R-devel/tags/R-4-0-2/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/tags/R-4-0-2/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bigmemory_4.5.36

loaded via a namespace (and not attached):
[1] bigmemory.sri_0.1.3 compiler_4.0.2      Rcpp_1.0.5  

Suggestion

Instead of core dumping, detect the problem and give an informative error message:

> library(bigmemory)
> X <- readRDS("X.rds")
> X[1,1]
Error: It is not possible to use a 'big.matrix' that was created in another R process

I don't know the internals, but I assume the problem is that the external pointer:

> X <- bigmemory::as.big.matrix(matrix(1:2, nrow = 2L))
> X@address
<pointer: 0x560457f389b0>

is used without making sure it is still valid.

PS. I consider this a quite serious bug since it can core dump R and parallel workers in R and it's hard to protect against it. People who run parallel code might not even know that bigmemory is used as part of some other package they rely on. This is the first package that I know of that use external pointers that also core dumps R, cf. https://cran.r-project.org/web/packages/future/vignettes/future-4-non-exportable-objects.html. It looks like those other packages can detect the problem and prevent core dumping, so, hopefully, it is not to hard to protect against this.

EDIT 2020-08-15 @ 18:11 UTC: Clarified that the bug fix should be to give an informative error message instead of core dumping.

@privefl
Copy link
Contributor

privefl commented Aug 15, 2020

I think this behaviour is fairly known, and this is why descriptors are used instead in parallel settings.

A cleaner way would be to use active bindings of reference classes as proposed in https://github.com/phaverty/bigmemoryExtras by @phaverty, which I've borrowed in bigstatsr.

@HenrikBengtsson
Copy link
Author

I realized my request for a bug fix was not clear. I'm not asking to make it possible to export bigmemory objects. The ask is to detect the problem and give an informative error message rather than core dumping. I've updated my top comment accordingly.

@HenrikBengtsson HenrikBengtsson changed the title Core dumping R if using a serialized big.memory object Core dumping R if using a serialized 'big.matrix' object Aug 15, 2020
@privefl
Copy link
Contributor

privefl commented Aug 15, 2020

Apparently, {magick} has a way to prevent from this:

library(magick)
tiger <- image_read_svg('http://jeroen.github.io/images/tiger.svg', width = 400)
saveRDS(tiger, tmp <- tempfile(fileext = ".rds"))
readRDS(tmp)

Error: Image pointer is dead. You cannot save or cache image objects between R sessions.

But I'm not sure how they manage to do this.

@privefl
Copy link
Contributor

privefl commented Aug 15, 2020

Maybe @jeroen can give a clue.

@jeroen
Copy link

jeroen commented Aug 15, 2020

You just need to check that your XPtr is not NULL, before using it, everywhere where you use:

Rcpp::XPtr<BigMatrix> pMat(bigMatAddr);

@privefl
Copy link
Contributor

privefl commented Aug 16, 2020

I've just tried checking for bigMatAddr == NULL, but it does not seem to do anything?

@jeroen
Copy link

jeroen commented Aug 16, 2020

That gives you the address of the SEXP object itself. To get the externalpointer, I think you need pMat.get() or just *pMat may work as well.

You can also use BigMatrix* mat = pMat.checked_get() which is designed exactly for this cause: it will raise an error if the pointer is NULL.

@privefl
Copy link
Contributor

privefl commented Aug 16, 2020

That is very useful information, thanks!

@privefl
Copy link
Contributor

privefl commented Aug 16, 2020

I've tried using BigMatrix * pMat = (as< XPtr<BigMatrix> >(bigMatAddr)).checked_get(); instead of Rcpp::XPtr<BigMatrix> pMat(bigMatAddr);.
It does not seem to prevent from crashing.

From what I see from {magick}, you do assert_image() in R, not C++.
I still believe that the active binding is the way to go to solve this problem, as it prevents from having to check every single R function.

As {bigmemory} is not using an RC object (but S4), I'm not sure it can use active bindings.
I just wonder if we could just wrap the externalptr class within another one doing the protection.

@jeroen
Copy link

jeroen commented Aug 16, 2020

All that the assert_image function in magick does is call magick_image_dead which checks if the pointer is NULL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants