[Bug] [r] SOMASparseNDArray$read_sparse_matrix_zero_based() raises error when calling from a sparse matrix created by python's from_anndata #1348
Comments
@pablo-gar @eddelbuettel would a reasonable solution here be to use |
I am wondering if there is a better way to not drop the range. When I use the
> uri <- "apis/python/notebooks/data/sparse/pbmc3k/ms/RNA/X/data"
> lapply(dimensions(domain(schema(uri))), \(x) domain(x))
[[1]]
integer64
[1] 0 9223372036854773759
[[2]]
integer64
[1] 0 9223372036854773759
Ah darn, and I fell again for the
Given that, your suggested fix is likely our best bet to create a sparse matrix object of this type. I think @mojaveazure had at times looked at packages
Not 100% sure this is a sustainable solution; in theory a SOMASparseNDArray could have dimensions (not capacity) that are larger than I think we should probably raise a clear error indicating that such matrices are not supported. The question is how to efficiently get the actual dimension (not capacity). Let me do a bit of experimentation on this.
The biggest reason is that spam/spam64 are not used anywhere; pretty much all downstream analysis methods are implemented for That being said, we could suggest spam/spam64 and, if they're installed, allow the user to read in a matrix as a spam/spam64 matrix (I can't remember if it works with spam64 loaded or if it has to be attached). Also note: we'll get these errors if either the dimensions of the matrix are greater than
Yikes...I was ready to argue that limiting the dimensions to 2^31 is fine for now, but, I can see the limit on individual cells pinching much sooner. |
I experimented with this a little bit and it is not a feasible solution. Somehow the Matrix package hits this error after creating 2 dgCMatrices with a small number of non-zero values but with shapes 2^31 by 2^31:
> sparse <- x_sparse$read_sparse_matrix_zero_based(coords = list(1:10,1:10))
Error: vector memory exhausted (limit reached?)
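One possible explanation for the memory error, not confirmed anywhere in this thread: a dgCMatrix is a compressed sparse column structure, and its column-pointer slot `p` holds one 32-bit integer per column plus one, regardless of how many non-zero values exist. The sketch below is plain Python arithmetic under that assumption; `csc_pointer_bytes` is a hypothetical helper name, not part of Matrix or TileDB-SOMA.

```python
def csc_pointer_bytes(ncol, bytes_per_int=4):
    """Bytes needed for the column-pointer array of a CSC matrix.

    The array has ncol + 1 entries no matter how few values are stored.
    """
    return (ncol + 1) * bytes_per_int


# A 10-column matrix needs only a handful of bytes for its pointers.
print(csc_pointer_bytes(10))  # 44

# A matrix with 2^31 - 1 columns needs 2^33 bytes, i.e. 8 GiB, per matrix.
print(csc_pointer_bytes(2**31 - 1) / 2**30)  # 8.0
```

If this is indeed the cause, two such matrices would need about 16 GiB for column pointers alone, which would be consistent with the error appearing after the second one is created.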
OK. I originally filed #1099 (to set the sparseMatrix dims to the SOMASparseNDArray shape) while porting one of the Python notebooks that reported the dims of the numpy object, trying to port it as close to 1:1 as possible. Since this is now causing significant problems, I don't see much harm in reverting that so that the sparseMatrix dims will be the maximum row/column actually present (and documenting that this is so). The "vector memory exhausted" error is disturbing though, as it seems inconsistent with the main idea of a sparse data structure. It sounds like in the future we may well need
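The idea of sizing the matrix by the maximum row/column actually present, combined with the earlier suggestion to raise a clear error for unsupported sizes, can be sketched as follows. This is an illustration in plain Python, not the TileDB-SOMA implementation; `tight_dims` and `R_INT_MAX` are hypothetical names, and the coordinates are assumed to be zero-based.

```python
R_INT_MAX = 2**31 - 1  # largest value an R integer can hold


def tight_dims(rows, cols, limit=R_INT_MAX):
    """Return (nrow, ncol) just large enough for the zero-based coordinates."""
    if len(rows) != len(cols):
        raise ValueError("rows and cols must have the same length")
    if not rows:
        return (0, 0)
    # Zero-based coordinates: the largest index plus one is the extent.
    dims = (max(rows) + 1, max(cols) + 1)
    if any(d > limit for d in dims):
        raise ValueError(f"dimensions {dims} exceed the supported limit {limit}")
    return dims


print(tight_dims([0, 3, 9], [1, 2, 4]))  # (10, 5)
```

With this approach the result depends only on the coordinates that were read, so the very large array capacity never reaches the matrix constructor.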
Describe the bug
Hey @mlin, while doing some work with the R iterators I think I have uncovered an unexpected issue with SOMASparseNDArray$read_sparse_matrix_zero_based().
I can see that in your tests you used a mock SOMASparseNDArray with a shape of c(10,10):
TileDB-SOMA/apis/r/tests/testthat/test-SOMASparseNDArray.R, line 4 in d8a6fc9
However, when building a SOMAExperiment from Python using the from_anndata method, we end up with a SOMASparseNDArray that is of shape 9223372036854773760 by 9223372036854773760, because of this: #1327 (comment)
Then, if I try to read the X sparse matrix of the SOMAExperiment created with from_anndata, I hit a numeric overflow issue because of the way we use shape to create the Matrix::sparseMatrix.
The actual error is from this:
TileDB-SOMA/apis/r/R/SOMASparseNDArray.R, line 203 in d8a6fc9
Which triggers this:
https://github.com/cran/Matrix/blob/d35b9ca2e877a6fc55dde6ad7c9ccfc0b35624d9/R/sparseMatrix.R#L426
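To make the overflow concrete, here is a small arithmetic sketch in plain Python. It assumes that matrix dimensions on the R side must fit in a 32-bit R integer (at most 2^31 - 1); `fits_r_integer` is a hypothetical helper used only for illustration.

```python
R_INT_MAX = 2**31 - 1  # corresponds to .Machine$integer.max in R


def fits_r_integer(shape):
    """Return True if every dimension fits in a 32-bit R integer."""
    return all(0 <= dim <= R_INT_MAX for dim in shape)


# Shape reported for the array written by from_anndata.
reported_shape = (9223372036854773760, 9223372036854773760)

print(reported_shape[0] == 2**63 - 2048)  # True
print(fits_r_integer(reported_shape))     # False
print(fits_r_integer((10, 10)))           # True
```

The mock array in the tests has shape c(10,10) and passes this check, which would explain why the tests did not surface the problem.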
To Reproduce