Add spatial coordinates to Anndata when converting from SpatialExperiment #138

Open
wants to merge 21 commits into
base: devel

@mcmero mcmero commented Feb 27, 2025

The aim is to add the spatial coordinates to adata.obsm["spatial"] when converting SpatialExperiment objects via zellkonverter::writeH5AD. Please see this issue for more context.

@lazappi, you mentioned moving this to the R side and leaving the conversion code, so I've done this by adding the coordinates to reducedDims as a workaround.

@lazappi
Member

lazappi commented Feb 27, 2025

Thanks! I will probably suggest some tweaks to the code when I have a chance to properly look at it but for now it would be great to add a couple of tests. Probably by finding a test SpatialExperiment that is easy to load.

@mcmero
Author

mcmero commented Mar 3, 2025

Great, I've added some tests using the example Visium data used by SpatialExperiment.

@lazappi
Member

lazappi commented Mar 8, 2025

@mcmero I have pushed some commits with some adjustments; I thought that would be quicker than giving you comments. I hope that is OK. Could you please check that you are still happy with everything (and maybe test it with any objects you have)?

Also, please add yourself as a contributor to the DESCRIPTION if you like.

mcmero added 2 commits March 11, 2025 11:33
If coords has colnames, this causes issues with rd$set_axis(colnames(sce))
@mcmero
Author

mcmero commented Mar 11, 2025

Thanks @lazappi, looks good! I've tested with some of my data and found that the column names on the spatialCoords object were causing issues (see error below), so I've pushed a commit to set them to NULL before adding them to reducedDim.

── Python Exception Message ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Traceback (most recent call last):
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/frame.py", line 5357, in set_axis
    return super().set_axis(labels, axis=axis, copy=copy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/generic.py", line 792, in set_axis
    return self._set_axis_nocheck(labels, axis, inplace=False, copy=copy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/generic.py", line 804, in _set_axis_nocheck
    setattr(obj, obj._get_axis_name(axis), labels)
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/generic.py", line 6313, in __setattr__
    return object.__setattr__(self, name, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/generic.py", line 813, in _set_axis
    labels = ensure_index(labels)
             ^^^^^^^^^^^^^^^^^^^^
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 7649, in ensure_index
    return Index(index_like, copy=copy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 526, in __new__
    raise cls._raise_scalar_data_error(data)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cmero.ma/.config/cache/R/basilisk/1.18.0/zellkonverter/1.17.0/zellkonverterAnnDataEnv-0.10.9/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 5289, in _raise_scalar_data_error
    raise TypeError(
TypeError: Index(...) must be called with a collection of some kind, None was passed

── R Traceback ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    ▆
 1. └─zellkonverter::writeH5AD(spe, "test.H5AD")
 2.   └─basilisk::basiliskRun(...)
 3.     └─zellkonverter (local) fun(...)
 4.       └─zellkonverter::SCE2AnnData(...)
 5.         └─base::lapply(...)
 6.           └─zellkonverter (local) FUN(X[[i]], ...)
 7.             └─rd$set_axis(colnames(sce))
 8.               └─reticulate:::py_call_impl(callable, call_args$unnamed, call_args$named)
See `reticulate::py_last_error()$r_trace$full_call` for more details.
> reticulate::py_last_error()$r_trace$full_call
[[1]]
writeH5AD(spe, "test.H5AD")

[[2]]
basiliskRun(env = env, fun = .H5ADwriter, sce = sce, file = file,
    X_name = X_name, skip_assays = skip_assays, compression = compression,
    verbose = verbose, ...)

[[3]]
fun(...)

[[4]]
SCE2AnnData(sce, X_name = X_name, skip_assays = skip_assays,
    verbose = verbose, ...)

[[5]]
lapply(red_dims, function(rd) {
    if (!is.null(colnames(rd))) {
        rd <- r_to_py(as.data.frame(rd))
        rd <- rd$set_axis(colnames(sce))
    }
    rd
})

[[6]]
FUN(X[[i]], ...)

[[7]]
rd$set_axis(colnames(sce))

[[8]]
py_call_impl(callable, call_args$unnamed, call_args$named)

lazappi added 3 commits March 13, 2025 17:22

* origin/devel:
  Update when GHA actions are run
  Update GitHub Actions bot user config
  Move set git credentials step in pkgdown job
* origin/devel:
  Fix handling of missing rowData/colData
R/write.R Outdated
Comment on lines 109 to 116
# If converting SpatialExperiment object, add spatial coords to reducedDims
if (inherits(sce, "SpatialExperiment")) {
    coords <- SpatialExperiment::spatialCoords(sce)
    if (ncol(coords) > 1) {
        colnames(coords) <- NULL
        SingleCellExperiment::reducedDim(sce, "spatial") <- coords
    }
}
Member

I've just realised this should probably be moved to SCE2AnnData() so it works if someone calls that directly. The documentation can probably also go there.

Author

Sure, I've pushed a commit with this change.

@lazappi
Member

lazappi commented Mar 16, 2025

> Thanks @lazappi, looks good! I've tested with some of my data and found that the column names on the spatialCoords object were causing issues (see error below), so I've pushed a commit to set them to NULL before adding them to reducedDim.

I think we would prefer to avoid losing the column names if possible. It would be helpful to know what the colnames of the spatial coords are, what the colnames of the object are, and what the class of the spatial coords is.

Also, I fixed a bug that might affect this so if you could check again that would be good. Thanks!

@mcmero
Author

mcmero commented Mar 17, 2025

Makes sense. The error I was getting with my spe objects was likely due to missing colnames, so I made sure to add them to my test data from the Cell IDs. It's working correctly now; the only change is that adata.obsm["spatial"] is now a pandas DataFrame, but this can easily be converted to a numpy array if need be.
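
For anyone who needs the array form, the conversion could look like the sketch below. It uses a small hand-made DataFrame as a stand-in for adata.obsm["spatial"] (the spot names, column names and values are made up for illustration), so it only assumes that the stored object is an ordinary pandas DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for adata.obsm["spatial"] after conversion:
# one row per spot, indexed by spot name, with named coordinate columns.
spatial_df = pd.DataFrame(
    {"x": [10.0, 20.0, 30.0], "y": [5.0, 15.0, 25.0]},
    index=["spot_1", "spot_2", "spot_3"],
)

# to_numpy() keeps only the values; the row and column labels are dropped.
spatial_array = spatial_df.to_numpy()

print(type(spatial_array).__name__, spatial_array.shape)
```

On a real object the same call would be applied to adata.obsm["spatial"] directly, at the cost of losing the coordinate column names.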

@lazappi
Member

lazappi commented Mar 19, 2025

I added a test for coords without names, which seems to work. The conversion to a DataFrame happens because you can't add column names to an array, so to keep them it uses a DataFrame instead when needed. I'm not sure there is an alternative other than losing the names, but if you think it's important then open an issue about that.

If you are happy with this now I think it can be merged.

@mcmero
Author

mcmero commented Mar 20, 2025

That sounds good. With the colnames I meant colnames(spe), so for example:

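library(SpatialExperiment)
library(zellkonverter)
# Running the read10xVisium example creates the `spe` object
example(read10xVisium, echo = FALSE)
# Drop the column (spot) names from the object and the coordinates
colnames(spe) <- NULL
rownames(spatialCoords(spe)) <- NULL
writeH5AD(spe, "test.H5AD")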

Throws the error I posted above.
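
The Python half of that failure can be reproduced without zellkonverter. In the sketch below, a plain DataFrame stands in for the converted reducedDim, and None stands in for colnames(spe) after it has been set to NULL (reticulate passes R's NULL to Python as None). The data values and variable names are made up for illustration.

```python
import pandas as pd

# Stand-in for a reducedDim converted with r_to_py(as.data.frame(rd)).
rd = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})

# Stand-in for colnames(sce) when the object has no column names.
labels = None

try:
    rd.set_axis(labels)
    error = None
except TypeError as err:
    # pandas cannot build an Index from None; this is the TypeError
    # at the bottom of the traceback above.
    error = err

print(type(error).__name__)
```

This suggests the check belongs on the R side, before set_axis is ever called, which is where the error added in the latest commit sits.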

It might be helpful to improve this error message, so I've added an error and test in the latest commit. It may be out of scope for this PR however so feel free to revert.

@lazappi
Member

lazappi commented Mar 20, 2025

Ok, that ended up being trickier than I expected but the issue with missing names should be fixed. If the checks pass I think we should be about done.

2 participants