Warning: Error in GetAssayData: GetAssayData doesn't work for multiple layers in v5 assay. #31

Open
hongjianjin opened this issue Aug 31, 2023 · 12 comments

@hongjianjin

Do you plan to make the AddModuleScore_UCell function compatible with Seurat v5 in the near future? Thanks!

@mass-a
Member

mass-a commented Sep 27, 2023

Yes, we definitely plan to support Seurat v5 objects.
We hope to get to this soon and will post here when it's available.
-m

@mass-a
Member

mass-a commented Oct 31, 2023

Hello,
the latest version of UCell (v2.6) available with Bioconductor 3.18 should be fully compatible with Seurat 5.
Let us know if you still experience compatibility issues.
Best
-m

@bepoli

bepoli commented Nov 24, 2023

Hi @mass-a, I still get an error with UCell 2.6.2 and Seurat 5.0.1:

obj <- AddModuleScore_UCell(obj, features = list('module_name' = my_genes))
Error in `GetAssayData()` at SeuratObject/R/seurat.R:1901:3:
! GetAssayData doesn't work for multiple layers in v5 assay.

It works only after I join the layers with obj <- JoinLayers(obj), so as a workaround I'm joining and re-splitting the object as needed.
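
The join-then-re-split workaround described above could be wrapped roughly as follows. This is only a sketch, not UCell's own handling of layers: the helper name `score_on_joined_layers` and the `split_by` argument are hypothetical, and it assumes the metadata column used for the original split is still present in the object.

```r
# Hypothetical helper: join layers, score with UCell, then split again.
# Assumes Seurat v5 / SeuratObject and UCell are installed.
score_on_joined_layers <- function(obj, signatures, split_by, assay = "RNA") {
  # Collapse e.g. counts.A, counts.B, ... into a single counts layer
  obj[[assay]] <- SeuratObject::JoinLayers(obj[[assay]])

  # Score the signatures on the joined assay
  obj <- UCell::AddModuleScore_UCell(obj, features = signatures, assay = assay)

  # Re-split by the same metadata column (assumed to still exist)
  groups <- stats::setNames(obj@meta.data[[split_by]], colnames(obj))
  obj[[assay]] <- split(obj[[assay]], f = groups)
  obj
}

# Usage (the column name "sample" is an assumption):
# obj <- score_on_joined_layers(obj, list(module_name = my_genes), split_by = "sample")
```

Joining still materialises one matrix per layer type, so this does not help when the joined object does not fit in memory.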

@mass-a
Member

mass-a commented Nov 27, 2023

Thanks @bepoli! I think running UCell on joined layers (or before you split them out) is the best approach for now; we'll work on a solution for objects split on multiple layers.
Cheers

@frac2738

> Thanks @bepoli! I think running UCell on joined layers (or before you split them out) is the best approach for now; we'll work on a solution for objects split on multiple layers. Cheers

Do you have a timeline for implementing this feature?
I am working on a dataset of 1.2 million cells and I cannot run JoinLayers because of the usual memory limitations.
Being able to run UCell on the split layers would be really useful.

Thanks for the amazing job you are doing.

@mass-a
Member

mass-a commented Feb 21, 2024

I *think* the latest version on GitHub (2.7.3) should handle Seurat objects with multiple layers.
Can you try installing from GitHub (remotes::install_github("carmonalab/UCell")) and see whether it works for you?

@frac2738

frac2738 commented Feb 21, 2024

v2.7.3 doesn't work either.
I also tried with the scale.data slot (which is not layered), but I get the usual matrix size error.

 # data slot (multiple layers)
 > exp_Integrated <- AddModuleScore_UCell(exp_Integrated,signatures_list,ncores = 2,name = "", slot = "data")
 Error in `GetAssayData()`:
 ! GetAssayData doesn't work for multiple layers in v5 assay.
 Run `rlang::last_trace()` to see where the error occurred.

 # counts slot (multiple layers)
 > exp_Integrated <- AddModuleScore_UCell(exp_Integrated,signatures_list,ncores = 2,name = "", slot = "counts")
 Error in `GetAssayData()`:
 ! GetAssayData doesn't work for multiple layers in v5 assay.
 Run `rlang::last_trace()` to see where the error occurred.

 # scale.data slot (single layer)
 > exp_Integrated <- AddModuleScore_UCell(exp_Integrated,signatures_list,ncores = 2,name = "", slot = "scale.data")
 Error in .m2sparse(from, paste0(kind, "g", repr), NULL, NULL) : 
   attempt to construct sparseMatrix with more than 2^31-1 nonzero entries

The object has 160 samples for a total of 1188484 cells:

An object of class Seurat 
33359 features across 1188484 samples within 1 assay 
Active assay: RNA (33359 features, 3000 variable features)
 321 layers present: counts.C-AP, counts.C-IJ, counts.C-JC, counts.C-RE, counts.C1_Cousin_P1, counts.C11_bis, counts.C13, counts.C2_Brother_P4, counts.C20-59-01-C001, counts.C20-59-01-C002, counts.C20-59-01-C003, counts.C20-59-01-P005, counts.C20-59-01-P010, counts.C20-59-01-P016, counts.C20-59-01-P017, counts.C20-59-01-P018,     counts.C20-59-01-P020, counts.C20-59-01-P022, counts.C20-59-01-P023, counts.C20-59-01-P024, counts.C20-59-01-P025, counts.C20-59-01-P026, counts.C20-59-01-P027, counts.C20-59-01-P028, counts.C20-59-01-P029, counts.C20-59-01-P032, counts.C20-59-01-P033, counts.C20-59-01-P035, counts.C20-59-01-P036, counts.C20-59-01-P039, counts.C20-59-01-P042, counts.C20-59-01-P043, counts.C20-59-01-P044, counts.C20-59-01-P045, counts.C20-59-01-P046, counts.C20-59-01-P047, counts.C20-59-01-P048, counts.C20-59-01-P049, counts.C20-59-01-P050, counts.C20-59-01-P052, counts.C20-59-01-P053, counts.C20-59-01-P056, counts.C20-59-01-P058, counts.C20-59-01-P059, counts.C20-59-01-P062, counts.C20-59-01-P063, counts.C20-59-01-P064, counts.C20-59-01-P065, counts.C20-59-01-P066, counts.C20-59-01-P067, counts.C20-59-01-P068, counts.C20-59-01-P069, counts.C20-59-01-P071, counts.C20-59-01-P072, counts.C20-59-01-P073, counts.C20-59-01-P074, counts.C20-59-01-P075, counts.C20-59-01-P076, counts.C20-59-01-P077, counts.C20-59-01-P079, counts.C20-59-01-P080, counts.C20-59-01-P081, counts.C20-59-01-P082, counts.C20-59-01-P084, counts.C20-59-01-P085, counts.C20-59-01-P086, counts.C20-59-01-P087, counts.C20-59-01-P088, counts.C20-59-01-P090, counts.C20-59-01-P093, counts.C20-59-01-P094, counts.C20-59-01-P095, counts.C20-59-01-P096, counts.C20-59-01-P097, counts.C20-59-01-P101, counts.C20-59-01-P102, counts.C20-59-01-P103, counts.C20-59-01-P107, counts.C20-59-01-P109, counts.C20-59-01-P111, counts.C20-59-02-C001, counts.C20-59-02-P001, counts.C20-59-02-P002, counts.C20-59-04-C007, counts.C20-59-04-C010, counts.C20-59-04-C013, counts.C20-59-04-C014, counts.C20-59-04-C015, 
counts.C20-59-04-C016, counts.C20-59-04-C017, counts.C20-59-04-C018, counts.C20-59-04-C019, counts.C20-59-04-C020, counts.C20-59-04-C021, counts.C20-59-04-C022, counts.C20-59-04-C023, counts.C20-59-04-C024, counts.C20-59-04-C026, counts.C20-59-05-C001, counts.C20-59-05-C002, counts.C20-59-05-C004, counts.C20-59-05-C009, counts.C20-59-05-C010, counts.C20-59-05-C016, counts.C20-59-05-C020, counts.C20-59-07-C001, counts.C20-59-07-C002, counts.C20-59-07-C003, counts.C20-59-07-C005, counts.C20-59-07-C006, counts.C20-59-07-C010, counts.C20-59-07-C013, counts.C20-59-07-C018, counts.C20-59-07-C019, counts.C20-59-07-C020, counts.C20-59-07-C022, counts.C20-59-07-C025, counts.C20-59-07-C028, counts.C20-59-07-C029, counts.C20-59-07-C030, counts.C22, counts.C26, counts.C27, counts.C3_Mother_P6_CTLA4, counts.C4_bis, counts.C5, counts.C6, counts.C7, counts.Ced_MOU, counts.Ena_ERR, counts.Mae_JAW, counts.P1_CTLA4_ht_T, counts.P1_LRBA_hmz, counts.P1_NBEAL2_hmz, counts.P1_NRAS_ht_LS_JMML, counts.P2_CTLA4_ht_hc_Mother_P1, counts.P2_LRBA_hmz, counts.P2_NBEAL2_ht_hc_Mother_P1, counts.P3_CTLA4_ht_hc_Father_P6, counts.P3_KRAS_ht_leukemia_SJMML_S3, counts.P3_KRAS_ht_RALD_leukemia_S2, counts.P3_KRAS_ht_RALD_S1, counts.P3_NBEAL2_hmz, counts.P4_CTLA4_ht_T, counts.P4_KRAS_ht_SJMML, counts.P5_CTLA4_ht, counts.P5_CTLA4_ht_T, counts.P5_LRBA_compound, counts.P5_NBEAL2_compound, counts.P6_CTLA4_ht, counts.P6_CTLA4_ht_T, counts.P6_LRBA_hmz, counts.P6_NBEAL2_compound, counts.P7_CTLA4_hmz, counts.P7_LRBA_hmz, counts.P7_NBEAL2_hmz, counts.P8_CTLA4_ht_hc_mother_P7, counts.P9_CTLA4_ht, counts.P9_CTLA4_ht_T, data.C-AP, data.C-IJ, data.C-JC, data.C-RE, data.C1_Cousin_P1, data.C11_bis, data.C13, data.C2_Brother_P4, data.C20-59-01-C001, data.C20-59-01-C002, data.C20-59-01-C003, data.C20-59-01-P005, data.C20-59-01-P010, data.C20-59-01-P016, data.C20-59-01-P017, data.C20-59-01-P018, data.C20-59-01-P020, data.C20-59-01-P022, data.C20-59-01-P023, data.C20-59-01-P024, data.C20-59-01-P025, data.C20-59-01-P026, 
data.C20-59-01-P027, data.C20-59-01-P028, data.C20-59-01-P029, data.C20-59-01-P032, data.C20-59-01-P033, data.C20-59-01-P035, data.C20-59-01-P036, data.C20-59-01-P039, data.C20-59-01-P042, data.C20-59-01-P043, data.C20-59-01-P044, data.C20-59-01-P045, data.C20-59-01-P046, data.C20-59-01-P047, data.C20-59-01-P048, data.C20-59-01-P049, data.C20-59-01-P050, data.C20-59-01-P052, data.C20-59-01-P053, data.C20-59-01-P056, data.C20-59-01-P058, data.C20-59-01-P059, data.C20-59-01-P062, data.C20-59-01-P063, data.C20-59-01-P064, data.C20-59-01-P065, data.C20-59-01-P066, data.C20-59-01-P067, data.C20-59-01-P068, data.C20-59-01-P069, data.C20-59-01-P071, data.C20-59-01-P072, data.C20-59-01-P073, data.C20-59-01-P074, data.C20-59-01-P075, data.C20-59-01-P076, data.C20-59-01-P077, data.C20-59-01-P079, data.C20-59-01-P080, data.C20-59-01-P081, data.C20-59-01-P082, data.C20-59-01-P084, data.C20-59-01-P085, data.C20-59-01-P086, data.C20-59-01-P087, data.C20-59-01-P088, data.C20-59-01-P090, data.C20-59-01-P093, data.C20-59-01-P094, data.C20-59-01-P095, data.C20-59-01-P096, data.C20-59-01-P097, data.C20-59-01-P101, data.C20-59-01-P102, data.C20-59-01-P103, data.C20-59-01-P107, data.C20-59-01-P109, data.C20-59-01-P111, data.C20-59-02-C001, data.C20-59-02-P001, data.C20-59-02-P002, data.C20-59-04-C007, data.C20-59-04-C010, data.C20-59-04-C013, data.C20-59-04-C014, data.C20-59-04-C015, data.C20-59-04-C016, data.C20-59-04-C017, data.C20-59-04-C018, data.C20-59-04-C019, data.C20-59-04-C020, data.C20-59-04-C021, data.C20-59-04-C022, data.C20-59-04-C023, data.C20-59-04-C024, data.C20-59-04-C026, data.C20-59-05-C001, data.C20-59-05-C002, data.C20-59-05-C004, data.C20-59-05-C009, data.C20-59-05-C010, data.C20-59-05-C0, data.C20-59-05-C020, data.C20-59-07-C001, data.C20-59-07-C002, data.C20-59-07-C003, data.C20-59-07-C005, data.C20-59-07-C006, data.C20-59-07-C010, data.C20-59-07-C013, data.C20-59-07-C018, data.C20-59-07-C019, data.C20-59-07-C020, data.C20-59-07-C022, data.C20-59-07-C025, 
data.C20-59-07-C028, data.C20-59-07-C029, data.C20-59-07-C030, data.C22, data.C26, data.C27, data.C3_Mother_P6_CTLA4, data.C4_bis, data.C5, data.C6, data.C7, data.Ced_MOU, data.Ena_ERR, data.Mae_JAW, data.P1_CTLA4_ht_T, data.P1_LRBA_hmz, data.P1_NBEAL2_hmz, data.P1_NRAS_ht_LS_JMML, data.P2_CTLA4_ht_hc_Mother_P1, data.P2_LRBA_hmz, data.P2_NBEAL2_ht_hc_Mother_P1, data.P3_CTLA4_ht_hc_Father_P6, data.P3_KRAS_ht_leukemia_SJMML_S3, data.P3_KRAS_ht_RALD_leukemia_S2, data.P3_KRAS_ht_RALD_S1, data.P3_NBEAL2_hmz, data.P4_CTLA4_ht_T, data.P4_KRAS_ht_SJMML, data.P5_CTLA4_ht, data.P5_CTLA4_ht_T, data.P5_LRBA_compound, data.P5_NBEAL2_compound, data.P6_CTLA4_ht, data.P6_CTLA4_ht_T, data.P6_LRBA_hmz, data.P6_NBEAL2_compound, data.P7_CTLA4_hmz, data.P7_LRBA_hmz, data.P7_NBEAL2_hmz,     data.P8_CTLA4_ht_hc_mother_P7, data.P9_CTLA4_ht, data.P9_CTLA4_ht_T, scale.data
 3 dimensional reductions calculated: pca, red_rpca, umap_rpca
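
For context on the scale.data failure above, a rough size check is sketched below. It assumes that scale.data holds only the 3000 variable features reported in the object summary and that almost every entry is nonzero after scaling; neither is stated explicitly in the thread.

```r
# Cell count taken from the object summary above
n_cells <- 1188484
# Assumption: scale.data contains only the 3000 variable features
n_scaled_features <- 3000

# Multiply as doubles so the product itself cannot overflow an R integer
n_entries <- n_cells * n_scaled_features

# Largest value a 32-bit R integer can hold (2^31 - 1)
limit <- .Machine$integer.max

exceeds_limit <- n_entries > limit
ratio <- n_entries / limit
```

Under these assumptions the matrix would have about 3.57 billion entries, roughly 1.66 times the 2^31-1 limit named in the error message, so converting it to a sparse matrix cannot succeed regardless of available memory.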

@mass-a
Member

mass-a commented Feb 21, 2024

This works for me with UCell 2.7.3:

library(UCell)
data(sample.matrix)
obj <- Seurat::CreateSeuratObject(sample.matrix)

# Tag two groups of cells so the assay can be split on this column
obj$Tag <- "tag1"
obj$Tag[1:300] <- "tag2"

# Split the RNA assay by Tag; obj now has two layers
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Tag)
obj
#> An object of class Seurat 
#> 20729 features across 600 samples within 1 assay 
#> Active assay: RNA (20729 features, 0 variable features)
#>  2 layers present: counts.tag2, counts.tag1

# Score two small signatures on the split object
gene.sets <- list(Tcell = c("CD2","CD3E","CD3D"),
                  Myeloid = c("SPI1","FCER1G","CSF1R"))
obj <- AddModuleScore_UCell(obj, features = gene.sets)

Can you confirm?

@frac2738

frac2738 commented Feb 21, 2024

Your code did not work at first, but a full restart of the R session did the trick.
It now works, and it also seems to work on my dataset.
Thanks!
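
One plausible reason a restart helps is that the session still had the previously installed UCell namespace loaded after the GitHub install; the thread does not confirm this cause. A minimal sketch for spotting that situation, using a hypothetical helper `session_is_stale`, compares the version loaded in the session with the version installed on disk:

```r
# Hypothetical helper: TRUE if the loaded namespace differs from the installed version
session_is_stale <- function(pkg) {
  # A package that is not loaded cannot be stale in this session
  if (!isNamespaceLoaded(pkg)) {
    return(FALSE)
  }
  # Version recorded when the namespace was loaded
  loaded <- package_version(unname(getNamespaceVersion(pkg)))
  # Version read from the installed package description
  installed <- utils::packageVersion(pkg)
  loaded != installed
}

# Demonstrated on a base package; for this thread one would call session_is_stale("UCell")
stats_is_stale <- session_is_stale("stats")
```

If the helper returns TRUE for UCell, restarting R before retrying is likely the simplest fix.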

@mpizzagalli777

Hi, thanks for this wonderful package. I was wondering whether it is possible to run this analysis on integrated objects. I have been analyzing an integrated Seurat object, but when I run AddModuleScore_UCell

AddModuleScore_UCell(object, assay = "integrated", features = markers)

I get the following message:

Warning: Over half of genes (100%) in specified signatures are missing from data. ...

However, when I define the assay as SCT, the function works. Is this an issue with the naming convention used after integration?

AddModuleScore_UCell(object, assay = "SCT", features = markers)

Thanks so much for the help.

@mass-a
Member

mass-a commented Apr 3, 2024

Hello,
if you integrated using Seurat, the "integrated" assay will by default contain only the variable genes. You can verify this with e.g. dim(obj@assays$integrated@data). That is why UCell complains about missing genes: they are not present in the assay. You should be able to tell the Seurat integration functions to generate corrected values for all genes. However, I would recommend calculating signature scores on the uncorrected assays (RNA or SCT). While batch effects can be large on the global transcriptome, you can expect them to have only a small impact on the reduced gene sets used for signature scoring.
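
Before scoring, it can help to check how much of each signature is actually present in the chosen assay. The sketch below uses a hypothetical helper, `signature_coverage`, and a made-up gene vector standing in for the assay's row names; with a real object one would pass something like rownames(obj[["integrated"]]) instead.

```r
# Hypothetical helper: fraction of each signature's genes found in an assay
signature_coverage <- function(signatures, available_genes) {
  vapply(
    signatures,
    function(genes) mean(genes %in% available_genes),
    numeric(1)
  )
}

gene.sets <- list(Tcell = c("CD2", "CD3E", "CD3D"),
                  Myeloid = c("SPI1", "FCER1G", "CSF1R"))

# Made-up stand-in for rownames(obj[["integrated"]]); not real data
assay_genes <- c("CD2", "CD3E", "SPI1", "GAPDH")

coverage <- signature_coverage(gene.sets, assay_genes)
```

A coverage near zero for the "integrated" assay but near one for RNA or SCT would match the behaviour reported above.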

@mpizzagalli777

Thanks so much for the quick reply! That makes sense. I appreciate it.
