KeyError: 'connectivities' when running SAMAP() #66

Closed
archavan opened this issue Feb 18, 2022 · 6 comments

@archavan

Thanks for the SAMap package! I am using it to align amphioxus and human datasets. I am following the vignette, and I get a KeyError when running the SAMAP() function; I am not sure what causes it. Does this mean that I need to use the keys argument to specify the column in adata.obs to be used for clustering information?

I first created the fasta-header to gene-id mappings:

names_bf = pd.read_csv("../results/11_samap_hsap/02_sc-datasets/bf/protein-id-to-gene-id-mapping.csv")
names_hs = pd.read_csv("../results/11_samap_hsap/02_sc-datasets/hs/fawkner-corbett/protein-id-to-gene-name-mapping.csv")

n1 = list(zip(names_bf["protein_id"], names_bf["gene_id"]))
n2 = list(zip(names_hs["protein_id"], names_hs["gene_symbol"]))

names = {"bf":n1, "hs":n2}

I start with unprocessed h5ad files that have (1) unprocessed counts and (2) metadata, including clustering information.

fn1 = "../results/11_samap_hsap/02_sc-datasets/bf/bf.h5ad"
fn2 = "../results/11_samap_hsap/02_sc-datasets/hs/fawkner-corbett/hs.h5ad"

filenames = {"bf":fn1, "hs":fn2}

sam1 = SAM()
sam1.load_data(fn1)

sam2 = SAM()
sam2.load_data(fn2)

sams = {"bf":sam1, "hs":sam2}

Then I run:

sm = SAMAP(sams, f_maps = "../results/11_samap_hsap/01_blast/maps/", names = names)

which produces the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_3202/544095883.py in <module>
----> 1 sm = SAMAP(sams, f_maps = "../results/11_samap_hsap/01_blast/maps/", names = names)

/gpfs/ysm/project/wagner/arc78/conda_envs/samap/lib/python3.7/site-packages/samap/mapping.py in __init__(self, sams, f_maps, names, keys, resolutions, gnnm, save_processed)
    149 
    150             if key == "leiden_clusters":
--> 151                 sam.leiden_clustering(res=res)
    152 
    153             if "PCs_SAMap" not in sam.adata.varm.keys():

/gpfs/ysm/project/wagner/arc78/conda_envs/samap/lib/python3.7/site-packages/samalg/__init__.py in leiden_clustering(self, X, res, method, seed)
   1492 
   1493         if X is None:
-> 1494             X = self.adata.obsp["connectivities"]
   1495             save = True
   1496         else:

/gpfs/ysm/project/wagner/arc78/conda_envs/samap/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in __getitem__(self, key)
    146 
    147     def __getitem__(self, key: str) -> V:
--> 148         return self._data[key]
    149 
    150     def __setitem__(self, key: str, value: V):

KeyError: 'connectivities'
@atarashansky
Owner

You have to process the datasets with SAM or load in data that has already been processed by another method. The common convention is that the neighborhood graph is stored in adata.obsp['connectivities'].

The KeyError means that this graph is missing. Has the data been processed?
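
A minimal sketch of how one might check for this graph before calling SAMAP(). The helper names `has_neighbor_graph` and `report_h5ad` are hypothetical and not part of SAM or SAMap; the sketch only assumes that the object exposes a dict-like `.obsp`, as AnnData does, and that `anndata.read_h5ad` is available when reading from disk.

```python
from types import SimpleNamespace


def has_neighbor_graph(adata, key="connectivities"):
    """Return True if `adata.obsp` holds a neighborhood graph under `key`."""
    return key in adata.obsp


def report_h5ad(path, key="connectivities"):
    """Load an .h5ad file and report whether it contains the graph.

    Hypothetical helper. Needs the `anndata` package, imported lazily so
    that `has_neighbor_graph` can be used without it on in-memory objects.
    """
    import anndata  # only needed when reading from disk

    adata = anndata.read_h5ad(path)
    return has_neighbor_graph(adata, key)


# Stand-ins for a raw and a processed AnnData object (illustration only).
raw = SimpleNamespace(obsp={})
processed = SimpleNamespace(obsp={"connectivities": object()})

print(has_neighbor_graph(raw))        # False
print(has_neighbor_graph(processed))  # True
```

With a SAM object loaded as in the original post, the same check would be `has_neighbor_graph(sam1.adata)`; a result of False corresponds to the KeyError shown in the traceback.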

@atarashansky
Owner

Ah, I see it hasn't been processed. Instead of loading the data into SAM objects, please pass in the paths to the files. So instead of a dictionary of SAM objects, it should be a dictionary of file paths.

@archavan
Author

Ah, I see -- that makes sense! It's working now that I passed in the dictionary of file paths. Thanks so much for your quick response.

@xiangyupan

Hi @archavan
I ran into the same error. I am still confused by atarashansky's answer. Could you show me the code for passing a dictionary of file paths?
Thanks all.

@archavan
Author

archavan commented May 4, 2023

Hi @xiangyupan,

Instead of passing a dictionary of SAM objects, pass a dictionary of paths to h5ad files, like this:

fn1 = "path/to/species1/aa.h5ad"
fn2 = "path/to/species2/bb.h5ad"

filenames = {"aa":fn1, "bb":fn2}

sm = SAMAP(filenames, [other options])
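
As a rough sketch, the dictionary could be checked before it is handed to SAMAP(), so that an unprocessed SAM object or a mistyped path is caught early. The function `check_h5ad_paths` is a hypothetical helper for illustration, not part of SAMap.

```python
from pathlib import Path


def check_h5ad_paths(filenames):
    """Validate a {species_id: path} dict before passing it to SAMAP().

    Hypothetical helper: raises ValueError if any value is not a path
    (for example a SAM object) or does not point to an existing file.
    """
    problems = []
    for species, value in filenames.items():
        if not isinstance(value, (str, Path)):
            problems.append(
                f"{species}: expected a file path, got {type(value).__name__}"
            )
        elif not Path(value).is_file():
            problems.append(f"{species}: file not found: {value}")
    if problems:
        raise ValueError("; ".join(problems))
    # Return plain strings so the result can be passed on unchanged.
    return {species: str(path) for species, path in filenames.items()}


# Usage, with the same placeholder paths as above:
# filenames = check_h5ad_paths({"aa": fn1, "bb": fn2})
# sm = SAMAP(filenames, [other options])
```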

@xiangyupan

Thank you very much @archavan
It's clear now. The problem has been solved!
