UMAP looks like a line when neighborhood size was determined by using cell type labels #117
Comments
Can you give me a sense of how large the cell type labels are? It would be great if you could show me the number of cells assigned to each label.
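A minimal sketch of one way to produce the per-label counts asked for here. It assumes the labels live in an `obs` column named `cell_type` (the key used in the script in this thread); the helper name `cells_per_label` and the toy labels are hypothetical.

```python
from collections import Counter

def cells_per_label(labels):
    """Return (label, count) pairs sorted from most to least frequent."""
    return Counter(labels).most_common()

# Toy labels standing in for list(adata.obs['cell_type']) (assumed column name).
toy_labels = ["neuron", "muscle", "neuron", "gut", "neuron", "muscle"]

for label, n_cells in cells_per_label(toy_labels):
    print(f"{label}\t{n_cells}")
```

Labels with very few cells are the ones worth checking first, since they are the groups most affected by any label-based setting.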
Here are tables showing the number of cells assigned to each label. Another question: would it be best if the input cell numbers of the different species were comparable? I am working with 200 cells from one species and 8,000 cells from another, and was thinking about downsampling the 8,000-cell dataset. Thank you!!
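The downsampling idea could be sketched as below. This is only an illustration, not something SAMap provides: the helper `downsample_per_label` and the cap value are hypothetical. It caps the number of cells per label rather than sampling uniformly, so that rare cell types in the larger dataset are not lost.

```python
import random
from collections import defaultdict

def downsample_per_label(labels, max_per_label, seed=0):
    """Return sorted cell indices, keeping at most max_per_label cells per label."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)
    keep = []
    for indices in by_label.values():
        if len(indices) > max_per_label:
            indices = rng.sample(indices, max_per_label)
        keep.extend(indices)  # labels at or below the cap are kept whole
    return sorted(keep)

# Toy labels: one abundant type and one rare type.
toy_labels = ["neuron"] * 50 + ["muscle"] * 5
keep = downsample_per_label(toy_labels, max_per_label=10)
print(len(keep))

# Assumed usage on an AnnData object (not run here):
#   keep = downsample_per_label(list(adata.obs['cell_type']), max_per_label=...)
#   adata_small = adata[keep].copy()
```

Running the mapping on both the full and the downsampled data, and comparing the results, would show whether the size disparity matters.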
I think SAMap can be robust to dataset size disparities, but I would encourage you to try downsampling and check if the results change. I would also encourage changing the (poorly documented)
Instead of using Can you try using If you use
Thanks a lot for your suggestions, I will try it.
Thanks for the useful tool!
I noticed that in my results, some areas of the UMAP look like solid lines (for example, the cluster at the top of the screenshot below). I wonder whether this is because the run was set to determine neighborhood size from the cell type labels I provided. Does this look normal to you?
When I check the UMAPs before SAMap stitches them together, they both look "normal" to me.
sam1:
sam2:
Also, in my test run, where I didn't use cell type labels to determine neighborhood size, hopping along each cell's outgoing edges was used instead, and the UMAP looks more "normal" to me.
Any comments or suggestions will be highly appreciated!
The script I used is attached below (paths were replaced by ...):
from samap.mapping import SAMAP
from samap.analysis import (get_mapping_scores, GenePairFinder,
                            sankey_plot, chord_plot, CellTypeTriangles,
                            ParalogSubstitutions, FunctionalEnrichment,
                            convert_eggnog_to_homologs, GeneTriangles)
from samalg import SAM
import pandas as pd
import anndata
from joblib import dump, load

zf_data = anndata.read_h5ad('....')
pf_data = anndata.read_h5ad('....')

sam1 = SAM(counts=zf_data)
sam1.preprocess_data(filter_genes=False)
sam1.run(batch_key='orig.ident',
         npcs=30)

sam2 = SAM(counts=pf_data)
sam2.preprocess_data(filter_genes=False)
sam2.run(npcs=20)

sams = {'zf': sam1, 'pf': sam2}
sm = SAMAP(sams,
           keys={'zf': 'cell_type', 'pf': 'cell_type'},
           f_maps='...',
           save_processed=True)
Thanks very much in advance!
Di