Applying cNMF on semi-supervised scRNA clustering #87

Open
liangxs7015 opened this issue May 22, 2024 · 2 comments

liangxs7015 commented May 22, 2024

Hi,

Thank you for creating such a wonderful tool! I am currently working with scRNA-seq data consisting of 800,000 cells, and I am particularly interested in analyzing 100 specific genes. I would like to use cNMF to perform semi-supervised clustering to understand how these genes might affect cell state. This means that I will extract a 100 × 800,000 submatrix from the original 50,000 × 800,000 (genes × cells) matrix and use it as the input.

Could you please let me know if this application is feasible using cNMF?

Thank you very much for your assistance!
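The gene-subsetting step described in this comment can be sketched roughly as follows. This is a minimal illustration with NumPy on toy data, not cNMF code: the helper `subset_genes`, the gene names, and the cells-as-rows orientation are all assumptions made for the example.

```python
import numpy as np

def subset_genes(counts, gene_names, genes_of_interest):
    """Hypothetical helper: keep only the columns for a predefined gene list.

    counts is assumed to be cells x genes (cells as rows).
    Returns the submatrix, the genes that were found, and the genes that were not.
    """
    index_of = {gene: i for i, gene in enumerate(gene_names)}
    found = [g for g in genes_of_interest if g in index_of]
    missing = [g for g in genes_of_interest if g not in index_of]
    cols = [index_of[g] for g in found]
    return counts[:, cols], found, missing

# Toy stand-in for the real data: 6 cells x 5 genes.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(6, 5))
gene_names = ["GeneA", "GeneB", "GeneC", "GeneD", "GeneE"]

sub, found, missing = subset_genes(counts, gene_names, ["GeneD", "GeneB", "GeneZ"])
print(sub.shape, found, missing)

# With only a small gene panel, some cells may have no counts at all in the
# selected genes; flagging them before factorization is probably worthwhile.
nonzero_cells = sub.sum(axis=1) > 0
print(int(nonzero_cells.sum()), "of", sub.shape[0], "cells have at least one count")
```

Reporting the genes that were not found makes it easier to catch naming mismatches (for example, symbols versus Ensembl IDs) before running anything expensive.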

dylkot (Owner) commented May 24, 2024

Hi, I don't fully understand what you mean by semi-supervised clustering in this context. In general, cNMF doesn't do "hard clustering" in the usual sense; it outputs continuous scores for each program for each cell. Is that what you are looking for? There shouldn't be any issue with inputting only an 800K x 100 matrix and setting the number of highly variable genes to 100.
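To illustrate what "continuous scores for each program for each cell" means, here is a toy sketch of plain NMF with multiplicative updates in NumPy. It is not cNMF itself (cNMF additionally runs NMF many times and builds a consensus); the function `toy_nmf`, the matrix sizes, and `k=3` are assumptions chosen only for illustration.

```python
import numpy as np

def toy_nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF sketch: X (cells x genes) is approximated by usage @ spectra.

    Uses Lee-Seung multiplicative updates for the squared-error loss.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    usage = rng.random((n_cells, k)) + eps    # cells x programs
    spectra = rng.random((k, n_genes)) + eps  # programs x genes
    for _ in range(n_iter):
        spectra *= (usage.T @ X) / (usage.T @ usage @ spectra + eps)
        usage *= (X @ spectra.T) / (usage @ spectra @ spectra.T + eps)
    return usage, spectra

# Toy non-negative matrix: 8 cells x 6 genes.
rng = np.random.default_rng(1)
X = rng.random((8, 6))

usage, spectra = toy_nmf(X, k=3)

# Each row is a cell's relative usage of the 3 programs, not a single label.
usage_norm = usage / usage.sum(axis=1, keepdims=True)
print(np.round(usage_norm, 2))
```

Every cell ends up with a weight on every program, so any single cluster label has to be derived afterwards from these scores.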

liangxs7015 (Author) commented

> Hi, I don't fully understand what you mean by semi-supervised clustering in this context. In general, cNMF doesn't do "hard clustering" in the usual sense; it outputs continuous scores for each program for each cell. Is that what you are looking for? There shouldn't be any issue with inputting only an 800K x 100 matrix and setting the number of highly variable genes to 100.

Thank you for your quick reply. By 'semi-supervised,' I mean that the input genes for the matrix were taken from a predefined gene list. My goal in running this was to observe how the cells would be divided into different states according to the given gene list.

Thanks to your suggestion, I have successfully obtained five GEPs (gene expression programs) and plan to classify all cells from the different datasets into five states based on their highest GEP score in the "usage" output. I'm still working on it and am not entirely sure whether this strategy is reasonable. Sorry for the delayed response.
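The "highest GEP score" assignment described above could look something like the sketch below. It assumes a cells x programs usage matrix is already loaded as a NumPy array; the helper `assign_states` and the `min_margin=0.1` threshold are hypothetical, added only to show one way of flagging cells whose top two programs are nearly tied.

```python
import numpy as np

def assign_states(usage, min_margin=0.1):
    """Hypothetical helper: hard-assign each cell to its highest-usage program.

    usage is assumed to be cells x programs with at least two programs.
    Returns the label per cell, the gap between the top two normalized
    usages, and a flag for cells where that gap is below min_margin.
    """
    usage = np.asarray(usage, dtype=float)
    norm = usage / usage.sum(axis=1, keepdims=True)  # rows sum to 1
    labels = norm.argmax(axis=1)
    ranked = np.sort(norm, axis=1)
    margin = ranked[:, -1] - ranked[:, -2]
    ambiguous = margin < min_margin  # arbitrary illustrative threshold
    return labels, margin, ambiguous

# Toy usage matrix: 3 cells x 5 programs.
usage = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],  # clearly program 0
    [0.30, 0.28, 0.20, 0.12, 0.10],  # programs 0 and 1 nearly tied
    [0.05, 0.05, 0.10, 0.10, 0.70],  # clearly program 4
])

labels, margin, ambiguous = assign_states(usage)
print(labels.tolist())     # [0, 0, 4]
print(ambiguous.tolist())  # [False, True, False]
```

Looking at how many cells fall into the ambiguous group is one rough way to judge whether a five-state hard classification is a fair summary of the continuous usages.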
