How to determine appropriate k values #5

akikuya · 2024-05-24T01:53:47Z

Hello, Thank you for providing such an excellent package.
In the GeneNMF demo, the code snippet:

geneNMF.programs <- multiNMF(seu.list, assay="SCT", slot="data", k=4:9, L1=c(0,0), do_centering=TRUE, nfeatures = 2000)

Could you please explain why the parameter k was set to range from 4 to 9?
Thank you!


mass-a · 2024-06-10T10:21:08Z

Hello and thanks for your interest in GeneNMF.
When performing NMF on a dataset, we aim to reduce the dimensionality of the count matrix from several hundred or several thousand variable genes to a few 'k' dimensions, which roughly correspond to gene programs. The range for k should cover the number of gene programs you expect in individual datasets; NMF decomposition is performed once for each value in the range.
As with most parameters, there is no single best value. I suggest experimenting with different ranges, which correspond to different levels of granularity in the matrix factorization. In the example code snippet we used k=4:9 (R shorthand for all integers from 4 to 9) because we could reasonably expect that many major immune cell types in individual samples. One could of course select a different range to capture a different level of granularity.
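As a rough illustration of the point above, the sketch below shows what `4:9` expands to and how a different range could be passed to `multiNMF`. The alternative ranges (`k_coarse`, `k_fine`) are made-up values for illustration only, not recommendations. The `multiNMF` call reuses only the `assay`, `slot`, `k` and `nfeatures` arguments from the snippet in the question and leaves everything else at its defaults; argument names may differ between GeneNMF versions, so check the documentation for the version you have installed.

```r
# k = 4:9 is R shorthand for the integer vector 4, 5, 6, 7, 8, 9.
k_default <- 4:9
print(k_default)

# One NMF decomposition is run per value of k for each sample.
n_runs_per_sample <- length(k_default)

# Hypothetical alternative ranges (illustrative values, not recommendations):
k_coarse <- 2:5    # fewer, broader programs
k_fine   <- 8:15   # more, finer-grained programs

# Run the real call only if GeneNMF is installed and `seu.list`
# (the list of Seurat objects from the question) exists in the session.
if (requireNamespace("GeneNMF", quietly = TRUE) && exists("seu.list")) {
  programs_fine <- GeneNMF::multiNMF(seu.list, assay = "SCT", slot = "data",
                                     k = k_fine, nfeatures = 2000)
}
```

Comparing the programs obtained with a coarse and a fine range is one way to see how the choice of k changes the granularity of the result.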
I hope this is helpful.
-m

mass-a closed this as completed Jan 13, 2025
