You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
One of them is constructing term-document matrix from the original count matrix. This is typically followed by singular value decomposition (SVD) — the same technique that conventional principal component analysis (PCA) uses — to generate latent components, and these two steps together are referred to as latent semantic indexing (LSI).
From that explanation, you would expect a function called lsi() to run tf-idf() and then svd() -- so would take a raw counts matrix as input. But it seems like the function lsi() expects you to have already run tf-idf().
Clarification of the docs would help here.
The text was updated successfully, but these errors were encountered:
It might be also worthwhile to repurpose ac.tl.lsi() to run both TF-IDF and SVD while adding ac.tl.svd().
This should also make it more intuitive for the users of other commonly used toolkits for scATAC-seq processing.
We're currently refactoring the ATAC module, and that might be a welcome change to make these steps more intuitive.
If you have more thoughts on this or other matters, please do share, whether here or e.g. on Zulip (https://scverse.org/join/).
It looks like the function ac.tl.lsi is mostly just a call to
scipy.sparse.linalg.svds
.muon/muon/_atac/tools.py
Line 52 in d0e2c22
I think this function expects that you've run
ac.pp.tfidf
beforehand, but that isn't mentioned in the docstring of the function / the API docs.This is especially confusing because Muon's normalization docs read:
From that explanation, you would expect a function called
lsi()
to runtf-idf()
and thensvd()
-- so would take a raw counts matrix as input. But it seems like the functionlsi()
expects you to have already runtf-idf()
.Clarification of the docs would help here.
The text was updated successfully, but these errors were encountered: