Mismatch between summary for algorithm convergence and learning curve #377

Open
KaczorowskiLab opened this issue Aug 6, 2024 · 0 comments


Hi @sjfleming,

Thank you for developing and maintaining this very useful tool. I have two questions stemming from running it on our snRNA-seq dataset. All samples in the set were run at a learning rate (LR) of 1e-5, because more than 50% of the samples produced warnings when run at the default LR.

Question 1: This concerns the assessment of algorithm convergence. Two samples in my set had the "slightly unusual behavior" warning in the summary. However, looking at the learning curves, their shape and values don't seem far off from those of other samples whose summary reported the curve as "normal". The learning curves also look slightly better at the lower learning rate (where the warning appears) than at the default value. Performance on the remaining metrics is also comparable. Is this a false warning?

Sample 1: LR = 1e-5 (summary gives warning to try lower LR)

Screenshot 2024-08-06 at 11 15 22 AM

Similar-looking learning curve from a different sample at the same LR (summary says curve looks normal)

Screenshot 2024-08-06 at 11 16 55 AM

Sample 1: LR = 1e-4 (summary says curve looks normal)

Screenshot 2024-08-06 at 11 18 37 AM

The learning curve for Sample 2 has a similar shape and follows the same trend as that of Sample 1.

Question 2: This concerns the removed genes and the included warnings. All of the samples in the set show both the top 10 genes in the table and a huge list of warnings for genes not included in the table. The warnings in #342 relate to the genes shown in the table. Based on your comment in #292, it seems about 80-90% of the counts associated with the genes are also being removed from the cells (fraction_removed_cells). However, this information is not shown for all the genes listed in the warnings. Is this normal behavior? Since the lists of genes are also not consistent between samples, is there a way to extract this information to cross-check some of the individual genes (your comment in #342)? Here are two example outputs (warning list truncated in the image):

Screenshot 2024-08-06 at 11 34 42 AM

Not all genes are mitochondrial in all samples:

Screenshot 2024-08-06 at 11 43 30 AM
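
To make the cross-check in Question 2 concrete, here is a minimal sketch of the per-gene calculation I have in mind. It is not CellBender code and may not match how the report computes fraction_removed_cells. It assumes the raw counts and the denoised counts have already been loaded as cell-by-gene rows for the same cell barcodes and the same gene order. The function name `fraction_removed_per_gene` and the toy values are hypothetical.

```python
import math
from typing import Dict, Sequence


def fraction_removed_per_gene(
    raw_rows: Sequence[Sequence[float]],
    denoised_rows: Sequence[Sequence[float]],
    gene_names: Sequence[str],
) -> Dict[str, float]:
    """Fraction of each gene's counts removed, summed over the given cells.

    Rows are per-cell count vectors. Both inputs must list the same cells
    in the same order and use the same gene order (an assumption here).
    """
    if len(raw_rows) != len(denoised_rows):
        raise ValueError("raw and denoised inputs must have the same cells")

    n_genes = len(gene_names)
    raw_totals = [0.0] * n_genes
    denoised_totals = [0.0] * n_genes

    # Accumulate per-gene totals across all cells.
    for raw, denoised in zip(raw_rows, denoised_rows):
        for j in range(n_genes):
            raw_totals[j] += raw[j]
            denoised_totals[j] += denoised[j]

    fractions: Dict[str, float] = {}
    for j, gene in enumerate(gene_names):
        if raw_totals[j] > 0:
            fractions[gene] = (raw_totals[j] - denoised_totals[j]) / raw_totals[j]
        else:
            # Gene has no counts in these cells, so the fraction is undefined.
            fractions[gene] = math.nan
    return fractions


# Toy example with made-up numbers (two cells, three genes).
genes = ["MT-CO1", "MALAT1", "GAPDH"]
raw = [[10, 4, 0], [10, 6, 0]]
denoised = [[1, 4, 0], [1, 5, 0]]

fractions = fraction_removed_per_gene(raw, denoised, genes)
for gene, frac in fractions.items():
    print(gene, frac)
```

With real data, the per-gene totals would more likely come from column sums of sparse matrices, but the ratio is the same. Is this the right way to reproduce the number for genes that appear only in the warnings?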