determining the less trustworthy log2fc values #27
Comments
Hi Cathy, that is a fair question. If you could provide a reproducible example, I am happy to discuss the specifics of the issues that you encountered. In the meantime, I will try to give some pointers which are hopefully already useful:
This is a warning generated by
That warning is interesting, as I am not sure where it is coming from. Here, I would need a reproducible example to say more.
In my opinion, the p-value associated with a log2fc is still the best measure of how credible a certain change is. By default, the p-value is calculated with a likelihood ratio test. However, you might also be interested in an earlier discussion about using the standard error associated with each coefficient fit as an alternative. For more details see #12. Best,
Sorry for this late response. I performed some filtering which may have dealt with the “encountered non-positive size factor estimates” and “singular gradient” errors for now. However, I am actually thinking about a case such as #22, because my plots show a similar distribution, and in that case the p-value is not always useful as a filter. In that issue it's suggested to do something such as setting all LFC above 15 to Inf. However, I've found that the threshold, as determined by eye, is sometimes smaller than 15. Do you have any suggestions for how I can systematically discard the lfc values at the two extremes of this 'pattern' without inspecting them by eye? Thanks!
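For readers following along, the thresholding idea mentioned above can be sketched in a few lines. This is only an illustration in plain Python, not code from glmGamPoi or from #22: the function name `clamp_extreme_lfc`, the default threshold of 15, and the symmetric treatment of large negative values are all assumptions made for the example.

```python
import math

def clamp_extreme_lfc(lfc_values, threshold=15.0):
    """Replace log2 fold changes whose magnitude exceeds `threshold` with +/-inf.

    Hypothetical helper for illustration; the threshold is a user choice,
    and treating negative values symmetrically is an assumption.
    """
    clamped = []
    for lfc in lfc_values:
        if abs(lfc) > threshold:
            # Keep the sign of the original estimate, drop its magnitude.
            clamped.append(math.copysign(math.inf, lfc))
        else:
            clamped.append(lfc)
    return clamped

# Made-up example values, not real results.
example = [0.8, -2.1, 23.4, -19.7, 5.0]
print(clamp_extreme_lfc(example))
```

The sketch makes the limitation discussed in this thread concrete: the cutoff is a fixed number chosen by the user, so it does nothing to determine where the cutoff should lie for a given design.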
Hi Cathy, thanks for reaching out again and for your feedback :)
Can you explain a bit more why the p-values are not a good filter? Note that the recommendation to change LFC > 15 to Inf is just for plotting. It uses the trick that
Good question. Unfortunately, not really right now. The cause of the extreme LFC is that the parameter estimation algorithm converges to an extreme value if one of the groups consists of only zeros and the other group has non-zero counts. One option would be to specifically filter for such cases, but that can get quite complicated for more complex models. Best, Constantin
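For the simplest situation described above (a single grouping factor, no additional covariates), the "filter for such cases" idea can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions, not part of glmGamPoi: `flag_all_zero_groups` is a hypothetical helper, `counts` is assumed to be a genes-by-samples table of non-negative raw counts, and `groups` gives one label per sample. It does not address the more complex designs mentioned in the reply.

```python
from collections import defaultdict

def flag_all_zero_groups(counts, groups):
    """Return one boolean per gene.

    True means: at least one group has only zero counts for this gene
    while at least one other group has a non-zero count.
    Assumes non-negative counts and a single grouping factor.
    """
    flags = []
    for gene_counts in counts:
        totals = defaultdict(int)
        for count, group in zip(gene_counts, groups):
            totals[group] += count
        # With non-negative counts, a zero total means every sample is zero.
        has_zero_group = any(total == 0 for total in totals.values())
        has_nonzero_group = any(total > 0 for total in totals.values())
        flags.append(has_zero_group and has_nonzero_group)
    return flags

# Made-up toy data: three genes, six samples, two groups.
groups = ["a", "a", "a", "b", "b", "b"]
counts = [
    [0, 0, 0, 5, 3, 7],  # all zeros in group "a" only
    [1, 0, 2, 4, 0, 6],  # non-zero counts in both groups
    [0, 0, 0, 0, 0, 0],  # all zeros everywhere
]
print(flag_all_zero_groups(counts, groups))
```

Genes flagged this way are candidates for the extreme estimates described above; whether to drop them, report them separately, or keep them with a caveat remains a judgment call for the analyst.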
Thanks for your reply!
My thinking is that for some of these genes the counts are much smaller in one group, so the lfc might not be trustworthy even if the p-value is very small (which I am seeing sometimes). I guess this should be partly dealt with by filtering, but as you mention, it's complicated to perform this filtering in a way that accounts for multiple types of groups.
Hello, thanks very much for your package. I just want to follow up on this point from the vignette:
It seems that, depending on my design, the threshold that separates the "three groups" of log2fc values can be as small as 5. I also got the warnings “encountered non-positive size factor estimates” and “singular gradient” when running glm_gp for the fit; I don't know if they are related. I'm assuming these are still "large lfc values" even though they are < 20. Is there a better way you could recommend to separate out the genes with less trustworthy log2fc values than by visual inspection?