Missing genes in allGenes.results.txt #37
Comments
Can you send the exact commands you ran with JunctionSeq, along with the log output it printed to the console? Did you set the "use.multigene.aggregates" parameter? The default is FALSE.
Thanks. I was able to get the results for all genes after setting the use.multigene.aggregates parameter.
Hi Steve,

I recently processed some data with 36 bp reads (an RNA-seq atlas of different mouse tissues), and around 33% of all features were multigene aggregates. The problem is that a gene is classified entirely as a multigene aggregate even if only one small exon is shared with another gene. Since the splice junction information is also available, I think it is quite a loss to remove the entire gene because of a single exon.

I think it would be better to label only the features that are actually shared between multiple genes as multigene aggregates, instead of all features of a gene that overlaps another gene at one small spot. Alternatively, the default for use.multigene.aggregates could be changed, or the parameter could at least be explained more clearly in the JunctionSeq user guide; at the moment it only appears in one code example, without further explanation.

Nice tools otherwise.

Best regards,
Marc

By the way, with the same annotation but different data (125 bp reads and a different cell type), around 7.5% are multigene aggregates. Is this difference due to the read length or to the different cell/tissue type?
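For anyone who wants to compare this proportion across datasets, here is a minimal sketch that counts it at the gene level. It assumes that multigene aggregate IDs are the member gene IDs joined with a "+" character; the function name multigene_fraction is made up for illustration and is not part of JunctionSeq or QoRTs.

```shell
# Hypothetical helper, not part of JunctionSeq or QoRTs.
# Reads one gene ID per line on stdin and prints three numbers:
#   <unique IDs> <IDs containing "+"> <percentage of IDs containing "+">
# Assumption: a multigene aggregate ID looks like "GENE_A+GENE_B".
multigene_fraction() {
  sort -u | awk '
    { total++ }                        # every unique ID
    index($0, "+") > 0 { multi++ }     # IDs that look like aggregates
    END {
      if (total == 0) { print "0 0 0.0"; exit }
      printf "%d %d %.1f\n", total, multi, 100 * multi / total
    }'
}

# Possible usage (column 2 holds the gene ID in the command from this issue;
# a header row is assumed and skipped):
#   tail -n +2 allGenes.results.txt | cut -f 2 | multigene_fraction
```

Note that this counts genes, not individual exon or junction features, so the percentage will not necessarily match a feature-level figure.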
Yes, in DEXSeq the exons shared between two genes are removed rather than the entire gene being discarded. JunctionSeq, however, seems to throw away such genes entirely, which seems kind of unfair.
I followed the QoRTs "Example Walkthrough" to run JunctionSeq and got the results. However, I noticed that the number of aggregate genes in the GFF file differs from the number in the allGenes.results.txt file.
GFF file:
grep -c "aggregate_gene" withNovel.forJunctionSeq.gff
= 54,720
Results file:
cut -f 2 allGenes.results.txt | sort | uniq | wc -l
= 52,447
The numbers do not match. Is there any default filtering applied in the writeCompleteResults method? This matters to me because a specific gene of interest, which was found to be DE (following a custom HTSeq-DESeq2 pipeline), was excluded from the final results derived from JunctionSeq.
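To see which genes are missing rather than only how many, the two ID sets can be compared directly. The sketch below is written under several assumptions that should be checked against the actual files: that aggregate_gene is the feature type in column 3 of the GFF, that column 9 holds "key value;" attributes with the ID stored under gene_id, and that allGenes.results.txt has a header row (which the sort | uniq | wc -l count above would include as one extra line). The function names are hypothetical.

```shell
# Hypothetical helpers; the file layouts are assumptions, see the notes above.

# Print the unique gene IDs of all aggregate_gene rows in a flat GFF.
# $1 = GFF file, $2 = attribute key holding the ID (assumed default: gene_id).
gff_gene_ids() {
  awk -F'\t' -v key="${2:-gene_id}" '
    $3 == "aggregate_gene" {
      n = split($9, attrs, /; */)            # split "k v; k v; ..." pairs
      for (i = 1; i <= n; i++) {
        split(attrs[i], kv, " ")             # kv[1] = key, kv[2] = value
        if (kv[1] == key) { gsub(/"/, "", kv[2]); print kv[2] }
      }
    }' "$1" | sort -u
}

# Print the unique gene IDs from column 2 of the results table,
# skipping the (assumed) header row.
results_gene_ids() {
  tail -n +2 "$1" | cut -f 2 | sort -u
}

# List IDs that are in the GFF but absent from the results table.
# $1 = GFF file, $2 = results file.
missing_gene_ids() {
  ids_tmp="$(mktemp)"
  results_gene_ids "$2" > "$ids_tmp"
  # -v: keep non-matching lines, -x: whole-line match, -F: fixed strings
  gff_gene_ids "$1" | grep -vxF -f "$ids_tmp"
  rm -f "$ids_tmp"
}

# Possible usage:
#   missing_gene_ids withNovel.forJunctionSeq.gff allGenes.results.txt
```

If most of the IDs it prints are combinations of several gene IDs, that would be consistent with multigene aggregates being excluded while use.multigene.aggregates is left at its default of FALSE.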