Missing genes in allGenes.results.txt #37

sagarutturkar · 2018-09-12T16:44:13Z

I followed the QoRTs "Example Walkthrough" to run JunctionSeq and got results. However,
I noticed that the number of aggregate genes in the GFF differs from the number in the allGenes.results.txt file.

GFF file:
grep -c "aggregate_gene" withNovel.forJunctionSeq.gff = 54,720
Results file:
cut -f 2 allGenes.results.txt | sort | uniq | wc -l = 52447

The numbers do not match. Is there any default filtering applied in the writeCompleteResults method? This matters to me because a specific gene of interest, which was found to be DE using a custom HTSeq-DESeq2 pipeline, was excluded from the final JunctionSeq results.
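To see which aggregate genes are absent, rather than only how many, the two ID sets can be compared directly. The following is a minimal sketch, not part of JunctionSeq or QoRTs. It assumes that column 3 of the GFF holds the feature type, that column 9 contains an attribute of the form `gene_id <ID>`, and that the results table has a header line with the gene ID in column 2. The function name `missing_genes` and the toy files are hypothetical.

```shell
#!/bin/sh
# Print aggregate gene IDs that occur in the GFF but not in the results table.
# Assumptions (not checked against the QoRTs file spec):
#   - GFF column 3 is the feature type, column 9 contains "gene_id <ID>"
#   - the results table is tab-separated with a header; gene ID is column 2
missing_genes() {
    gff=$1
    results=$2
    work=$(mktemp -d)
    awk -F'\t' '$3 == "aggregate_gene" {
        if (match($9, /gene_id [^;]+/)) {
            id = substr($9, RSTART + 8, RLENGTH - 8)  # drop the "gene_id " prefix
            gsub(/"/, "", id)                         # tolerate quoted IDs
            print id
        }
    }' "$gff" | sort -u > "$work/gff_ids"
    # Skip the header so it is not treated as a gene ID.
    tail -n +2 "$results" | cut -f 2 | sort -u > "$work/res_ids"
    # comm -23 keeps lines that appear only in the first (GFF) list.
    comm -23 "$work/gff_ids" "$work/res_ids"
    rm -rf "$work"
}

# Toy inputs standing in for the real files (contents are made up).
demo=$(mktemp -d)
printf 'chr1\tsrc\taggregate_gene\t1\t100\t.\t+\t.\tgene_id G1; tx_set T1\n'  > "$demo/toy.gff"
printf 'chr1\tsrc\texonic_part\t1\t50\t.\t+\t.\tgene_id G1; num 001\n'       >> "$demo/toy.gff"
printf 'chr1\tsrc\taggregate_gene\t200\t400\t.\t+\t.\tgene_id G2+G3; tx_set T2+T3\n' >> "$demo/toy.gff"
printf 'chr2\tsrc\taggregate_gene\t1\t90\t.\t-\t.\tgene_id G4; tx_set T4\n'  >> "$demo/toy.gff"
printf 'featureID\tgeneID\tpadjust\n'  > "$demo/toy.results.txt"
printf 'G1:E001\tG1\t0.5\n'           >> "$demo/toy.results.txt"
printf 'G4:E001\tG4\t0.01\n'          >> "$demo/toy.results.txt"

missing=$(missing_genes "$demo/toy.gff" "$demo/toy.results.txt")
echo "$missing"
```

On real data the call would be `missing_genes withNovel.forJunctionSeq.gff allGenes.results.txt`. Note that `cut -f 2 ... | sort | uniq | wc -l` on a file with a header line also counts the header as one value, so that total may be off by one.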


hartleys · 2018-09-20T15:43:34Z

Can you send the exact commands run using JunctionSeq, and the log output it printed to the console?

Did you use the "use.multigene.aggregates" parameter? The default is FALSE.

sagarutturkar · 2018-10-01T12:44:31Z

Thanks. I was able to get the information for all genes after using the parameter use.multigene.aggregates.

MWSchmid · 2019-03-18T10:56:40Z

Hi Steve

I recently processed some data with 36 bp reads (an RNA-seq atlas of different mouse tissues), and around 33% of all features were multigene aggregates. The problem is that a gene is classified entirely as a multigene aggregate even if only one small exon is shared with another gene. Since the splice junction information is also available, I think it's quite a loss to remove the entire gene because of a single exon.

I think it would be better to label only the features that are actually shared between multiple genes as multigene aggregates, instead of all features of a gene that overlaps another gene at a small spot. Alternatively, use.multigene.aggregates could get a different default, or at least be explained more clearly in the JunctionSeq user guide. At the moment it is only used in one code example, without further explanation.

Nice tools otherwise.

Best regards,

Marc

Also, with the same annotation but different data (read length of 125 bp and a different cell type), around 7.5% are multigene aggregates. Is this difference due to the read length or to the different cell/tissue type?
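One way to get a baseline for such percentages is to count multigene aggregates in the flattened GFF itself and compare that with the fraction seen in each results file. The sketch below is a rough illustration only. It assumes that multigene aggregates can be recognised by a `+` joining the member gene IDs in the `gene_id` attribute, and that column 3 holds the feature type; neither is confirmed in this thread. The function name `multigene_fraction` and the toy file are hypothetical.

```shell
#!/bin/sh
# Report how many aggregate_gene records look like multigene aggregates.
# Assumption: a multigene aggregate has a "+" inside its gene_id value.
multigene_fraction() {
    awk -F'\t' '$3 == "aggregate_gene" {
        total++
        if (match($9, /gene_id [^;]+/) && index(substr($9, RSTART, RLENGTH), "+"))
            multi++
    }
    END {
        if (total)
            printf "%d of %d (%.1f%%)\n", multi, total, 100 * multi / total
    }' "$1"
}

# Toy GFF with three aggregate genes, one of them a made-up multigene aggregate.
demo=$(mktemp -d)
printf 'chr1\tsrc\taggregate_gene\t1\t100\t.\t+\t.\tgene_id G1; tx_set T1\n'  > "$demo/toy.gff"
printf 'chr1\tsrc\taggregate_gene\t200\t400\t.\t+\t.\tgene_id G2+G3; tx_set T2+T3\n' >> "$demo/toy.gff"
printf 'chr2\tsrc\taggregate_gene\t1\t90\t.\t-\t.\tgene_id G4; tx_set T4\n'  >> "$demo/toy.gff"

frac=$(multigene_fraction "$demo/toy.gff")
echo "$frac"
```

If the GFF-level fraction is the same for both datasets, the difference between the two percentages would more likely come from which features pass filtering in each dataset than from the annotation.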

scseekers · 2020-11-11T10:31:50Z

@MWSchmid

Yes, in DEXSeq the exons shared between two genes are removed rather than the entire gene being discarded. JunctionSeq, however, seems to throw away such genes entirely, which seems overly strict.
