It's been requested internally that we have a tool that can produce things like (A) FRiP scores or (B) histograms of reads per feature type (exons, UTRs, etc.). Such a tool might be called plotEnrichment. The output would be an image and, optionally, a table of values. The input would be a BED or GTF file. One difficulty is that deeptoolsintervals currently stores only a single feature type. libGTF, on which it's based, can store everything, so this shouldn't be a real problem. I expect I'll implement a slightly tweaked GTF/BED parser just for this tool, which will end up using the same C code. Then we'll use mapReduce (possibly in a slightly modified form) to chop the genome into bins and iterate over the reads in each bin, after first checking that there's at least one GTF/BED feature in that region.
I guess there's a reason that I implemented counting overlap sets in libGTF...
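To make the binning plan concrete, here is a minimal pure-Python sketch of the idea: tile the genome into bins, skip bins that contain no feature, and record which reads overlap which feature types. It is not the deepTools or libGTF implementation. The function names (`bin_genome`, `overlaps`, `enrichment`) and the tuple layouts are hypothetical, and the linear scans stand in for the interval lookups and BAM fetches a real tool would use.

```python
from collections import defaultdict


def bin_genome(chrom_sizes, bin_size):
    """Yield (chrom, start, end) bins tiling each chromosome."""
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, bin_size):
            yield chrom, start, min(start + bin_size, size)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def enrichment(chrom_sizes, features, reads, bin_size=1000):
    """Return the fraction of reads overlapping each feature type.

    features: (chrom, start, end, feature_type) tuples (hypothetical layout)
    reads:    (chrom, start, end) tuples (hypothetical layout)
    """
    hits = set()  # (read index, feature type); a set so no read is counted twice
    for chrom, b_start, b_end in bin_genome(chrom_sizes, bin_size):
        bin_feats = [f for f in features
                     if f[0] == chrom and overlaps(f[1], f[2], b_start, b_end)]
        if not bin_feats:
            continue  # no feature in this bin, so don't look at its reads
        for idx, (r_chrom, r_start, r_end) in enumerate(reads):
            if r_chrom != chrom or not overlaps(r_start, r_end, b_start, b_end):
                continue
            for _, f_start, f_end, f_type in bin_feats:
                if overlaps(r_start, r_end, f_start, f_end):
                    hits.add((idx, f_type))
    counts = defaultdict(int)
    for _, f_type in hits:
        counts[f_type] += 1
    total = len(reads)
    return {t: n / total for t, n in counts.items()} if total else {}


# Toy data: one peak, four reads, two of which overlap the peak.
features = [("chr1", 1010, 1100, "peak")]
reads = [("chr1", 990, 1040), ("chr1", 100, 150),
         ("chr1", 2000, 2050), ("chr1", 1090, 1120)]
print(enrichment({"chr1": 3000}, features, reads))
```

With BED input every region carries the same type, so the single value returned is the FRiP score. With GTF input the keys would be the feature column values. Collecting hits in a set means a read that spans a bin boundary or several exons still contributes once per feature type.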
There's now a feature/plotEnrichment branch where this is largely implemented (though it's a definite work in progress) via the command plotEnrichment. The image output is as below:
Peaks are any regions in a BED file, whereas for GTF files the feature column is used. If people want introns, I would suggest that they annotate them (or just roughly estimate them as the difference between gene and exon, though the real value will be higher).
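The rough intron estimate amounts to a single subtraction. A small sketch follows, assuming the per-feature fractions are held in a dict keyed by GTF feature name. The helper name `estimate_intron_fraction` and the `"gene"`/`"exon"` keys are hypothetical, and as noted the result should be read as a lower bound.

```python
def estimate_intron_fraction(fractions):
    """Rough intron estimate: gene fraction minus exon fraction, floored at 0.

    fractions: dict mapping feature type to fraction of reads (assumed layout).
    """
    gene = fractions.get("gene", 0.0)
    exon = fractions.get("exon", 0.0)
    return max(gene - exon, 0.0)


print(estimate_intron_fraction({"gene": 0.75, "exon": 0.5}))
```

Annotating introns explicitly in the GTF remains the better option when accurate numbers matter.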
As requested, BED files now get individual labels (either the file name or whatever you specify with --regionLabels). GTF files will always use the feature column and ignore --regionLabels. Here are two BED files (genes19.bed and genesX.bed) with new labels: