covmed overestimates coverage? #19

hdashnow · 2017-01-24T04:15:25Z

I've noticed that covmed estimates higher median coverage than other tools. For example, for one particular whole genome covmed estimates 33.4, while Picard CollectWgsMetrics estimates 27.
I've performed similar calculations on exomes, where I get a median coverage of 199.71 with covmed (using the region argument) compared with 189 using bedtools (taking the median of the per-base counts over the target region). I've found consistently higher results from covmed than from Picard and bedtools across a number of exomes and genomes. The size of the difference varies.
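
For reference, the bedtools-style comparison described above (median of per-base counts over the target region) can be sketched roughly as follows. This is only an illustration, not the exact command or script used in this thread: it assumes tab-separated per-base lines where the depth is the last column, and the function name `median_depth` and the sample lines are hypothetical.

```python
from statistics import median


def median_depth(per_base_lines):
    """Median of per-base depths.

    Assumes each line is tab-separated with the depth as the LAST column
    (an assumption about the input layout, not a statement about any
    particular tool's output).
    """
    depths = []
    for line in per_base_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        depths.append(int(line.split("\t")[-1]))
    if not depths:
        raise ValueError("no per-base depths found")
    return median(depths)


# Hypothetical per-base records: chrom, start, end, offset, depth
example_lines = [
    "chr1\t100\t104\t1\t30",
    "chr1\t100\t104\t2\t10",
    "chr1\t100\t104\t3\t20",
    "chr1\t100\t104\t4\t40",
]
print(median_depth(example_lines))
```

Because every targeted base contributes one value, bases with zero coverage pull the median down, which a sampling-based estimate may not capture in the same way.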

I wonder if you have any idea why this is occurring?

One possibility that springs to mind, for exomes in particular, is that reads outside the target region are being counted, which would inflate the coverage estimate.

brentp · 2017-01-24T14:07:30Z

Yeah, I've noticed this as well. I'll have a look today. Picard and bedtools do actual coverage calculations across the whole BAM (I'm pretty sure, anyway), while covmed estimates from a sample, but it should still be able to give a pretty good estimate.

brentp · 2017-01-24T17:04:41Z

@hdashnow would you give one of the attached binaries a try? (I have to gzip them to attach them here, so you'll have to decompress them and chmod +x.)
This should give a more accurate estimate, but I'd like to see how it performs for your cases.

You can now do: goleft covmed *.bam
so it's easier to run on a group of BAMs.
goleft_osx.gz
goleft_linux64.gz

brentp · 2017-01-24T17:08:22Z

One caveat is that goleft is likely to be inaccurate for exome or targeted sequencing, but I'll improve that a bit more in the future.

hdashnow · 2017-01-24T23:13:15Z

Good idea adding that filter. It made the estimates slightly smaller, e.g. 33.04 instead of 33.4. That's still nowhere near Picard.

brentp added a commit that referenced this issue Jan 24, 2017

covmed: attempt to give a better coverage estimate

a56591e

do this by scaling the coverage estimate by the proportion of reads that are dup|qcfail|secondary|unmapped. see #19
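
The scaling described in this commit message might look something like the sketch below. This is not the goleft implementation (which is written in Go and is not shown in this thread); it assumes that "scaling" means multiplying the raw estimate by the fraction of reads that are not flagged dup, qcfail, secondary, or unmapped. The function names and the example flags are hypothetical; the bit values are the standard SAM flag bits.

```python
# Standard SAM flag bits for the read classes named in the commit message.
UNMAPPED = 0x4
SECONDARY = 0x100
QCFAIL = 0x200
DUPLICATE = 0x400
EXCLUDE = UNMAPPED | SECONDARY | QCFAIL | DUPLICATE


def usable_fraction(flags):
    """Fraction of reads whose SAM flag has none of the excluded bits set."""
    flags = list(flags)
    if not flags:
        raise ValueError("no reads sampled")
    usable = sum(1 for flag in flags if not flag & EXCLUDE)
    return usable / len(flags)


def scale_estimate(raw_estimate, flags):
    """Scale a raw coverage estimate by the usable-read fraction.

    ASSUMPTION: the raw estimate is multiplied by the fraction of reads
    that are NOT dup|qcfail|secondary|unmapped.
    """
    return raw_estimate * usable_fraction(flags)


# Hypothetical sample of SAM flags: four usable reads, four excluded ones.
sample_flags = [0, 99, 147, 0, DUPLICATE, UNMAPPED, SECONDARY, QCFAIL]
print(scale_estimate(30.0, sample_flags))
```

Under this assumption, the change from 33.4 to 33.04 reported above would correspond to roughly 1% of reads being flagged, which is too small a correction to explain the gap to Picard's 27.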

brentp added a commit that referenced this issue Jan 24, 2017

covmed: attempt to give a better coverage estimate

5cae850

do this by scaling the coverage estimate by the proportion of reads that are dup|qcfail|secondary|unmapped. see #19
