DeepVariant 0.9.0
- In the v0.9.0 release, we introduce best practices for merging DeepVariant samples.
- Added visualizations of variant output for visual QC and inspection.
- Improved Indel accuracy for WGS and WES (error reduction of 36% on the WGS case study) by reducing the Indel candidate generation threshold to 0.06.
- Improved WES model accuracy by expanding training regions with a 100bp buffer around capture regions and additional training at lower exome coverages.
- Improved performance for new PacBio Sequel II chemistry and CCS v4 algorithm by training on additional data.
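To make the capture-region buffer concrete, here is a minimal Python sketch of padding BED-style intervals by 100bp on each side and merging any that then overlap. The function name `pad_regions` and the tuple representation are hypothetical, chosen for illustration only; this is not DeepVariant's training pipeline code.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open as in BED


def pad_regions(regions: List[Region], buffer_bp: int = 100) -> List[Region]:
    """Expand each region by buffer_bp on both sides and merge overlaps.

    Hypothetical helper for illustration. Contig lengths are not known here,
    so region ends are not clipped to the chromosome size.
    """
    padded = sorted(
        (chrom, max(0, start - buffer_bp), end + buffer_bp)
        for chrom, start, end in regions
    )
    merged: List[Region] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous region: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged
```

Two capture targets that sit less than 200bp apart end up as a single padded region, which is why the merge step is included.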
Full release notes:
New documentation:
- Added a tutorial for merging WES trio.
- Visualization functionality and documentation: VCF stats report.
Changes to Docker images, code, and models:
- Docker images are now available on Docker Hub (google/deepvariant) in addition to gcr.io/deepvariant-docker/deepvariant.
- For WES, added a 100bp buffer to the capture regions when creating training examples.
- For WES, added more training examples from lower-coverage exomes, down to 30x.
- For PACBIO, added training data for Sequel II v2 chemistry and samples processed with CCS v4 algorithm.
- Loosened the restriction that BAM files must have exactly one sample_name. If the header contains multiple samples, the first one is used. If it contains none, a default is used.
- Changes in realigner code. The realigner first aligns reads to haplotypes and then realigns them to the reference. With this change, haplotypes without enough read support are discarded, so fewer reads need to be realigned. In theory, this should reduce the false positive (FP) rate. It also helps to resolve a GitHub issue.
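The relaxed sample_name rule can be sketched roughly as follows. This is an illustrative Python sketch, not DeepVariant's code: the function name `resolve_sample_name` and the placeholder `DEFAULT_SAMPLE_NAME` are hypothetical, and the precedence given to an explicitly supplied name is an assumption rather than something these notes state.

```python
from typing import Optional, Sequence

# Hypothetical placeholder; the actual default used by DeepVariant may differ.
DEFAULT_SAMPLE_NAME = "default"


def resolve_sample_name(
    header_samples: Sequence[str], explicit_name: Optional[str] = None
) -> str:
    """Pick a sample name from the samples listed in a BAM header.

    Assumption: an explicitly supplied name, if any, takes precedence.
    """
    if explicit_name:
        return explicit_name
    if header_samples:
        # Multiple samples in the header: use the first one.
        return header_samples[0]
    # No sample in the header: fall back to a default.
    return DEFAULT_SAMPLE_NAME
```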
Changes to flags:
- Added --sample_name flag to run_deepvariant.py.
- Reduced default for vsc_min_fraction_indels to 0.06 for Illumina data (WGS and WES mode), which increases sensitivity.
- Expanded the use of --reads to take multiple BAMs in a comma-separated list.
- Use --ref for CRAM by default. (Set --use_ref_for_cram to true by default.)
- Added support for BAM output for realigner debugging. See --realigner_diagnostics and --emit_realigned_reads flags in realigner.py.
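As a rough illustration of why a lower vsc_min_fraction_indels increases sensitivity, the sketch below treats the threshold as a minimum fraction of reads supporting an indel allele. The function name `passes_indel_threshold`, the inclusive comparison, and the stricter value of 0.12 used for contrast are all assumptions made for the example; the real candidate generation logic applies further criteria not shown here.

```python
def passes_indel_threshold(
    supporting_reads: int, total_reads: int, min_fraction: float = 0.06
) -> bool:
    """Return True if the indel allele fraction meets the minimum fraction.

    Hypothetical helper: only the fraction test is modeled, and the
    comparison is assumed to be inclusive.
    """
    if total_reads <= 0:
        return False
    return supporting_reads / total_reads >= min_fraction


# 3 of 40 reads support the indel, an allele fraction of 0.075.
print(passes_indel_threshold(3, 40))                     # threshold 0.06
print(passes_indel_threshold(3, 40, min_fraction=0.12))  # hypothetical stricter threshold
```

With the 0.06 threshold the 0.075 allele fraction is kept as a candidate, while under the stricter illustrative threshold it would be dropped before the model ever sees it.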