
impactful

Brent Pedersen edited this page Dec 3, 2020 · 14 revisions

In short, add INFO.impactful && $your_expressions to limit to variants with a higher impact (e.g. excluding synonymous and intronic). No additional files or flags are required to use this feature. To be more permissive, use INFO.genic, which also includes synonymous variants. You can also use INFO.highest_impact_order to compare against a specific impact. These can be customized by editing a copy of https://raw.githubusercontent.com/brentp/slivar/master/src/slivarpkg/default-order.txt and setting the SLIVAR_IMPACTFUL_ORDER environment variable to the path to the updated file.

Often, an analyst will want to limit to variants that are of a "higher" impact. This decision is arbitrary, but, given an ordered list of all impacts annotated by existing tools, we can choose a permissive cutoff that still removes a large percentage of variant candidates.

To give users a simple way to do minimal filtering on the highest impact in the CSQ, version 1.6 added INFO.impactful. This value is true if any impact annotated for the variant falls above the sentinel value "IMPACTFUL_CUTOFF" in the default-order list. If none of CSQ (VEP), BCSQ (bcftools), or ANN (snpEff) is present in the VCF, this attribute is not available.

This means that, given a VCF with CSQ (VEP), BCSQ (bcftools), or ANN (snpEff) fields, a reasonable de novo filter could be:

INFO.impactful && kid.het && mom.hom_ref && dad.hom_ref && kid.GQ > 10 ...

This will, of course, exclude any synonymous variants, but it can be helpful as a first pass, especially in whole genomes, where the number of candidates is much larger.

If CSQ, BCSQ, and ANN are all present, slivar will check if any of them contain an impact high enough to be called "impactful".
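The cutoff logic described above can be sketched in a few lines of Python. This is only an illustration, not slivar's implementation: the ordering below is a shortened, hypothetical stand-in for default-order.txt, the function names are made up for this example, and ignoring impacts that are missing from the list is an assumption.

```python
# Hypothetical, shortened ordering (highest impact first).
# The real default-order.txt lists many more impacts.
ORDER_TEXT = """\
frameshift
stop_gained
missense
IMPACTFUL_CUTOFF
synonymous
intron
"""


def load_order(text, sentinel="IMPACTFUL_CUTOFF"):
    """Return (impact -> row index, row index of the sentinel)."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    return {name: i for i, name in enumerate(rows)}, rows.index(sentinel)


def is_impactful(impacts, order, cutoff):
    """True if any impact sits above (before) the sentinel row.

    Impacts not found in the list are ignored here, which is an assumption.
    """
    return any(order[name] < cutoff for name in impacts if name in order)


order, cutoff = load_order(ORDER_TEXT)

# Impacts gathered from every annotation field present on one variant.
print(is_impactful(["synonymous", "missense"], order, cutoff))
print(is_impactful(["synonymous", "intron"], order, cutoff))
```

The first call reports the variant as impactful because one of its annotated impacts (missense) is above the sentinel; the second does not, since both impacts fall below it.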

The order can be customized by copying the default-order.txt file, rearranging the rows, and pointing slivar at the adjusted copy:

export SLIVAR_IMPACTFUL_ORDER=/path/to/adjusted-order.txt
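As a rough sketch of what "re-arranging the rows" can look like, the Python snippet below moves one impact to just above the sentinel so that it would count as impactful. The row names are hypothetical placeholders (check the real default-order.txt for the exact spelling), and `promote` and `write_order` are helper names invented for this example.

```python
from pathlib import Path


def promote(rows, impact, sentinel="IMPACTFUL_CUTOFF"):
    """Return a copy of rows with `impact` moved to just above the sentinel."""
    out = [r for r in rows if r != impact]
    out.insert(out.index(sentinel), impact)
    return out


def write_order(path, rows):
    """Write one impact per line, the layout this sketch assumes for the order file."""
    Path(path).write_text("\n".join(rows) + "\n")


# Hypothetical, shortened ordering standing in for a copy of default-order.txt.
rows = ["stop_gained", "missense", "IMPACTFUL_CUTOFF", "synonymous", "intron"]
adjusted = promote(rows, "synonymous")
print(adjusted)
```

After calling `write_order("/path/to/adjusted-order.txt", adjusted)`, the SLIVAR_IMPACTFUL_ORDER variable would be set to that path before running slivar.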

genic

slivar version 0.1.11 added INFO.genic, which includes (by default) all impactful impacts along with synonymous, gene, coding_sequence, mature_miRNA, 5_prime_UTR_premature_start_codon_gain_variant, 5_prime_UTR, 3_prime_UTR, initiator_codon, miRNA, non_coding_transcript_exon, non_coding_exon, nc_transcript, exon_region, and conserved_intron.

INFO.highest_impact_order

INFO.highest_impact_order is filled with an integer giving the index of the variant's highest impact in the default-order list, so lower values correspond to higher impacts. With this, it's possible to use an expression like:

INFO.highest_impact_order <= ImpactOrder.missense

to limit to variants of missense or higher. To get only variants whose highest impact is synonymous, use:

INFO.highest_impact_order == ImpactOrder.synonymous
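To make the index comparison concrete, here is a small Python sketch of the idea. It is not slivar code: the ordering is a shortened, hypothetical list (sentinel rows are left out for simplicity), and `IMPACT_ORDER` and `highest_impact_order` are names chosen for this example to mirror ImpactOrder and INFO.highest_impact_order.

```python
# Hypothetical, shortened ordering (highest impact first).
ORDER = ["frameshift", "stop_gained", "missense", "synonymous", "intron"]

# Plays the role of ImpactOrder: impact name -> index in the list.
IMPACT_ORDER = {name: i for i, name in enumerate(ORDER)}


def highest_impact_order(impacts):
    """Smallest index among a variant's impacts, i.e. its highest impact."""
    return min(IMPACT_ORDER[name] for name in impacts)


# A variant annotated as intronic in one transcript and missense in another.
variant = ["intron", "missense"]

# Equivalent in spirit to: INFO.highest_impact_order <= ImpactOrder.missense
print(highest_impact_order(variant) <= IMPACT_ORDER["missense"])

# Equivalent in spirit to: INFO.highest_impact_order == ImpactOrder.synonymous
print(highest_impact_order(variant) == IMPACT_ORDER["synonymous"])
```

Because the comparison uses the single highest impact, a variant that is synonymous in one transcript and missense in another would not match the equality test against synonymous.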