qual_classifier
Rob Flickenger edited this page Aug 9, 2021 · 1 revision
The biograph qual_classifier command assigns a genotype and a quality score to each variant, then filters calls against a quality threshold.
See Customizing the BioGraph Pipeline for an overview of how and when to use this command.
- --vcf: the input VCF.
- --model: the classifier model file. This is provided by Spiral Genetics and should match your version of BioGraph (for example, biograph_model-7.0.0.ml).
- --grm: the dataframe output from the truvari anno grm command. This is only required when running the quality score classifier.
- --out: the output VCF. If unspecified, the VCF will be written to STDOUT.
- --filter: Calls with a quality score lower than this will be removed from the output VCF.
- --lowqual_sv: Structural variants with a quality score lower than this will be included but marked lowq in the filter field.
- --lowqual_ao: SNPs and indels with a quality score lower than this will be included but marked lowq in the filter field. The ao is short for all others (non-SVs).
- --thresh_gt: Cutoff threshold for GT (default: 0.5)
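The way the three score thresholds interact can be sketched in Python. This is only an illustration, not BioGraph's implementation: the function name `filter_label` is hypothetical, the strict less-than comparisons follow the "lower than" wording above, the default values are copied from the `--help` output, and the `PASS` label for unflagged calls is an assumption.

```python
from typing import Optional

# Defaults copied from the `biograph qual_classifier --help` output.
DEFAULT_FILTER = 0.1
DEFAULT_LOWQUAL_SV = 0.352
DEFAULT_LOWQUAL_AO = 0.22


def filter_label(
    score: float,
    is_sv: bool,
    filter_min: float = DEFAULT_FILTER,
    lowqual_sv: float = DEFAULT_LOWQUAL_SV,
    lowqual_ao: float = DEFAULT_LOWQUAL_AO,
) -> Optional[str]:
    """Hypothetical helper: return None if the call would be dropped,
    otherwise the label it would carry in the filter field."""
    if score < filter_min:
        return None  # --filter: removed from the output VCF
    # Structural variants use --lowqual_sv; all others use --lowqual_ao.
    cutoff = lowqual_sv if is_sv else lowqual_ao
    if score < cutoff:
        return "lowq"  # kept, but flagged
    return "PASS"  # assumption: unflagged calls are labelled PASS
```

With the defaults, a score of 0.3 would be flagged lowq for a structural variant but not for a SNP or indel, because the two cutoffs differ.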
- --sample: When running on a multi-sample VCF, set --sample to choose the sample of interest.
- --clsf: The genotype and quality classifiers are both run by default. You can run just the GT classifier with --clsf GT, or just the quality classifier with --clsf Qual.
- --dataframe: A dataframe generated from the input VCF with bgvar2table.py. If not specified, a dataframe will automatically be created.
- --threads: Use the specified number of threads. By default, one thread is allocated per available processor.
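To check whether a VCF is multi-sample, and therefore needs --sample, the sample names can be read from its header. The sketch below relies only on the standard VCF layout, in which the `#CHROM` header line has nine fixed columns followed by one column per sample. The helper name `vcf_sample_names` and the sample names in the example are hypothetical.

```python
import io
from typing import Iterable, List

# A VCF header line has 9 fixed columns (CHROM .. FORMAT) before the samples.
N_FIXED_COLUMNS = 9


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample names found on the #CHROM header line."""
    for line in lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\r\n").split("\t")[N_FIXED_COLUMNS:]
        if not line.startswith("#"):
            break  # reached the records without finding a header line
    return []


# Example with an in-memory header; a real file would be opened with open().
header = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n"
)
samples = vcf_sample_names(header)
needs_sample_flag = len(samples) > 1
print(samples, needs_sample_flag)
```

If more than one name is returned, pass one of them to --sample.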
To see a list of all biograph qual_classifier options, use the --help switch:
$ biograph qual_classifier --help
usage: qual_classifier [-h] -v VCF -d DATAFRAME -m MODEL [-o OUT] [-x GRM]
                       [-f FILTER] [-s LOWQUAL_SV] [-a LOWQUAL_AO]
                       [--sample SAMPLE] [--tmp TMP] [-t THREADS]
                       [-g THRESH_GT] [-c {GT,Qual,Both}]

Classify VCF variants

optional arguments:
  -h, --help            show this help message and exit
  -v VCF, --vcf VCF     VCF to parse
  -d DATAFRAME, --dataframe DATAFRAME
                        Coverage DataFrame frame
  -m MODEL, --model MODEL
                        Model to apply to data
  -o OUT, --out OUT     VCF to output
  -x GRM, --grm GRM     DataFrame conaining grm features from truvari
  -f FILTER, --filter FILTER
                        Maximum threshold of calls to filter (0.1)
  -s LOWQUAL_SV, --lowqual_sv LOWQUAL_SV
                        Maximum threshold for calls to mark as lowqual_sv
                        (0.352)
  -a LOWQUAL_AO, --lowqual_ao LOWQUAL_AO
                        Maximum threshold for calls to mark as lowqual_ao
                        (0.22)
  --sample SAMPLE       Sample identifier (only required for multi-sample
                        VCFs)
  --tmp TMP             Temporary directory (/tmp)
  -t THREADS, --threads THREADS
                        Number of threads to use (48)
  -g THRESH_GT, --thresh_gt THRESH_GT
                        threshold for GT
  -c {GT,Qual,Both}, --clsf {GT,Qual,Both}
                        Flag for which classifiers to run (Both)
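When the command is driven from a script, the argument list can be assembled in Python before it is handed to `subprocess.run`. The sketch below only builds the list and does not run anything. The flag names and the `GT`, `Qual`, `Both` choices come from the help output above; the helper name `build_qual_classifier_cmd` and the file names are hypothetical. The check that --grm is present follows the note that it is only required when the quality classifier runs.

```python
from typing import List, Optional

CLSF_CHOICES = ("GT", "Qual", "Both")


def build_qual_classifier_cmd(
    vcf: str,
    dataframe: str,
    model: str,
    grm: Optional[str] = None,
    out: Optional[str] = None,
    sample: Optional[str] = None,
    clsf: str = "Both",
) -> List[str]:
    """Hypothetical helper: build an argument list for biograph qual_classifier."""
    if clsf not in CLSF_CHOICES:
        raise ValueError(f"clsf must be one of {CLSF_CHOICES}")
    if clsf != "GT" and grm is None:
        # The quality classifier needs the truvari anno grm dataframe.
        raise ValueError("grm is required unless clsf is 'GT'")
    cmd = [
        "biograph", "qual_classifier",
        "--vcf", vcf,
        "--dataframe", dataframe,
        "--model", model,
        "--clsf", clsf,
    ]
    # Optional flags are appended only when a value was given.
    for flag, value in (("--grm", grm), ("--out", out), ("--sample", sample)):
        if value is not None:
            cmd += [flag, value]
    return cmd


cmd = build_qual_classifier_cmd(
    vcf="input.vcf",                   # hypothetical file names
    dataframe="coverage.df",
    model="biograph_model-7.0.0.ml",
    grm="grm.df",
    out="classified.vcf",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where BioGraph is installed. Passing a list rather than a single string avoids shell quoting problems with unusual file names.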