Human metapneumovirus (HMPV) genome annotation


How to annotate HMPV genomes with VADR

Steps for using VADR for HMPV annotation:

  1. Download and install the latest version of VADR, following the instructions on this page. Alternatively, you can use the StaPH-B VADR 1.6.3-hav-flu2 Docker image created by Curtis Kapsak (image names: staphb/vadr:1.6.3-hav-flu2 and staphb/vadr:latest), available on Docker Hub and Quay. A brief README for the Docker image is here.

  2. Clone the latest HMPV VADR model library from this repository (current release v1.0):
    git clone git@github.com:greninger-lab/vadr-models-hmpv.git
    or download the current release from here.
    Note the path to the directory that this creates, plus its /hmpv subdirectory; this is the <hmpv-models-dir-path> used in step 4.

  3. Remove terminal ambiguous nucleotides from your input fasta sequence file using the fasta-trim-terminal-ambigs.pl script in $VADRSCRIPTSDIR/miniscripts/.

    To trim terminal ambiguous nucleotides from <input-fasta-file>, discard sequences shorter than 50 nt or longer than 16,000 nt, and write the result to <trimmed-fasta-file>, execute:

    $VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 16000 <input-fasta-file> > <trimmed-fasta-file>

  4. Run the v-annotate.pl program on the trimmed fasta file of HMPV sequences from step 3 using the recommended command below, passing <trimmed-fasta-file> as <fasta-file-to-annotate>:

    v-annotate.pl -r --mkey hmpv --mdir <hmpv-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>

  5. After running the v-annotate.pl command in step 4, a number of files will be generated in <output-directory-to-create>. Among them are 5-column tab-delimited feature table files ending with the suffix .tbl, with separate files for passing (XXXXX.vadr.pass.tbl) and failing (XXXXX.vadr.fail.tbl) sequences. The format of the .tbl files is described here: https://www.ncbi.nlm.nih.gov/genbank/feature_table/

    More information about understanding failures and error alerts can be found in the VADR documentation here: https://github.com/ncbi/vadr/blob/master/documentation/annotate.md
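Steps 3 to 5 can be chained in a small shell script. The following is a minimal sketch, not part of VADR: the function names, the `.trimmed` file suffix, and the check for an existing output directory are illustrative assumptions. The two commands themselves are the ones given in steps 3 and 4. The counting helper relies on each sequence's block in a feature table starting with a `>Feature` line.

```shell
#!/usr/bin/env bash
# Hypothetical helpers wrapping steps 3-5; the names are illustrative only.

# Steps 3 and 4: trim terminal ambiguous nucleotides, then annotate.
# Arguments: <input-fasta-file> <hmpv-models-dir-path> <output-directory-to-create>
run_hmpv_annotation() {
    local input_fasta="$1" models_dir="$2" outdir="$3"
    local trimmed_fasta="${input_fasta}.trimmed"   # assumed naming choice

    # Precaution (assumption): refuse to reuse an existing output directory.
    if [ -e "$outdir" ]; then
        echo "error: $outdir already exists" >&2
        return 1
    fi

    # Step 3: command taken from the instructions above.
    "$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl" \
        --minlen 50 --maxlen 16000 "$input_fasta" > "$trimmed_fasta" || return 1

    # Step 4: command taken from the instructions above.
    v-annotate.pl -r --mkey hmpv --mdir "$models_dir" "$trimmed_fasta" "$outdir"
}

# Count the sequences in a feature table file by counting ">Feature" lines.
count_tbl_sequences() {
    local tbl="$1"
    if [ -f "$tbl" ]; then
        # grep -c prints 0 but exits non-zero when nothing matches.
        grep -c '^>Feature' "$tbl" || true
    else
        echo 0
    fi
}

# Step 5: print "pass<TAB>N" and "fail<TAB>N" for an output directory.
summarize_vadr_output() {
    local outdir="$1" status tbl
    for status in pass fail; do
        for tbl in "$outdir"/*.vadr."$status".tbl; do
            [ -e "$tbl" ] || continue
            printf '%s\t%s\n' "$status" "$(count_tbl_sequences "$tbl")"
        done
    done
}
```

After sourcing the script, a run could look like `run_hmpv_annotation <input-fasta-file> <hmpv-models-dir-path> <output-directory-to-create> && summarize_vadr_output <output-directory-to-create>`.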


HMPV VADR models

  • The VADR model library for HMPV annotation includes 6 HMPV models representing 6 different subgroups: A1, A2a, A2b1, A2b2, B1 and B2.
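To check which models a downloaded library actually contains, the model names can be read from the model info file. This is a minimal sketch under two assumptions: that the model directory holds a file named `hmpv.minfo` (the name implied by `--mkey hmpv`), and that each model is declared on a line whose first field is `MODEL` followed by the model name. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list model names from <hmpv-models-dir-path>/hmpv.minfo.
# Assumes each model is declared on a line of the form: MODEL <name> key:"value" ...
list_vadr_models() {
    local models_dir="$1"
    awk '$1 == "MODEL" { print $2 }' "$models_dir/hmpv.minfo"
}
```

If the library matches the description above, `list_vadr_models <hmpv-models-dir-path>` should print six names, one per subgroup.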

Reference

  • The recommended citation for using VADR is: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi, Eric P Nawrocki; VADR: validation and annotation of virus sequence submissions to GenBank. BMC Bioinformatics 21, 211 (2020). https://doi.org/10.1186/s12859-020-3537-3

  • This page was adapted for HMPV from the Mpox virus annotation page.