Steve Bond edited this page Aug 17, 2017 · 2 revisions

--keep_taxa, -kt

Implemented in version 1.3

Description

Keep all records that have been annotated with a taxonomy that matches one or more of the taxa provided as arguments.

Argument

taxa ( str )

Specify one or more taxa designations as a space-separated list. SeqBuddy treats these designations as case insensitive, but they must otherwise match exactly (i.e., no regular expressions).
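The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour (case insensitive, otherwise exact, any one match is enough), not SeqBuddy's own code; the function name `matches_any` and the shortened list of designations are assumptions made for the example.

```python
def matches_any(designations, taxa):
    """Return True if any requested taxon equals one of the record's designations.

    Comparison ignores case but is otherwise exact: no substrings, no patterns.
    """
    wanted = {taxon.lower() for taxon in taxa}
    have = {name.lower() for name in designations}
    return bool(wanted & have)


# Hypothetical, shortened set of designations for the mallard record
mallard = ["Eukaryota", "Metazoa", "Chordata", "Aves", "Anatidae", "Anas", "platyrhynchos"]

print(matches_any(mallard, ["aves"]))              # True: case is ignored
print(matches_any(mallard, ["AVES", "Mammalia"]))  # True: one match is enough
print(matches_any(mallard, ["Av"]))                # False: partial names do not match
print(matches_any(mallard, ["A.*"]))               # False: patterns are taken literally
```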

Example

Input file: Caspase.gb

LOCUS       XP_012351229             462 aa            linear   PRI 13-MAY-2015
DEFINITION  PREDICTED: caspase-1 isoform X1 [Nomascus leucogenys].
ACCESSION   XP_012351229
VERSION     XP_012351229.1
DBLINK      BioProject: PRJNA62133
KEYWORDS    RefSeq.
SOURCE      Nomascus leucogenys (northern white-cheeked gibbon)
  ORGANISM  Nomascus leucogenys
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hylobatidae; Nomascus.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..462
                     /product="caspase-1 isoform X1"
                     /calculated_mol_wt="51972"
ORIGIN
        1 mdtlpdagys sdslmgrkik iciymhseyf pirvqalpkr kakhtfsfsh trreerkama
       61 dkvlkekrkl firsmgegti nglldellqt rvlnqeemek vkrenatvmd ktralidsvt
      121 pkgaqacqic ityiceedky laqtlglsad qtsgnslnmq dsqgvlssfp alqavqdnpt
      181 mptssgsegn vklcsleeaq riwkekpvei ypimdkssrt rlaliicnee fdtlprrtga
      241 evdiagmtml lqnlgysvdv kknlaasdmt telqafahrp ehktsdstfl vfmshgileg
      301 icgkkhseqv pdvlqlnaif kmlntkncps lkdkpkviii qacrgdgygv vwlkdsagvs
      361 rnvslpttee feddaikkah iekdfiafcs stpdnvswrh ptkgsvfimr liehlqeyac
      421 scdveeifrk vrfsfeqpdg raqmptterv tltrcfylfp gh
//
LOCUS       XP_012950190             482 aa            linear   VRT 24-MAY-2017
DEFINITION  caspase-8 isoform X1 [Anas platyrhynchos].
ACCESSION   XP_012950190
VERSION     XP_012950190.1
DBLINK      BioProject: PRJNA208071
KEYWORDS    RefSeq.
SOURCE      Anas platyrhynchos (mallard)
  ORGANISM  Anas platyrhynchos
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
            Coelurosauria; Aves; Neognathae; Galloanserae; Anseriformes;
            Anatidae; Anas.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..482
                     /product="caspase-8 isoform X1"
                     /calculated_mol_wt="54406"
ORIGIN
        1 mefsrqlyai sealdraela alkflslehv pvrkqeaiee pkaffqvlle kgmieagdla
       61 flrellyrin rmdllaaqlg ssreemerel qipgrarvsq fryllfqlse nitkeemkcf
      121 kfllgkelpk cklspettml dvfiemekkg ilgednltvl ktlcekidks llkkieeyel
      181 nlfgeeemlv tegqrsstev teacprllas svardspgsc dqssqleayk mtsrprgvcl
      241 ilnnhnfaka rkavpelkkm ndrngtdvda aalskvfgtl hfiikeykdl taeeirkivn
      301 iyrcqdhndk dcfvccvlsh gkkgviygvd gqevriqelt tsftgqnchs lagkpkvffv
      361 qacqgdarqk gvtietdsge qdssleadar fqlecipsea dfllgmatlq dyvsyrsssq
      421 gswyiqslcq hlenscprge diltiltavn qevsrkidkq naakqmpqps ftlrkklifp
      481 vn
//
LOCUS       XP_015128433             266 aa            linear   VRT 04-JAN-2016
DEFINITION  PREDICTED: uncharacterized protein LOC776274 isoform X1 [Gallus
            gallus].
ACCESSION   XP_015128433
VERSION     XP_015128433.1
DBLINK      BioProject: PRJNA10808
KEYWORDS    RefSeq.
SOURCE      Gallus gallus (chicken)
  ORGANISM  Gallus gallus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
            Coelurosauria; Aves; Neognathae; Galloanserae; Galliformes;
            Phasianidae; Phasianinae; Gallus.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..266
                     /product="uncharacterized protein LOC776274 isoform X1"
                     /calculated_mol_wt="29124"
ORIGIN
        1 msrprqsral iivntdfcss dgdvglrprr garreaekls rvlaqlsyrv kllhnrtake
       61 medlyqqecs rehgdyfvsv isshgeegav lgcdcrplrl trifhivsaq ncpalaerpk
      121 vffiqacrga aldqgvfvet dsgqpepasf seylhippnt avmfacspgy gaflnpagsm
      181 flqallamla geerclalsr matrlnaava lgcqargtye gckqmpcfvt nlprdifpfs
      241 aqseplpstd tqggmeeeer qkptas
//
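
In the records above, the taxonomy annotation sits in the ORGANISM block: the organism name on the first line, followed by the semicolon-separated lineage on indented continuation lines. The following is a rough sketch of how those designations could be collected with plain Python. The helper `organism_designations` is hypothetical, it assumes the organism name fits on one line, and it is not how SeqBuddy itself reads the file.

```python
def organism_designations(record_text):
    """Collect organism name words and lineage names from one GenBank record."""
    designations = []
    in_block = False
    for line in record_text.splitlines():
        if line.startswith("  ORGANISM"):
            in_block = True
            designations.extend(line.split()[1:])  # genus and species words
        elif in_block and line.startswith(" " * 12):
            # Continuation lines hold the semicolon-separated lineage
            names = line.strip().rstrip(".").split(";")
            designations.extend(name.strip() for name in names if name.strip())
        elif in_block:
            break  # the next top-level keyword (e.g. COMMENT) ends the block
    return designations


# Excerpt of the mallard record from Caspase.gb
record = "\n".join([
    "SOURCE      Anas platyrhynchos (mallard)",
    "  ORGANISM  Anas platyrhynchos",
    "            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;",
    "            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;",
    "            Coelurosauria; Aves; Neognathae; Galloanserae; Anseriformes;",
    "            Anatidae; Anas.",
    "COMMENT     COMPLETENESS: full length.",
])

names = organism_designations(record)
print(names[:3])        # ['Anas', 'platyrhynchos', 'Eukaryota']
print("Aves" in names)  # True
```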

Usage

Usage example 1

A single species name is given, returning a single record.

$: sb Caspase.gb -kt leucogenys

Output
LOCUS       XP_012351229             462 aa            linear   PRI 13-MAY-2015
DEFINITION  PREDICTED: caspase-1 isoform X1 [Nomascus leucogenys].
ACCESSION   XP_012351229
VERSION     XP_012351229.1
DBLINK      BioProject: PRJNA62133
KEYWORDS    RefSeq.
SOURCE      Nomascus leucogenys (northern white-cheeked gibbon)
  ORGANISM  Nomascus leucogenys
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hylobatidae; Nomascus.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..462
                     /product="caspase-1 isoform X1"
                     /calculated_mol_wt="51972"
ORIGIN
        1 mdtlpdagys sdslmgrkik iciymhseyf pirvqalpkr kakhtfsfsh trreerkama
       61 dkvlkekrkl firsmgegti nglldellqt rvlnqeemek vkrenatvmd ktralidsvt
      121 pkgaqacqic ityiceedky laqtlglsad qtsgnslnmq dsqgvlssfp alqavqdnpt
      181 mptssgsegn vklcsleeaq riwkekpvei ypimdkssrt rlaliicnee fdtlprrtga
      241 evdiagmtml lqnlgysvdv kknlaasdmt telqafahrp ehktsdstfl vfmshgileg
      301 icgkkhseqv pdvlqlnaif kmlntkncps lkdkpkviii qacrgdgygv vwlkdsagvs
      361 rnvslpttee feddaikkah iekdfiafcs stpdnvswrh ptkgsvfimr liehlqeyac
      421 scdveeifrk vrfsfeqpdg raqmptterv tltrcfylfp gh
//

Usage example 2

Two species names are given, returning two records.

$: sb Caspase.gb -kt leucogenys platyrhynchos

Output
LOCUS       XP_012351229             462 aa            linear   PRI 13-MAY-2015
DEFINITION  PREDICTED: caspase-1 isoform X1 [Nomascus leucogenys].
ACCESSION   XP_012351229
VERSION     XP_012351229.1
DBLINK      BioProject: PRJNA62133
KEYWORDS    RefSeq.
SOURCE      Nomascus leucogenys (northern white-cheeked gibbon)
  ORGANISM  Nomascus leucogenys
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hylobatidae; Nomascus.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..462
                     /product="caspase-1 isoform X1"
                     /calculated_mol_wt="51972"
ORIGIN
        1 mdtlpdagys sdslmgrkik iciymhseyf pirvqalpkr kakhtfsfsh trreerkama
       61 dkvlkekrkl firsmgegti nglldellqt rvlnqeemek vkrenatvmd ktralidsvt
      121 pkgaqacqic ityiceedky laqtlglsad qtsgnslnmq dsqgvlssfp alqavqdnpt
      181 mptssgsegn vklcsleeaq riwkekpvei ypimdkssrt rlaliicnee fdtlprrtga
      241 evdiagmtml lqnlgysvdv kknlaasdmt telqafahrp ehktsdstfl vfmshgileg
      301 icgkkhseqv pdvlqlnaif kmlntkncps lkdkpkviii qacrgdgygv vwlkdsagvs
      361 rnvslpttee feddaikkah iekdfiafcs stpdnvswrh ptkgsvfimr liehlqeyac
      421 scdveeifrk vrfsfeqpdg raqmptterv tltrcfylfp gh
//
LOCUS       XP_012950190             482 aa            linear   VRT 24-MAY-2017
DEFINITION  caspase-8 isoform X1 [Anas platyrhynchos].
ACCESSION   XP_012950190
VERSION     XP_012950190.1
DBLINK      BioProject: PRJNA208071
KEYWORDS    RefSeq.
SOURCE      Anas platyrhynchos (mallard)
  ORGANISM  Anas platyrhynchos
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
            Coelurosauria; Aves; Neognathae; Galloanserae; Anseriformes;
            Anatidae; Anas.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..482
                     /product="caspase-8 isoform X1"
                     /calculated_mol_wt="54406"
ORIGIN
        1 mefsrqlyai sealdraela alkflslehv pvrkqeaiee pkaffqvlle kgmieagdla
       61 flrellyrin rmdllaaqlg ssreemerel qipgrarvsq fryllfqlse nitkeemkcf
      121 kfllgkelpk cklspettml dvfiemekkg ilgednltvl ktlcekidks llkkieeyel
      181 nlfgeeemlv tegqrsstev teacprllas svardspgsc dqssqleayk mtsrprgvcl
      241 ilnnhnfaka rkavpelkkm ndrngtdvda aalskvfgtl hfiikeykdl taeeirkivn
      301 iyrcqdhndk dcfvccvlsh gkkgviygvd gqevriqelt tsftgqnchs lagkpkvffv
      361 qacqgdarqk gvtietdsge qdssleadar fqlecipsea dfllgmatlq dyvsyrsssq
      421 gswyiqslcq hlenscprge diltiltavn qevsrkidkq naakqmpqps ftlrkklifp
      481 vn
//

Usage example 3

A single higher-level taxonomic designation (the class Aves) is given, returning both bird records.

$: sb Caspase.gb -kt Aves

Output
LOCUS       XP_012950190             482 aa            linear   VRT 24-MAY-2017
DEFINITION  caspase-8 isoform X1 [Anas platyrhynchos].
ACCESSION   XP_012950190
VERSION     XP_012950190.1
DBLINK      BioProject: PRJNA208071
KEYWORDS    RefSeq.
SOURCE      Anas platyrhynchos (mallard)
  ORGANISM  Anas platyrhynchos
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
            Coelurosauria; Aves; Neognathae; Galloanserae; Anseriformes;
            Anatidae; Anas.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..482
                     /product="caspase-8 isoform X1"
                     /calculated_mol_wt="54406"
ORIGIN
        1 mefsrqlyai sealdraela alkflslehv pvrkqeaiee pkaffqvlle kgmieagdla
       61 flrellyrin rmdllaaqlg ssreemerel qipgrarvsq fryllfqlse nitkeemkcf
      121 kfllgkelpk cklspettml dvfiemekkg ilgednltvl ktlcekidks llkkieeyel
      181 nlfgeeemlv tegqrsstev teacprllas svardspgsc dqssqleayk mtsrprgvcl
      241 ilnnhnfaka rkavpelkkm ndrngtdvda aalskvfgtl hfiikeykdl taeeirkivn
      301 iyrcqdhndk dcfvccvlsh gkkgviygvd gqevriqelt tsftgqnchs lagkpkvffv
      361 qacqgdarqk gvtietdsge qdssleadar fqlecipsea dfllgmatlq dyvsyrsssq
      421 gswyiqslcq hlenscprge diltiltavn qevsrkidkq naakqmpqps ftlrkklifp
      481 vn
//
LOCUS       XP_015128433             266 aa            linear   VRT 04-JAN-2016
DEFINITION  PREDICTED: uncharacterized protein LOC776274 isoform X1 [Gallus
            gallus].
ACCESSION   XP_015128433
VERSION     XP_015128433.1
DBLINK      BioProject: PRJNA10808
KEYWORDS    RefSeq.
SOURCE      Gallus gallus (chicken)
  ORGANISM  Gallus gallus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
            Coelurosauria; Aves; Neognathae; Galloanserae; Galliformes;
            Phasianidae; Phasianinae; Gallus.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..266
                     /product="uncharacterized protein LOC776274 isoform X1"
                     /calculated_mol_wt="29124"
ORIGIN
        1 msrprqsral iivntdfcss dgdvglrprr garreaekls rvlaqlsyrv kllhnrtake
       61 medlyqqecs rehgdyfvsv isshgeegav lgcdcrplrl trifhivsaq ncpalaerpk
      121 vffiqacrga aldqgvfvet dsgqpepasf seylhippnt avmfacspgy gaflnpagsm
      181 flqallamla geerclalsr matrlnaava lgcqargtye gckqmpcfvt nlprdifpfs
      241 aqseplpstd tqggmeeeer qkptas
//

Usage example 4

Sometimes you may want to constrain a search to records matching ALL of several taxonomic designations (e.g., if multiple distinct species share the same species name, you may want to match genus AND species). To do so, pipe together multiple calls to --keep_taxa.

$: sb Caspase.gb -kt Galloanserae | sb -kt gallus

Output
LOCUS       XP_015128433             266 aa            linear   VRT 04-JAN-2016
DEFINITION  PREDICTED: uncharacterized protein LOC776274 isoform X1 [Gallus
            gallus].
ACCESSION   XP_015128433
VERSION     XP_015128433.1
DBLINK      BioProject: PRJNA10808
KEYWORDS    RefSeq.
SOURCE      Gallus gallus (chicken)
  ORGANISM  Gallus gallus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
            Coelurosauria; Aves; Neognathae; Galloanserae; Galliformes;
            Phasianidae; Phasianinae; Gallus.
COMMENT     COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     Protein         1..266
                     /product="uncharacterized protein LOC776274 isoform X1"
                     /calculated_mol_wt="29124"
ORIGIN
        1 msrprqsral iivntdfcss dgdvglrprr garreaekls rvlaqlsyrv kllhnrtake
       61 medlyqqecs rehgdyfvsv isshgeegav lgcdcrplrl trifhivsaq ncpalaerpk
      121 vffiqacrga aldqgvfvet dsgqpepasf seylhippnt avmfacspgy gaflnpagsm
      181 flqallamla geerclalsr matrlnaava lgcqargtye gckqmpcfvt nlprdifpfs
      241 aqseplpstd tqggmeeeer qkptas
//
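
The difference between listing several taxa in one call (match ANY) and piping calls together (match ALL) can be sketched as follows. This is a simplified model for illustration only: the `keep_taxa` function, the dictionary layout, and the abbreviated designations are assumptions, not SeqBuddy internals.

```python
def keep_taxa(records, taxa):
    """Keep records with at least one designation equal to a requested taxon."""
    wanted = {taxon.lower() for taxon in taxa}
    return [rec for rec in records
            if wanted & {name.lower() for name in rec["designations"]}]


# Abbreviated stand-ins for the three records in Caspase.gb
records = [
    {"id": "XP_012351229", "designations": ["Mammalia", "Primates", "Nomascus", "leucogenys"]},
    {"id": "XP_012950190", "designations": ["Aves", "Galloanserae", "Anas", "platyrhynchos"]},
    {"id": "XP_015128433", "designations": ["Aves", "Galloanserae", "Gallus", "gallus"]},
]

# One call with two taxa: a record is kept if it matches either one
either = keep_taxa(records, ["Galloanserae", "gallus"])

# Two chained calls, as with the pipe: a record must survive both filters
both = keep_taxa(keep_taxa(records, ["Galloanserae"]), ["gallus"])

print([rec["id"] for rec in either])  # ['XP_012950190', 'XP_015128433']
print([rec["id"] for rec in both])    # ['XP_015128433']
```

Chaining narrows the result at each step, which is why the piped command above returns only the Gallus gallus record.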
