|
| 1 | + |
| 2 | +***************************************************************************** |
| 3 | + EVALYN -- EVolved ALYNments |
| 4 | +***************************************************************************** |
| 5 | + |
| 6 | +Copyright (C) 2006 Luke Sheneman |
| 7 | + |
| 8 | +A GA for iteratively refining guide trees by evolutionary |
| 9 | +computation for use in progressive multiple sequence alignment as |
| 10 | +presented in: |
| 11 | + |
| 12 | + Sheneman, L., J.A. Foster (2004) Evolving Better Multiple Sequence |
| 13 | + Alignments, Proceedings of the Genetic and Evolutionary Computation |
| 14 | + Conference (GECCO 2004), Seattle, WA. |
| 15 | + |
| 16 | +***************************************************************************** |
| 17 | + |
| 18 | +This version of EVALYN is not quite ready for public consumption. It has |
| 19 | +no useful help messages or documentation. Some critical functionality is |
| 20 | +also missing, such as the ability to properly handle ambiguity |
| 21 | +codes in input sequences. These will be added in future releases. |
| 22 | + |
| 23 | +EVALYN reads DNA or protein sequences in FASTA format |
| 24 | +and outputs an alignment in Clustal W (*.ALN) format. It also outputs |
| 25 | +the best guide tree in Newick format. EVALYN maintains a |
| 26 | +population of guide trees and iteratively evolves them to improve |
| 27 | +alignments as measured by a sum-of-pairs fitness function. |
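
For readers unfamiliar with the term, a sum-of-pairs score adds up the pairwise score of every pair of rows in each alignment column. The sketch below is illustrative only: the function name `sum_of_pairs`, the match/mismatch values, and the flat per-gap penalty are assumptions, not EVALYN's actual scoring, which uses a substitution matrix and separate gap-open and gap-extend costs.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1.0, mismatch=-1.0, gap=-1.0):
    """Score an alignment (a list of equal-length strings) column by column.

    Hypothetical scoring, for illustration only: identical residues score
    `match`, differing residues score `mismatch`, a residue paired with
    '-' scores `gap`, and two gaps paired together contribute nothing.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned rows must have the same length")

    total = 0.0
    # zip(*alignment) walks the alignment one column at a time.
    for column in zip(*alignment):
        # Every unordered pair of rows contributes once per column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```

In a GA of the kind described above, a score like this would be computed for the alignment produced from each guide tree, and trees yielding higher scores would be favored.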
| 28 | + |
| 29 | + |
| 30 | +TO BUILD EVALYN: |
| 31 | +---------------- |
| 32 | + |
| 33 | +cd ./ltree |
| 34 | +make |
| 35 | + |
| 36 | +The executable is called "ltree" and resides in the <evalyndist>/ltree |
| 37 | +directory. |
| 38 | + |
| 39 | + |
| 40 | +EXAMPLE USAGE: |
| 41 | +-------------- |
| 42 | + |
| 43 | +./ltree --infile=proteins.fasta --population=1000 --iterations=10000 \ |
| 44 | + --matrix=blosum62.txt --gapopen=-1.0 --gapextend=-0.1 --rnj |
| 45 | + |
| 46 | +Example substitution matrices are provided in <evalyndist>/ltree as |
| 47 | +"def_dna_matrix.txt" and "def_pro_matrix.txt". These are also the |
| 48 | +defaults used when no matrix is explicitly specified. |
| 49 | + |
| 50 | + |
| 51 | +COMMON COMMAND-LINE FLAGS: |
| 52 | +-------------------------- |
| 53 | + |
| 54 | +** Inputs requiring arguments: |
| 55 | + |
| 56 | + --infile = <fasta-formatted input file> |
| 57 | + --outfile = <name of output file> |
| 58 | + --treefile = <name of output Newick-formatted tree file> |
| 59 | + --matrix = <name of substitution matrix> |
| 60 | + --population = <population size, ex. "--population=500"> |
| 61 | + --iterations = <number of iterations to run, ex. "--iterations=1000"> |
| 62 | + --converge = <program stops when there is no improvement in x steps> |
| 63 | + --mutation = <mutation rate, ex. "--mutation=0.01"> |
| 64 | + --gapopen = <cost of opening a gap region, ex. "--gapopen=-4.0"> |
| 65 | + --gapextend = <cost of extending a gap region, ex. "--gapextend=-1.0"> |
| 66 | + --seed = <the random number generator seed, ex. "--seed=1000"> |
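
The --gapopen and --gapextend values together describe an affine gap penalty, in which starting a gap region costs more than lengthening one. A minimal sketch, assuming the common convention that the first gap position is charged the open cost and each further position the extend cost. EVALYN's exact convention is not documented here, and the names `gap_cost` and `row_gap_cost` are hypothetical.

```python
from itertools import groupby


def gap_cost(length, gap_open=-4.0, gap_extend=-1.0):
    """Cost of one gap region of `length` positions (assumed convention:
    gap_open for the first position, gap_extend for each one after it)."""
    if length < 0:
        raise ValueError("gap length cannot be negative")
    if length == 0:
        return 0.0
    return gap_open + (length - 1) * gap_extend


def row_gap_cost(row, gap_open=-4.0, gap_extend=-1.0):
    """Total gap cost of one aligned row, summed over its runs of '-'."""
    total = 0.0
    # groupby splits the row into maximal runs of identical characters.
    for char, run in groupby(row):
        if char == "-":
            total += gap_cost(len(list(run)), gap_open, gap_extend)
    return total


if __name__ == "__main__":
    print(gap_cost(3))
    print(row_gap_cost("AC---GT-A"))
```

Under this convention one gap of three positions costs less than three separate one-position gaps, which is why the open cost is normally set larger in magnitude than the extend cost.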
| 67 | + |
| 68 | + |
| 69 | + |
| 70 | +** Inputs requiring NO arguments |
| 71 | + |
| 72 | + --rnj : seeds population with a relaxed neighbor joining tree |
| 73 | + --dna : specifies that input is a DNA sequence |
| 74 | + --protein : specifies that input is a protein sequence |
| 75 | + |
| 76 | +**************************************************************************** |
| 77 | + |
| 78 | +Please direct questions to: |
| 79 | + |
| 80 | +Luke Sheneman |
| 81 | +sheneman@cs.uidaho.edu |
| 82 | + |
| 83 | +University of Idaho |
| 84 | + |
| 85 | +**************************************************************************** |
| 86 | + |