Task_Jan_20

A small exercise to get familiar with how sequences and alignments are stored / represented

align_seq.sh Script Overview

Align sequences by passing the fasta file to be aligned as the first argument
The program asks the user to choose between fasta and clustal format
A file is created for the aligned sequences in the chosen format
The file has the same base name as the input file
_aln.fasta is appended if fasta format was requested
.aln is appended if clustal format was requested
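
The naming rules above can be sketched in Python. This is only an illustration, not the contents of align_seq.sh: the function names `output_path` and `mafft_command` are hypothetical, and the use of MAFFT's `--auto` option is an assumption about how the script might call MAFFT (`--clustalout` is the MAFFT option that switches the output to clustal format).

```python
from pathlib import Path


def output_path(input_file: str, fmt: str) -> Path:
    """Derive the output file name from the input file and the chosen format."""
    src = Path(input_file)
    base = src.stem  # "FOXP2.fasta" -> "FOXP2"
    if fmt == "fasta":
        return src.with_name(f"{base}_aln.fasta")
    if fmt == "clustal":
        return src.with_name(f"{base}.aln")
    raise ValueError(f"unknown format: {fmt!r} (expected 'fasta' or 'clustal')")


def mafft_command(input_file: str, fmt: str) -> list[str]:
    """Build a MAFFT command line (assumed flags; the real script may differ)."""
    cmd = ["mafft", "--auto"]
    if fmt == "clustal":
        cmd.append("--clustalout")  # MAFFT writes fasta by default
    cmd.append(input_file)
    return cmd


if __name__ == "__main__":
    print(output_path("FOXP2.fasta", "fasta"))
    print(output_path("FOXP2.fasta", "clustal"))
    print(" ".join(mafft_command("FOXP2.fasta", "clustal")))
```

MAFFT writes the alignment to standard output, so a caller would redirect that output to the path returned by `output_path`.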

Demo Files

Three sequence files appear in the repository:

FOXP2.fasta A file of sequences downloaded from NCBI
FOXP2_aln.fasta The output of running "align_seq.sh FOXP2" and not opting for clustal format
FOXP2.aln The output of running "align_seq.sh FOXP2" and opting for clustal format

Storage of Fasta and clustal files

Fasta
- Starts with a header line which always begins with > followed by an identifier
- Then the sequence
Clustal
- Header line which describes the alignment
- An identifier at the beginning of each line, and the aligned sequences on the right
- Each column represents the same position in all of the sequences
- Dashes (-) mark gaps: positions where a sequence has no residue aligned with the other sequences
- The goal is to identify conserved regions and variable regions
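
As a rough illustration of the two layouts described above, the sketch below reads fasta records into a dictionary and marks fully conserved columns of an alignment with `*`, in the style of the last line of a clustal block. The function names `parse_fasta` and `conservation_line` are hypothetical and are not taken from the scripts in this repository.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each identifier (first word after '>') to its sequence."""
    records: dict[str, str] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            # A sequence may be wrapped over several lines; join them.
            records[name] += line
    return records


def conservation_line(aligned: list[str]) -> str:
    """Return '*' for columns where every sequence has the same residue."""
    marks = []
    for column in zip(*aligned):
        conserved = len(set(column)) == 1 and "-" not in column
        marks.append("*" if conserved else " ")
    return "".join(marks)


if __name__ == "__main__":
    print(parse_fasta(">seq1\nACGT\nAAAT\n>seq2\nAAAC\n"))
    print(repr(conservation_line(["--acgtaaattaaa", "aaacgtaaatta--"])))
```

Real clustal output also uses `:` and `.` to mark similar residues in protein alignments; this sketch only handles exact matches.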

Setup Instructions

1. Python environment setup and installing dependencies

To keep dependencies isolated, create a virtual environment:

# Create a virtual environment in the .venv folder
python -m venv .venv

# Activate the virtual environment
source .venv/bin/activate

# Install the dependencies listed in requirements.txt while the virtual environment is active
pip install -r requirements.txt

# Deactivate the virtual environment when you are done
deactivate

2. MAFFT Installation

MAFFT is required to use the sequence alignment tools in this project. You can install MAFFT on macOS via Homebrew:

# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install MAFFT
brew install mafft
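
To confirm from Python that the installation worked, one option is to check whether a `mafft` executable is on the PATH. This is a minimal sketch; the helper name `find_mafft` is hypothetical and not part of this repository.

```python
import shutil
from typing import Optional


def find_mafft() -> Optional[str]:
    """Return the full path of the mafft executable, or None if it is not on PATH."""
    return shutil.which("mafft")


if __name__ == "__main__":
    path = find_mafft()
    if path is None:
        print("mafft not found; install it before running the alignment tools")
    else:
        print(f"mafft found at {path}")
```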

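How MAFFT Aligns Circularly Permuted Sequences

The following example demonstrates how MAFFT fails to produce the best alignment of circularly permuted sequences. seq2 is seq1 rotated by two positions, so a circular alignment would match all 12 positions, but the linear alignment matches only 10 and pads both ends with gaps.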

Input

> seq1
ACGTAAATTAAA
> seq2
AAACGTAAATTA

Output:

seq1            --acgtaaattaaa
seq2            aaacgtaaatta--
                  **********
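
One way to see what a circular alignment would recover is to look for the rotation that turns one sequence into the other. The sketch below handles only exact rotations of equal-length sequences and is not necessarily the approach taken in circ_align.py; the function names `rotation_offset` and `rotate` are hypothetical.

```python
from typing import Optional


def rotation_offset(reference: str, query: str) -> Optional[int]:
    """Return k such that rotating query left by k gives reference, else None."""
    if len(reference) != len(query):
        return None
    # Every rotation of query appears as a substring of query + query.
    index = (query + query).find(reference)
    return index if index != -1 else None


def rotate(seq: str, offset: int) -> str:
    """Rotate seq left by offset positions."""
    if not seq:
        return seq
    offset %= len(seq)
    return seq[offset:] + seq[:offset]


if __name__ == "__main__":
    seq1 = "ACGTAAATTAAA"
    seq2 = "AAACGTAAATTA"
    k = rotation_offset(seq1, seq2)
    print(k)                 # 2
    print(rotate(seq2, k))   # ACGTAAATTAAA
```

After rotating seq2 by the offset found, the two sequences are identical, so a linear aligner such as MAFFT would align them without gaps. Sequences that differ by mutations as well as rotation need a more tolerant search than exact substring matching.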
