Workflows for genome alignment chain file selection and annotation tasks

BrendelGroup/GAinSAW

GAinSAW - Genome Alignment chain file Selection and Annotation Workbench

Continuing improvements in DNA sequencing technologies and whole genome assembly strategies are enabling large-scale comparative genomics studies that probe how regions in one ("query") genome map to other ("target") genomes. Because of the large size of the genomes (compared to sequences in traditional multiple sequence alignments) and potentially complex genome rearrangements, including translocations, inversions, and duplications, such studies pose difficult computational problems. Progressive Cactus provides a state-of-the-art solution.

Results of pairwise genome alignments are typically represented in the UCSC Chain Format. The UCSC LiftOver Tool can be used to map query genome coordinates to target genome coordinates, as long as the query coordinates fall into an alignment block in the input chain file. The figure below depicts points that can be lifted (long arrows) and points that fall into query-unique segments (short arrows).

[Figure: PoinSetConservation]
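The liftover step described above can be sketched with the UCSC liftOver command-line tool. This is a minimal illustration, not part of GAinSAW: the file names points.bed, query_to_target.chain, lifted.bed, and unlifted.bed and the two example points are hypothetical placeholders, and the sketch simply skips the liftover if the tool or the chain file is missing. It assumes BED input, in which a single point at 1-based position p is written as the 0-based, half-open interval from p-1 to p.

```shell
# Hypothetical example points (placeholders, not real data):
# BED columns are chrom, start (0-based), end (exclusive), name.
printf 'chr1\t999\t1000\tpointA\nchr1\t49999\t50000\tpointB\n' > points.bed

# query_to_target.chain is a placeholder name for a chain file you supply.
if command -v liftOver >/dev/null 2>&1 && [ -f query_to_target.chain ]; then
    # liftOver usage: liftOver oldFile map.chain newFile unMapped
    liftOver points.bed query_to_target.chain lifted.bed unlifted.bed
    # Points inside an alignment block end up in lifted.bed;
    # the rest are reported (with '#' comment lines) in unlifted.bed.
    echo "lifted: $(wc -l < lifted.bed), not lifted: $(grep -vc '^#' unlifted.bed)"
else
    echo "liftOver or the chain file is not available; skipping"
fi
```

Points that fall into query-unique segments are the ones expected to appear in the unmapped output file.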

Chained alignments are successful as a tool for discovering genomic synteny, but they are not recommended for SNP liftover. By construction, chained alignments represent best-guess global alignments, so orthologous points can be determined with confidence only where the local alignment around each point is unambiguous.

GAinSAW was developed as a tool to lift over sets of points from a query genome to a target genome and then filter the points by annotation and alignment confidence criteria. The code conforms to our RAMOSE philosophy: it generates reproducible, accurate, and meaningful results; it is open (source) and designed to be scalable and easy to use.

Quick Start

The simplest way to get going is to use the GAinSAW Singularity container available from our Singularity Hub site; e.g.:

cd
git clone https://github.com/vpbrendel/GAinSAW
cd GAinSAW
wget https://BrendelGroup.org/SingularityHub/GAinSAW.sif
alias rws="singularity exec -e -B ~/GAinSAW  ~/GAinSAW/GAinSAW.sif"
rws xgainsaw -h

In the above example, you clone this repository into your Linux home directory, change into the newly created GAinSAW directory, download the GAinSAW Singularity container, define the bash alias rws ("run with singularity"), and check that everything works by displaying the help information for xgainsaw. The alias runs a command inside the container with a clean environment (-e) and with the ~/GAinSAW directory bind-mounted (-B).

This assumes that Apptainer/Singularity is installed on your system. Check whether a package is available for your system; otherwise, follow the instructions to install Singularity from source code.
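A quick way to check for the container runtime before running the commands above is sketched here. It only looks for an apptainer or singularity executable on your PATH and prints its version; the variable name runtime is chosen for illustration.

```shell
# Look for either runtime on PATH, preferring apptainer if both exist.
runtime=""
for candidate in apptainer singularity; do
    if command -v "$candidate" >/dev/null 2>&1; then
        runtime="$candidate"
        break
    fi
done

if [ -n "$runtime" ]; then
    # Report which runtime was found and its version.
    "$runtime" --version
else
    echo "Neither apptainer nor singularity was found on PATH"
fi
```

If neither command is found, install a packaged version or build from source before defining the rws alias.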

Realistic Start

Please find detailed installation instructions and options in the INSTALL document. Once all preparatory steps are taken care of, see the HOWTO document for usage examples.

Reference

Wenshu Chen and Volker P. Brendel (2023) GAinSAW: Genome Alignment chain file Selection and Annotation Workbench; in preparation

Contact

Please direct all comments and suggestions to Wenshu Chen or Volker Brendel at Indiana University.
