LevioSAM2 lifts over alignments accurately and efficiently using a chain file.
- Converting aligned short- and long-read records (in SAM/BAM/CRAM format) from one reference to another
- Comprehensive alignment feature updating during lift-over:
  - Reference name (RNAME), position (POS), alignment flag (FLAG), and CIGAR alignment string (CIGAR)
  - Mate read information (RNEXT, PNEXT, TLEN)
  - (optional) Alignment tags (MD:Z, NM:i)
- Multithreading support
- Toolkit for "selective" pipelines which consider major changes between the source and target references
- (beta) Converting intervals (in BED format) from one reference to another
LevioSAM2 can be installed using:
- Conda:
  # The following commands install leviosam2 in a new conda environment called `leviosam2`
  conda create -n leviosam2
  conda activate leviosam2
  conda install -c bioconda -c conda-forge leviosam2
- Docker:
  docker pull naechyun/leviosam2:latest
- Singularity:
  singularity pull docker://naechyun/leviosam2:latest
- Built from source using CMake. See INSTALL.md for details.
LevioSAM2 performs lift-over using a chain file as the lift-over map. Many chain files are provided by the UCSC Genome Browser, e.g. GRCh38-related chains. For other reference pairs, common ways to generate chain files include using the UCSC recipe and nf-LO.
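To make the role of the chain file concrete, the sketch below lifts a single 0-based position through a chain using plain awk. It only illustrates the UCSC chain format (a header line followed by `size dt dq` blocks) and is not levioSAM2's implementation. The function name `lift_pos`, the toy contigs `chrA`/`chrB`, and the demo file are hypothetical, and the sketch assumes forward-strand chains only.

```shell
# Hypothetical helper: lift one 0-based position through a chain file.
# Usage: lift_pos <chain file> <source contig> <0-based position>
# Assumes both strands are "+"; chains with a "-" strand are skipped.
lift_pos() {
    awk -v contig="$2" -v pos="$3" '
        $1 == "chain" {
            # Header: chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
            active = ($3 == contig && $5 == "+" && $10 == "+")
            t = $6 + 0; q = $11 + 0; qname = $8
            next
        }
        active && NF >= 1 {
            size = $1 + 0
            # Inside an ungapped block, the offset from the block start is preserved.
            if (pos + 0 >= t && pos + 0 < t + size) {
                printf "%s\t%d\n", qname, q + (pos - t)
                found = 1
                exit
            }
            # Advance past the block and the gaps that follow it (dt on source, dq on destination).
            t += size + $2
            q += size + $3
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Toy chain: chrA [0,20) -> chrB [10,30), then chrA [25,50) -> chrB [40,65).
demo_chain="$(mktemp)"
cat > "$demo_chain" <<'EOF'
chain 1000 chrA 100 + 0 50 chrB 120 + 10 65 1
20 5 10
25

EOF

lift_pos "$demo_chain" chrA 5
lift_pos "$demo_chain" chrA 30
lift_pos "$demo_chain" chrA 22 || echo "chrA:22 falls in a gap and has no lifted position"
```

Positions that fall between blocks have no counterpart in the other reference, which is why a real lift-over tool also has to update fields such as CIGAR rather than only shifting POS.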
LevioSAM2 indexes a chain file for lift-over queries. The resulting index has a .clft extension.
leviosam2 index -c source_to_target.chain -p source_to_target -F target.fai
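Before indexing, it may be worth confirming that the chain and the FASTA index describe the same target reference. The sketch below is one possible sanity check, not part of levioSAM2: it assumes the file passed to `-F` is a standard `.fai` (contig name and length in the first two columns) and that the destination contig name and size are fields 8 and 9 of each chain header, as in UCSC liftOver chains. The function name `check_chain_vs_fai` and the demo files are hypothetical.

```shell
# Hypothetical helper: compare destination contigs in a chain file with a .fai.
# Usage: check_chain_vs_fai <chain file> <target .fai>
# Exits non-zero if a contig is missing from the .fai or its length differs.
# Assumes the .fai is non-empty.
check_chain_vs_fai() {
    awk '
        NR == FNR { len[$1] = $2; next }    # first file: .fai (name, length, ...)
        $1 == "chain" && !($8 in seen) {
            seen[$8] = 1
            if (!($8 in len)) {
                printf "missing in fai: %s\n", $8
                bad = 1
            } else if (len[$8] + 0 != $9 + 0) {
                printf "size mismatch for %s: chain=%s fai=%s\n", $8, $9, len[$8]
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$2" "$1"
}

# Toy inputs: one chain whose destination contig chrB has length 120.
chk_dir="$(mktemp -d)"
printf 'chain 1000 chrA 100 + 0 50 chrB 120 + 10 65 1\n20 5 10\n25\n\n' > "$chk_dir/demo.chain"
printf 'chrB\t120\t6\t60\t61\n' > "$chk_dir/good.fai"
printf 'chrB\t999\t6\t60\t61\n' > "$chk_dir/bad.fai"

check_chain_vs_fai "$chk_dir/demo.chain" "$chk_dir/good.fai" && echo "chain and fai agree"
check_chain_vs_fai "$chk_dir/demo.chain" "$chk_dir/bad.fai" || echo "chain and fai disagree"
```

With real data the call would look like `check_chain_vs_fai source_to_target.chain target.fai`, run before `leviosam2 index`.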
LevioSAM2-lift is the lift-over kernel of the levioSAM2 toolkit.
The indexing command above saves the levioSAM2 ChainMap index to source_to_target.clft. The lift command below saves its output to lifted_from_source.bam.
We highly recommend sorting the input BAM by position before running levioSAM2-lift.
leviosam2 lift -C source_to_target.clft -a aligned_to_source.bam -p lifted_from_source -O bam
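Because position-sorted input is recommended, a small check of the BAM header can help decide whether a sort is needed first. The sketch below is an illustration only: the function name `is_coordinate_sorted` is hypothetical, and it relies on the SAM header convention that a coordinate-sorted file declares `SO:coordinate` on its `@HD` line. The samtools calls appear only in comments, so the block runs without samtools installed.

```shell
# Hypothetical helper: read a SAM header on stdin and succeed only if
# the @HD line declares SO:coordinate.
is_coordinate_sorted() {
    awk -F'\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++) if ($i == "SO:coordinate") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Demonstration with literal header text.
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:100\n' | is_coordinate_sorted \
    && echo "header says coordinate-sorted"
printf '@HD\tVN:1.6\tSO:unsorted\n' | is_coordinate_sorted \
    || echo "header does not say coordinate-sorted"

# With a real BAM (assumes samtools is installed):
#   samtools view -H aligned_to_source.bam | is_coordinate_sorted \
#     || samtools sort -o aligned_to_source.sorted.bam aligned_to_source.bam
```

The header only records what the producing tool declared, so a file without an `SO` tag may still be sorted; sorting again in that case is the conservative choice.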
The levioSAM2 workflow combines lift-over using the leviosam2-lift kernel with a selective re-mapping strategy that accounts for major changes between the source and target references. This approach can improve accuracy over lift-over alone.
Example:
# You may skip the indexing step if you've already run it
leviosam2 index -c source_to_target.chain -p source_to_target -F target.fai
sh leviosam2.sh \
-a bowtie2 -A -10 -q 10 -H 5 \
-i aligned_to_source.bam \
-o aligned_to_source-lifted \
-f target.fna \
-b bt2/target \
-C source_to_target.clft \
-t 16
See this README to learn more about running the full levioSAM2 workflow.
- Nae-Chyun Chen, Luis Paulin, Fritz Sedlazeck, Sergey Koren, Adam Phillippy, Ben Langmead. Improved sequence mapping using a complete reference genome and lift-over. Nat Methods (2023). https://doi.org/10.1038/s41592-023-02069-6
- Taher Mun, Nae-Chyun Chen, Ben Langmead. LevioSAM: Fast lift-over of variant-aware reference alignments. Bioinformatics (2021), btab396. https://doi.org/10.1093/bioinformatics/btab396
Logo credit: Ting-Wei Young