BreakPointAssembly

A tool to quickly assemble SV breakpoints using long reads

The bp_assemble.py script uses samtools, minimap2 and racon to assemble and polish a list of candidate SV breakpoints. Given a TSV list of breakpoint positions, the read FASTQ and a BAM of the read alignments, the script follows five steps:

1. Extract reads at the breakpoint positions
2. Find reads that support and span the breakpoint on both chromosome copies
3. Generate scaffold breakpoint sequences using the longest reads that support each arm
4. Align all reads at breakpoint positions to the scaffolds
5. Polish the scaffold sequence using racon
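
The read-selection steps above (finding spanning reads, then choosing the longest one as a scaffold) can be illustrated with a minimal sketch. This is not the code in bp_assemble.py: the `ReadAlignment` record, the `flank` threshold and both function names are hypothetical, and a single reference position stands in for a breakpoint arm to keep the example small.

```python
from typing import NamedTuple, Optional


class ReadAlignment(NamedTuple):
    """Hypothetical minimal record: one read aligned to one chromosome."""
    name: str
    chrom: str
    ref_start: int    # 0-based, inclusive
    ref_end: int      # 0-based, exclusive
    read_length: int


def spanning_reads(alignments, chrom, position, flank=100):
    """Return alignments covering `position` with at least `flank` bases on each side.

    The flank requirement is an assumption made for illustration only.
    """
    return [
        aln for aln in alignments
        if aln.chrom == chrom
        and aln.ref_start <= position - flank
        and aln.ref_end >= position + flank
    ]


def longest_read(alignments) -> Optional[ReadAlignment]:
    """Pick the longest read as a scaffold candidate, or None if the list is empty."""
    return max(alignments, key=lambda aln: aln.read_length, default=None)


alignments = [
    ReadAlignment("read1", "chr1", 900, 1500, 8000),
    ReadAlignment("read2", "chr1", 990, 1200, 12000),  # too little left flank
    ReadAlignment("read3", "chr1", 500, 2500, 15000),
    ReadAlignment("read4", "chr2", 500, 2500, 20000),  # wrong chromosome
]
support = spanning_reads(alignments, "chr1", 1000, flank=100)
scaffold = longest_read(support)
```

Here `support` keeps only the reads that cross position 1000 on chr1 with enough flanking sequence, and `scaffold` is the longest of those.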

The script assumes that the reads are compressed and indexed with bgzip.
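
Because a plain gzip file is not a substitute for bgzip output, it can help to check the FASTQ before running the script. The sketch below relies on the BGZF block header, which is a gzip header with the extra-field flag set and a `BC` subfield of length 2. The function names are hypothetical and this check is not part of bp_assemble.py; it only inspects the first block and does not verify that an index exists.

```python
import struct


def looks_like_bgzf(header: bytes) -> bool:
    """Return True if `header` starts with a BGZF block header."""
    if len(header) < 12:
        return False
    # gzip magic bytes, deflate compression method, FEXTRA flag (bit 2) set
    if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
        return False
    if not header[3] & 0x04:
        return False
    xlen = struct.unpack_from("<H", header, 10)[0]
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        slen = struct.unpack_from("<H", extra, pos + 2)[0]
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False


def is_bgzf_file(path: str) -> bool:
    """Read enough of the file to cover the fixed header plus the largest extra field."""
    with open(path, "rb") as handle:
        return looks_like_bgzf(handle.read(12 + 0xFFFF))


# First 18 bytes of a BGZF block, and a plain gzip header for comparison
bgzf_header = bytes([0x1F, 0x8B, 8, 4, 0, 0, 0, 0, 0, 0xFF, 6, 0, 66, 67, 2, 0, 0x1B, 0])
plain_gzip_header = bytes([0x1F, 0x8B, 8, 0, 0, 0, 0, 0, 0, 0xFF]) + b"\x00" * 8
```

Calling `is_bgzf_file("reads.fastq.gz")` on a file compressed with ordinary gzip returns False, which signals that it should be recompressed with bgzip first.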

Dependencies

pysam & samtools
mappy & minimap2
bgzip
racon

Installing the dependencies through a conda environment is recommended; installation from source will work as well.
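
Whichever installation route is used, a quick check that everything is visible from the active environment can save a failed run. The following is a small sketch using only the Python standard library. The helper names are hypothetical, and the executable and module names are taken from the dependency list above on the assumption that each tool is installed under its usual name.

```python
import importlib.util
import shutil

REQUIRED_TOOLS = ("samtools", "minimap2", "bgzip", "racon")
REQUIRED_MODULES = ("pysam", "mappy")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the executables that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the Python modules that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_tools():
        print(f"executable not found on PATH: {name}")
    for name in missing_modules():
        print(f"Python module not importable: {name}")
```

If racon lives outside the PATH, it will be reported here as missing; in that case its location can be passed to the script with --racon.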

Setup, Usage

# Setup:
git clone https://github.com/adcosta17/BreakPointAssembly.git
cd BreakPointAssembly

# Usage: 
python bp_assemble.py --sniffles-input <sniffles_translocation_calls.tsv> \
--input-bam <input.bam> \
--input-fastq <input.fastq.gz> \
--output-folder <path/to/output/folder> \
--reference-genome <reference_genome.fa>
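
When the script has to be run over several samples, the same command can be assembled from Python. This is a hedged sketch: `build_command` and `run_sample` are hypothetical helpers, the file names are placeholders, and only the options shown in the usage line above are passed.

```python
import subprocess
import sys


def build_command(calls_tsv, bam, fastq, output_folder, reference,
                  script="bp_assemble.py"):
    """Build the argument list for one bp_assemble.py run."""
    return [
        sys.executable, script,
        "--sniffles-input", calls_tsv,
        "--input-bam", bam,
        "--input-fastq", fastq,
        "--output-folder", output_folder,
        "--reference-genome", reference,
    ]


def run_sample(*args, **kwargs):
    """Run one sample and raise CalledProcessError if the script exits non-zero."""
    subprocess.run(build_command(*args, **kwargs), check=True)


# Placeholder file names, for illustration only
cmd = build_command("calls.tsv", "sample.bam", "sample.fastq.gz", "out", "ref.fa")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.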

Arguments:

--sniffles-input A TSV of SV calls. Six columns are required: chromosome_A, start, end, chromosome_B, start, end

--input-bam A bam file containing alignments of reads to the reference genome

--input-fastq A fastq of the reads, zipped and indexed by bgzip

--reference-genome A reference genome fasta

--racon [Optional] The path to racon. By default assumes racon is in the PATH

--cleanup [Optional Flag] Cleanup temp files generated by the script

--output-bam [Optional Flag] Request that the assembled breakpoints be aligned to the reference genome and a BAM be written with their alignments.

--small-window [Optional Flag] Gets the sequence within a small window of the breakpoint rather than a full assembly of the region

--bp-window [Optional] The window size for the small window around the breakpoint if --small-window is specified [150 bp]
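
To make the six-column layout expected by --sniffles-input concrete, here is a minimal sketch of reading such a file. It is not the parser used by bp_assemble.py. The `Breakpoint` record and `parse_breakpoints` function are hypothetical, and the sketch assumes there is no header row, that coordinates are integers, and that lines starting with `#` can be skipped.

```python
import csv
import io
from typing import NamedTuple


class Breakpoint(NamedTuple):
    """Hypothetical record for one row of the calls TSV."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int


def parse_breakpoints(handle):
    """Yield one Breakpoint per data row, skipping blank and '#' lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 6:
            raise ValueError(f"expected 6 columns, found {len(row)}: {row}")
        yield Breakpoint(row[0], int(row[1]), int(row[2]),
                         row[3], int(row[4]), int(row[5]))


# Made-up example row, for illustration only
example = "chr1\t1000\t1050\tchr5\t20000\t20050\n"
breakpoints = list(parse_breakpoints(io.StringIO(example)))
```

Rows with fewer than six columns raise a ValueError rather than being silently dropped, so a malformed calls file is caught early.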
