adcosta17/BreakPointAssembly
BreakPointAssembly

A tool to quickly assemble SV breakpoints using long reads

The bp_assemble.py script uses samtools, minimap2 and racon to assemble and polish a list of candidate SV breakpoints. Given a TSV list of breakpoint positions, the read FASTQ and a BAM of read alignments, the script follows five steps:

  1. Extract reads at the breakpoint positions
  2. Find reads that support and span the breakpoint on both chromosome copies
  3. Generate scaffold breakpoint sequences using the longest reads that support each arm
  4. Align all reads at breakpoint positions to the scaffolds
  5. Polish the scaffold sequence using racon

The script assumes that the read FASTQ is compressed and indexed with bgzip.
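
Step 3 above picks the longest supporting read for each arm as the scaffold. The selection can be sketched as follows. This is a minimal illustration and not the code in bp_assemble.py: the `SupportingRead` type, the arm labels "A" and "B", and the function name are all hypothetical.

```python
from typing import Dict, Iterable, NamedTuple


class SupportingRead(NamedTuple):
    """Hypothetical record for a read that supports one arm of a breakpoint."""
    name: str
    arm: str       # assumed label for the breakpoint arm, e.g. "A" or "B"
    sequence: str


def longest_read_per_arm(reads: Iterable[SupportingRead]) -> Dict[str, SupportingRead]:
    """Return the longest supporting read for each arm.

    Ties keep the read seen first.
    """
    best: Dict[str, SupportingRead] = {}
    for read in reads:
        current = best.get(read.arm)
        if current is None or len(read.sequence) > len(current.sequence):
            best[read.arm] = read
    return best


# Toy input: two reads on arm A, one on arm B.
example_reads = [
    SupportingRead("read1", "A", "ACGT"),
    SupportingRead("read2", "A", "ACGTACGT"),
    SupportingRead("read3", "B", "TTGCA"),
]
scaffolds = longest_read_per_arm(example_reads)
```

In this toy input, `scaffolds["A"]` is `read2` and `scaffolds["B"]` is `read3`.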

Dependencies

  • pysam & samtools
  • mappy & minimap2
  • bgzip
  • racon

Installing the dependencies through a conda environment is recommended, but installing them from source works as well.

Setup, Usage

# Setup:
git clone https://github.com/adcosta17/BreakPointAssembly.git
cd BreakPointAssembly

# Usage: 
python bp_assemble.py --sniffles-input <sniffles_translocation_calls.tsv> \
--input-bam <input.bam> \
--input-fastq <input.fastq.gz> \
--output-folder <path/to/output/folder> \
--reference-genome <reference_genome.fa>

Arguments:

--sniffles-input A TSV of SV calls. Six columns are required: chromosome_A, start, end, chromosome_B, start, end
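
To make the six-column layout concrete, here is a minimal sketch of reading such a file with the standard library. It makes several assumptions that the README does not state: the file has no header row, lines starting with `#` are comments, and any columns after the sixth are ignored. The `BreakpointCall` type and `read_calls` function are hypothetical names, not part of bp_assemble.py.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class BreakpointCall(NamedTuple):
    """Hypothetical container for one row of the six-column TSV."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int


def read_calls(handle: TextIO) -> Iterator[BreakpointCall]:
    """Yield one BreakpointCall per data row of a tab-separated file."""
    for row in csv.reader(handle, delimiter="\t"):
        # Assumption: blank lines and '#' comment lines carry no calls.
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 6:
            raise ValueError(f"expected at least 6 columns, got {len(row)}: {row}")
        yield BreakpointCall(
            row[0], int(row[1]), int(row[2]),
            row[3], int(row[4]), int(row[5]),
        )


# Usage with an in-memory example instead of a real file.
example_tsv = "chr1\t1000\t1050\tchr7\t52000\t52040\n"
calls = list(read_calls(io.StringIO(example_tsv)))
```

For a file on disk, `read_calls(open(path, newline=""))` would be used the same way.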

--input-bam A bam file containing alignments of reads to the reference genome

--input-fastq A FASTQ of the reads, compressed and indexed with bgzip

--output-folder The path to the folder where output is written

--reference-genome A reference genome fasta

--racon [Optional] The path to racon. By default, racon is assumed to be in the PATH

--cleanup [Optional Flag] Clean up temporary files generated by the script

--output-bam [Optional Flag] Align the assembled breakpoints to the reference genome and write a BAM of their alignments

--small-window [Optional Flag] Get only the sequence within a small window around the breakpoint rather than a full assembly of the region

--bp-window [Optional] The window size, in bp, for the small window around the breakpoint if --small-window is specified [default: 150]
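
As an illustration of how a window size could translate into coordinates, the sketch below computes an interval around a breakpoint position. It assumes the window extends `window` bases on each side of the breakpoint and uses 0-based, half-open coordinates. The README does not say whether bp_assemble.py does it this way, and the function name is hypothetical.

```python
from typing import Optional, Tuple


def breakpoint_window(position: int, window: int = 150,
                      contig_length: Optional[int] = None) -> Tuple[int, int]:
    """Return (start, end) of a window around a breakpoint position.

    Assumption: the window spans `window` bases on each side, and the
    interval is clamped to [0, contig_length] when a length is given.
    """
    start = max(0, position - window)
    end = position + window
    if contig_length is not None:
        end = min(end, contig_length)
    return start, end


# A breakpoint at position 1000 with the default 150 bp window.
start, end = breakpoint_window(1000)
```

Clamping matters for breakpoints near contig ends, where an unclamped window would produce a negative start or an end past the contig.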
