Bivartect

Accurate and memory-saving breakpoint detection by direct read comparison

Last updated: 2021-04-15

We present Bivartect, a genomic structural variant caller that directly compares sequence reads generated by high-throughput sequencing. Bivartect saves memory by keeping only a small part of the suffixes of the input reads in memory. On simulated benchmark data and real genome editing data, Bivartect outperformed state-of-the-art small variant callers in detecting single nucleotide variants with few false positives.

Installation

Bivartect (ver. 1.1.10) is a C++ program distributed as bivartect-1.1.10.tar.gz.

Requirements

A C++ compiler that supports C++11 or later

Install on Linux and macOS

Type the following in your terminal:

$ tar zxf bivartect-1.1.10.tar.gz
$ cd bivartect-1.1.10
$ ./configure

or

$ ./configure CXXFLAGS='-std=c++11 -pthread'

If you would like to install into a local directory instead:

$ ./configure --prefix=/path/to/local_dir

Then,

$ make
$ sudo make install

Usage

For single-end reads:
$ bivartect -3 <normal.fastq> <tumor.fastq> <output.fastq>

For paired-end reads:
$ bivartect -5 <normal_1.fastq> <normal_2.fastq> <tumor_1.fastq> <tumor_2.fastq> <output.fastq>

General options:
 -n     Path to the normal FASTQ (string [required])
 -N     Path to the normal reverse-read FASTQ, for paired-end input (string)
 -m     Path to the mutated FASTQ (string [required])
 -M     Path to the mutated reverse-read FASTQ, for paired-end input (string)
 -o     Path to the output FASTQ (string)
 -a     Output multi-FASTA instead of FASTQ (bool [false])
 -s     Input FASTQ is strand-specific (bool [false])
 -d     Filtering depth (int 10...32 [24])
 -c     Read count cutoff.
        In a breakpoint cluster, 
        IF max(predictedNormalReadCount, predictedMutatedReadCount) < c 
        THEN omit the breakpoint because of low quality. (int 1...100 [6])
 -x     Analysis division rate (int 1,4,16,64...1024 [64])
 -t     Number of threads to use. Set 0 to use the hardware maximum (int 0... [0])
 -r     Path to the output text file containing a detailed overview (string)
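
The -c rule above can be restated as a small Python sketch. This is only an illustration of the documented condition, not Bivartect's actual C++ implementation; the function name and the example read counts are hypothetical.

```python
def passes_read_count_cutoff(normal_count, mutated_count, cutoff=6):
    """Return True if a breakpoint cluster survives the -c filter.

    A cluster is omitted when max(normal_count, mutated_count) < cutoff.
    The default of 6 and the 1...100 range come from the option list.
    """
    if not 1 <= cutoff <= 100:
        raise ValueError("cutoff (-c) must be between 1 and 100")
    return max(normal_count, mutated_count) >= cutoff


# Hypothetical clusters: (predicted normal read count, predicted mutated read count)
clusters = [(2, 3), (6, 1), (0, 12), (5, 5)]
kept = [c for c in clusters if passes_read_count_cutoff(*c, cutoff=6)]
print(kept)
```

Note that a cluster is kept as soon as either count reaches the cutoff, so lowering -c (as in the `-c 4` example) retains clusters with weaker read support.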

Alias options:
 -2     = -n -m
 -3     = -n -m -o
 -4     = -n -N -m -M
 -5     = -n -N -m -M -o
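
To make the alias table concrete, here is a minimal Python sketch that expands an alias into the explicit options it stands for. It assumes the positional paths are assigned to the flags in the order listed, which matches the usage lines above; the helper name `expand_alias` is hypothetical and is not part of Bivartect.

```python
# Alias table copied from the option list above.
ALIASES = {
    "-2": ["-n", "-m"],
    "-3": ["-n", "-m", "-o"],
    "-4": ["-n", "-N", "-m", "-M"],
    "-5": ["-n", "-N", "-m", "-M", "-o"],
}


def expand_alias(alias, paths):
    """Pair each flag behind an alias with the path in the same position."""
    flags = ALIASES[alias]
    if len(paths) != len(flags):
        raise ValueError(
            "%s expects %d paths, got %d" % (alias, len(flags), len(paths))
        )
    args = []
    for flag, path in zip(flags, paths):
        args.extend([flag, path])
    return args


print(expand_alias("-3", ["normal.fastq", "tumor.fastq", "output.fastq"]))
```

Under that assumption, `bivartect -3 normal.fastq tumor.fastq output.fastq` is equivalent to `bivartect -n normal.fastq -m tumor.fastq -o output.fastq`.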

Examples:
$ bivartect -x 16 -d 30 -c 6 -n <normal.fastq> -m <tumor.fastq> -o <output.fastq>
$ bivartect -3 <normal.fastq> <tumor.fastq> <output.fastq> -c 4
$ bivartect -5 <normal_1.fastq> <normal_2.fastq> <tumor_1.fastq> <tumor_2.fastq> <output.fastq>
$ bivartect -2 <normal.fastq> <tumor.fastq> -r <output.txt>
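
When Bivartect is called from a script, it can help to check option values against the documented ranges before launching a long run. The sketch below builds the argument list for the first example; the function name is hypothetical, and the set of allowed -x values assumes that "1,4,16,64...1024" means the powers of four up to 1024.

```python
# Assumption: "1,4,16,64...1024" is read as the powers of four up to 1024.
ALLOWED_DIVISION_RATES = {1, 4, 16, 64, 256, 1024}


def build_bivartect_command(normal, tumor, output,
                            depth=24, cutoff=6, division=64, threads=0):
    """Return an argv list for a single-end run, validating documented ranges."""
    if not 10 <= depth <= 32:
        raise ValueError("-d must be between 10 and 32")
    if not 1 <= cutoff <= 100:
        raise ValueError("-c must be between 1 and 100")
    if division not in ALLOWED_DIVISION_RATES:
        raise ValueError("-x must be one of %s" % sorted(ALLOWED_DIVISION_RATES))
    if threads < 0:
        raise ValueError("-t must be 0 or greater")
    return [
        "bivartect",
        "-x", str(division), "-d", str(depth), "-c", str(cutoff),
        "-t", str(threads),
        "-n", normal, "-m", tumor, "-o", output,
    ]


command = build_bivartect_command(
    "normal.fastq", "tumor.fastq", "output.fastq",
    depth=30, cutoff=6, division=16,
)
print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` once Bivartect is installed and on the PATH.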

Pipeline

The standard use of Bivartect is illustrated with the following steps:

Step 1: run Bivartect to get consensus normal FASTQ reads whose mutated counterparts are predicted to have breakpoints

$ bivartect -5 <normal_1.fastq> <normal_2.fastq> <tumor_1.fastq> <tumor_2.fastq> <out.fastq>

Step 2: map FASTQ reads onto a reference genome with BWA-backtrack

$ bwa aln <index_prefix> <out.fastq> > <out.sai>
$ bwa samse -f <out.sam> <index_prefix> <out.sai> <out.fastq>

Step 3: convert SAM alignments into predicted VCF variants with their genomic locations

$ ./sam2vcf.py <out.sam> <reference.fa.gz> > <out.vcf>
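
The three steps can also be chained from Python. The following is a minimal sketch that reuses exactly the commands shown above; the function names and the `out_prefix` parameter are hypothetical, and the script assumes that `bivartect` and `bwa` are on the PATH and that `sam2vcf.py` is executable in the current directory.

```python
import subprocess


def pipeline_steps(normal_1, normal_2, tumor_1, tumor_2,
                   index_prefix, reference, out_prefix="out"):
    """Return (argv, stdout_path) pairs; stdout_path is None when no redirect is needed."""
    fastq = out_prefix + ".fastq"
    sai = out_prefix + ".sai"
    sam = out_prefix + ".sam"
    vcf = out_prefix + ".vcf"
    return [
        # Step 1: consensus normal reads with predicted breakpoints
        (["bivartect", "-5", normal_1, normal_2, tumor_1, tumor_2, fastq], None),
        # Step 2: BWA-backtrack mapping (bwa aln writes to stdout)
        (["bwa", "aln", index_prefix, fastq], sai),
        (["bwa", "samse", "-f", sam, index_prefix, sai, fastq], None),
        # Step 3: SAM to VCF (sam2vcf.py writes to stdout)
        (["./sam2vcf.py", sam, reference], vcf),
    ]


def run_pipeline(steps):
    """Run each step in order and stop at the first failing command."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "wb") as handle:
                subprocess.run(argv, check=True, stdout=handle)


steps = pipeline_steps("normal_1.fastq", "normal_2.fastq",
                       "tumor_1.fastq", "tumor_2.fastq",
                       "index_prefix", "reference.fa.gz")
for argv, stdout_path in steps:
    print(" ".join(argv), ">" + stdout_path if stdout_path else "")
```

Calling `run_pipeline(steps)` executes the commands. Because `check=True` raises on a non-zero exit status, a failed mapping step does not silently produce an empty VCF.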

Data

Simulated benchmark FASTQ data used in this work are available HERE.

Reference

Keisuke Shimmura, Yuki Kato and Yukio Kawahara, Bivartect: accurate and memory-saving breakpoint detection by direct read comparison, Bioinformatics, vol. 36, issue 9, pp. 2725–2730, 2020.

If you have any questions, please contact Yuki Kato, Graduate School of Medicine, Osaka University, Japan.
