
ykat0/bivartect

Bivartect

Accurate and memory-saving breakpoint detection by direct read comparison

Last updated: 2021-04-15

We present Bivartect, a genomic structural variant caller that directly compares sequence reads generated by high-throughput sequencing. Bivartect saves memory by keeping only a small part of the suffixes of the input reads in memory. On simulated benchmark data and real genome-editing data, Bivartect outperformed state-of-the-art small variant callers in detecting single nucleotide variants with few false positives.

Installation

  • Bivartect (ver. 1.1.10) (bivartect-1.1.10.tar.gz), written in C++

Requirements

  • A C++ compiler that supports C++11 or later

Install on Linux and macOS

Type the following in your terminal:

$ tar zxf bivartect-1.1.10.tar.gz
$ cd bivartect-1.1.10
$ ./configure

or, to pass the compiler flags explicitly:

$ ./configure CXXFLAGS='-std=c++11 -pthread'

If you would like to install into a local directory instead of the system default:

$ ./configure --prefix=/path/to/local_dir

Then build and install (sudo is not needed when installing into a local directory you own):

$ make
$ sudo make install

Usage

For single-end reads:
$ bivartect -3 <normal.fastq> <tumor.fastq> <output.fastq>

For paired-end reads:
$ bivartect -5 <normal_1.fastq> <normal_2.fastq> <tumor_1.fastq> <tumor_2.fastq> <output.fastq>

General options:
 -n     Path to the normal FASTQ (string [necessary])
 -N     Path to the normal reversed FASTQ (string)
 -m     Path to the mutated FASTQ (string [necessary])
 -M     Path to the mutated reversed FASTQ (string)
 -o     Path to the output FASTQ (string)
 -a     Output multi-FASTA instead of FASTQ (bool [false])
 -s     Input FASTQ is strand-specific (bool [false])
 -d     Filtering depth (int 10...32 [24])
 -c     Read count cutoff.
        In a breakpoint cluster, 
        IF max(predictedNormalReadCount, predictedMutatedReadCount) < c 
        THEN omit the breakpoint because of low quality. (int 1...100 [6])
 -x     Analysis division rate (int 1,4,16,64...1024 [64])
 -t     Number of threads to use. Set 0 to use the hardware maximum (int 0... [0])
 -r     Path to the output detail overview text file (string)

Alias options:
 -2     = -n -m
 -3     = -n -m -o
 -4     = -n -N -m -M
 -5     = -n -N -m -M -o

Examples:
$ bivartect -x 16 -d 30 -c 6 -n <normal.fastq> -m <tumor.fastq> -o <output.fastq>
$ bivartect -3 <normal.fastq> <tumor.fastq> <output.fastq> -c 4
$ bivartect -5 <normal_1.fastq> <normal_2.fastq> <tumor_1.fastq> <tumor_2.fastq> <output.fastq>
$ bivartect -2 <normal.fastq> <tumor.fastq> -r <output.txt>
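
The read count cutoff (-c) described in the options above can be read as a simple predicate on each breakpoint cluster. The following Python sketch is only an illustration of that rule as stated in the help text; the function name and arguments are hypothetical and are not part of Bivartect itself.

```python
def keep_breakpoint(predicted_normal_read_count: int,
                    predicted_mutated_read_count: int,
                    cutoff: int = 6) -> bool:
    """Return True if a breakpoint cluster passes the -c read count cutoff.

    Hypothetical helper mirroring the rule in the help text: a breakpoint
    is omitted when max(normal count, mutated count) < cutoff.
    """
    # The help text gives the valid range for -c as 1...100 (default 6).
    if not 1 <= cutoff <= 100:
        raise ValueError("cutoff must be between 1 and 100")
    return max(predicted_normal_read_count,
               predicted_mutated_read_count) >= cutoff


if __name__ == "__main__":
    # With the default cutoff of 6, a cluster supported by at most 5 reads
    # on either side is dropped, while lowering the cutoff to 4 keeps it.
    print(keep_breakpoint(3, 5))            # dropped with -c 6
    print(keep_breakpoint(3, 5, cutoff=4))  # kept with -c 4
```

Under this reading, lowering -c (as in the `-c 4` example above) retains breakpoints with weaker read support, at the likely cost of more low-quality calls.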

Pipeline

The standard use of Bivartect is illustrated with the following steps:

Step 1: run Bivartect to get consensus normal FASTQ reads whose mutated counterparts are predicted to have breakpoints

$ bivartect -5 <normal_1.fastq> <normal_2.fastq> <tumor_1.fastq> <tumor_2.fastq> <out.fastq>

Step 2: map FASTQ reads onto a reference genome with BWA-backtrack

$ bwa aln <index_prefix> <out.fastq> > <out.sai>
$ bwa samse -f <out.sam> <index_prefix> <out.sai> <out.fastq>

Step 3: convert SAM alignments into predicted VCF variants with their genomic locations

$ ./sam2vcf.py <out.sam> <reference.fa.gz> > <out.vcf> 
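
The three steps above can be chained in a small driver script. The sketch below is a minimal example, assuming that `bivartect` and `bwa` are on your PATH and that `sam2vcf.py` is in the current directory; the function names and the `out_prefix` naming scheme are illustrative assumptions, not part of the Bivartect distribution.

```python
import subprocess
from typing import List, Optional, Tuple

# Each command is (argument list, file to redirect stdout into, or None).
Command = Tuple[List[str], Optional[str]]


def build_pipeline_commands(normal_1: str, normal_2: str,
                            tumor_1: str, tumor_2: str,
                            index_prefix: str, reference: str,
                            out_prefix: str) -> List[Command]:
    """Assemble the paired-end pipeline commands shown in Steps 1 to 3."""
    out_fastq = out_prefix + ".fastq"
    out_sai = out_prefix + ".sai"
    out_sam = out_prefix + ".sam"
    out_vcf = out_prefix + ".vcf"
    return [
        # Step 1: detect breakpoints and write consensus normal reads.
        (["bivartect", "-5", normal_1, normal_2, tumor_1, tumor_2, out_fastq], None),
        # Step 2: map the reads with BWA-backtrack (aln, then samse).
        (["bwa", "aln", index_prefix, out_fastq], out_sai),
        (["bwa", "samse", "-f", out_sam, index_prefix, out_sai, out_fastq], None),
        # Step 3: convert the SAM alignments into VCF records.
        (["./sam2vcf.py", out_sam, reference], out_vcf),
    ]


def run_pipeline(commands: List[Command]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv, stdout_path in commands:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            # Equivalent to the shell redirection "command > file".
            with open(stdout_path, "wb") as handle:
                subprocess.run(argv, check=True, stdout=handle)


if __name__ == "__main__":
    # Print the commands for inspection; call run_pipeline() to execute them.
    for argv, stdout_path in build_pipeline_commands(
            "normal_1.fastq", "normal_2.fastq",
            "tumor_1.fastq", "tumor_2.fastq",
            "index_prefix", "reference.fa.gz", "out"):
        suffix = " > " + stdout_path if stdout_path else ""
        print(" ".join(argv) + suffix)
```

Building the commands as argument lists, rather than as one shell string, avoids quoting problems when file paths contain spaces.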

Data

  • Simulated benchmark FASTQ data used in this work are available HERE.

Reference

Keisuke Shimmura, Yuki Kato and Yukio Kawahara, Bivartect: accurate and memory-saving breakpoint detection by direct read comparison, Bioinformatics, vol. 36, issue 9, pp. 2725–2730, 2020.


If you have any questions, please contact Yuki Kato
Graduate School of Medicine, Osaka University, Japan
