This repository contains scripts that can be used to reproduce the analysis reported in the paper describing the MAGERI software. Raw sequencing datasets can be downloaded from SRA under accession PRJNA297719.
The following sample sets were used in the original manuscript:
- p127 - negative control, normal blood DNA (two samples)
- p126-2 and p126-3 - positive control samples, each containing variants described in h4_hd734_variants.vcf at a ~0.1% rate
- p92 - matched blood plasma and tumor samples from two patients
- duplex - data from duplex sequencing protocol as reported by Schmitt MW et al. Nat Met 2015, available in SRA
- hiv - data from UMI-tagged HIV sequencing as reported by Zhou S et al. J Virol 2015, available in SRA
The process/ folder contains instructions for data download and pre-processing. Other folders in the repository root contain MAGERI results for the analyzed datasets.
To reproduce the figures reported in the present paper, run:
groovy GenerateTable.groovy
Rscript main_text.R
Rscript suppl.R
Both R and Groovy must be installed to run these scripts.
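The three commands above can be wrapped in a small shell helper that checks for the required tools before running anything. This is only a minimal sketch and is not part of the repository: the function names missing_tools and reproduce_figures are hypothetical, and the sketch assumes it is invoked from the directory that contains the three scripts.

```shell
#!/bin/sh

# Print the name of every listed tool that cannot be found on PATH.
# (Hypothetical helper, not shipped with the repository.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Run the three steps in the documented order, stopping at the first failure.
# Assumes the current directory holds GenerateTable.groovy, main_text.R and suppl.R.
reproduce_figures() {
    missing=$(missing_tools groovy Rscript)
    if [ -n "$missing" ]; then
        echo "Missing required tools: $missing" >&2
        return 1
    fi
    groovy GenerateTable.groovy &&
        Rscript main_text.R &&
        Rscript suppl.R
}
```

After sourcing the file, calling reproduce_figures runs the steps in sequence. Chaining the commands with && means the R scripts are not started if an earlier step exits with an error.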
Please cite Shugay et al. PLoS Comp Biol 2017 if you use our datasets.