bigdatagenomics/avocado

avocado

A Variant Caller, Distributed

This README represents the TL;DR docs for avocado. More detailed documentation is hosted at Read the Docs.

Who/What/When/Where/Why avocado?

Avocado is a distributed variant caller built on top of the ADAM format and APIs and Apache Spark. Avocado is an open source project and is released under the Apache 2.0 license.

Avocado can be used for single-sample germline variant calling, trio calling, and joint variant calling. When paired with ADAM's INDEL realignment pipeline, Avocado has >99% SNP calling accuracy and >96% INDEL calling accuracy. On a single 32-core machine, Avocado can call variants on a 60x coverage whole genome sequencing (WGS) dataset in approximately 7 hours. By using Apache Spark to scale across multiple machines, Avocado can process the same WGS dataset in approximately 15 minutes on 1,024 cores.
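To put the two runtime figures above in perspective, here is a rough back-of-the-envelope calculation of the implied speedup and scaling efficiency. It uses only the approximate numbers quoted in this README (7 hours on 32 cores, 15 minutes on 1,024 cores), so treat the results as estimates rather than benchmark data. The variable names are illustrative only.

```python
# Approximate figures quoted in the README above; not measured here.
single_node_cores = 32
single_node_hours = 7.0
cluster_cores = 1024
cluster_hours = 15 / 60  # 15 minutes expressed in hours

# Speedup: how many times faster the cluster run finishes.
speedup = single_node_hours / cluster_hours

# Core ratio: how many times more cores the cluster run uses.
core_ratio = cluster_cores / single_node_cores

# Scaling efficiency: fraction of ideal (linear) speedup achieved.
efficiency = speedup / core_ratio

# Total compute consumed by each run, in core-hours.
single_node_core_hours = single_node_cores * single_node_hours
cluster_core_hours = cluster_cores * cluster_hours

print(f"speedup: {speedup:.1f}x on {core_ratio:.0f}x the cores")
print(f"scaling efficiency: {efficiency:.1%}")
print(f"core-hours: {single_node_core_hours:.0f} vs {cluster_core_hours:.0f}")
```

Under these approximate figures, the cluster run is about 28x faster on 32x the cores, which is roughly 87.5% of linear scaling, at a cost of about 256 core-hours versus 224 on the single machine.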

How avocado?

Building Avocado

Avocado is built with Maven. To build Avocado, cd into the root of the repository and run "mvn package".

Avocado binaries

Nightly builds of Avocado are available from the OSS Sonatype repository. Additionally, we make a Docker image available from Quay.

License

Avocado is released under the Apache License, Version 2.0.

Citing Avocado

Avocado has been described in a PhD thesis. To cite this thesis, please cite:

@phdthesis{nothaft17,
  title={Scalable Systems and Algorithms for Genomic Variant Analysis},
  author={Nothaft, Frank Austin},
  school = {EECS Department, University of California, Berkeley},
  year = {2017},
  month = {Dec},
  URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-204.html},
  number = {UCB/EECS-2017-204}
}

A preprint describing Avocado should be released by the end of January 2018.