This repository contains a list of publications pertinent to the Genome in a Bottle (GIAB) project.
- Curated variation benchmarks for challenging medically relevant autosomal genes, Nature Biotechnology, 07 February 2022.
- Chin CS, Wagner J, Zeng Q, Garrison E, Garg S, Fungtammasan A, Rautiainen M, Aganezov S, Kirsche M, Zarate S, Schatz MC, Xiao C, Rowell WJ, Markello C, Farek J, Sedlazeck FJ, Bansal V, Yoo B, Miller N, Zhou X, Carroll A, Barrio AM, Salit M, Marschall T, Dilthey AT, Zook JM. A diploid assembly-based benchmark for variants in the major histocompatibility complex. Nat Commun. 2020 Sep 22;11(1):4794. doi: 10.1038/s41467-020-18564-9.
- Zook JM, Hansen NF, Olson ND, Chapman L, Mullikin JC, Xiao C, Sherry S, Koren S, Phillippy AM, Boutros PC, Sahraeian SME, Huang V, Rouette A, Alexander N, Mason CE, Hajirasouliha I, Ricketts C, Lee J, Tearle R, Fiddes IT, Barrio AM, Wala J, Carroll A, Ghaffari N, Rodriguez OL, Bashir A, Jackman S, Farrell JJ, Wenger AM, Alkan C, Soylev A, Schatz MC, Garg S, Church G, Marschall T, Chen K, Fan X, English AC, Rosenfeld JA, Zhou W, Mills RE, Sage JM, Davis JR, Kaiser MD, Oliver JS, Catalano AP, Chaisson MJP, Spies N, Sedlazeck FJ, Salit M. A robust benchmark for detection of germline large insertions and deletions. Nat. Biotechnol. (2020). doi: 10.1038/s41587-020-0538-8.
- Chapman LM, Spies N, Pai P, Lim CS, Carroll A, Narzisi G, Watson CM, Proukakis C, Clarke WE, Nariai N, Dawson E, Jones G, Blankenberg D, Brueffer C, Xiao C, Kolora SRR, Alexander N, Wolujewicz P, Ahmed AE, Smith G, Shehreen S, Wenger AM, Salit M, Zook JM. A crowdsourced set of curated structural variants for the human genome. PLoS Comput Biol. 2020 Jun 19;16(6):e1007933. doi: 10.1371/journal.pcbi.1007933. eCollection 2020 Jun. PMID: 32559231
- Justin M. Zook, Nancy F. Hansen, Nathan D. Olson, Lesley M. Chapman, James C. Mullikin, Chunlin Xiao, Stephen Sherry, Sergey Koren, Adam M. Phillippy, Paul C. Boutros, Sayed Mohammad E. Sahraeian, Vincent Huang, Alexandre Rouette, Noah Alexander, Christopher E. Mason, Iman Hajirasouliha, Camir Ricketts, Joyce Lee, Rick Tearle, Ian T. Fiddes, Alvaro Martinez Barrio, Jeremiah Wala, Andrew Carroll, Noushin Ghaffari, Oscar L. Rodriguez, Ali Bashir, Shaun Jackman, John J Farrell, Aaron M Wenger, Can Alkan, Arda Soylev, Michael C. Schatz, Shilpa Garg, George Church, Tobias Marschall, Ken Chen, Xian Fan, Adam C. English, Jeffrey A. Rosenfeld, Weichen Zhou, Ryan E. Mills, Jay M. Sage, Jennifer R. Davis, Michael D. Kaiser, John S. Oliver, Anthony P. Catalano, Mark JP Chaisson, Noah Spies, Fritz J. Sedlazeck, Marc Salit, the Genome in a Bottle Consortium. A robust benchmark for germline structural variant detection. https://www.biorxiv.org/content/10.1101/664623v2
- Ying-Chih Wang, Nathan D. Olson, Gintaras Deikus, Hardik Shah, Aaron M. Wenger, Jonathan Trow, Chunlin Xiao, Stephen Sherry, Marc L. Salit, Justin M. Zook, Melissa Smith & Robert Sebra, 2019, High-coverage, long-read sequencing of Han Chinese trio reference samples, https://www.nature.com/articles/s41597-019-0098-2
- Justin M. Zook, Jennifer McDaniel, Nathan D. Olson, Justin Wagner, Hemang Parikh, Haynes Heaton, Sean A. Irvine, Len Trigg, Rebecca Truty, Cory Y. McLean, Francisco M. De La Vega, Chunlin Xiao, Stephen Sherry & Marc Salit, 2019, An open resource for accurately benchmarking small variant and reference calls, https://www.nature.com/articles/s41587-019-0074-6
Publication describing the improved, reproducible methods used to form the widely used GIAB high-confidence small variant and reference calls (v3.3.2, released in Feb 2017) for 5 genomes, including 4 broadly consented genomes from the Personal Genome Project. These methods cover ~90% of GRCh37 and GRCh38, with 17% more SNPs and 176% more indels than our methods published in 2014:
- Justin Zook, Jennifer McDaniel, Hemang Parikh, Haynes Heaton, Sean A Irvine, Len Trigg, Rebecca Truty, Cory Y McLean, Francisco M De La Vega, Chunlin Xiao, Stephen Sherry, Marc Salit, Genome in a Bottle Consortium, Reproducible integration of multiple sequencing datasets to form high-confidence SNP, indel, and reference calls for five human genome reference materials. doi: https://doi.org/10.1101/281006, https://www.biorxiv.org/content/early/2018/05/25/281006
Publication describing best practices for benchmarking germline small variants using high-confidence calls and regions such as those described in the manuscript above. The methods developed and standardized by the GA4GH Benchmarking Team enable sophisticated comparison of variant call files, output of standardized performance metrics, and stratification of performance by variant type and genome context:
- Peter Krusche, Len Trigg, Paul C Boutros, Christopher E Mason, Francisco M De La Vega, Benjamin L Moore, Mar Gonzalez-Porta, Michael A Eberle, Zivana Tezak, Samir Lababidi, Rebecca Truty, George Asimenos, Birgit Funke, Mark Fleharty, Brad A Chapman, Marc Salit, Justin M Zook, Global Alliance for Genomics and Health Benchmarking Team, Best Practices for Benchmarking Germline Small Variant Calls in Human Genomes. https://doi.org/10.1101/270157, https://www.biorxiv.org/content/early/2018/05/24/270157
Publication describing data collected by GIAB for NA12878, the Ashkenazim trio, and the Chinese trio:
- Justin M Zook, David Catoe, Jennifer McDaniel, Lindsay Vang, Noah Spies, Arend Sidow, Ziming Weng, Yuling Liu, Chris Mason, Noah Alexander, Dhruva Chandramohan, Elizabeth Henaff, Feng Chen, Erich Jaeger, Ali Moshrefi, Khoa Pham, William Stedman, Tiffany Liang, Michael Saghbini, Zeljko Dzakula, Alex Hastie, Han Cao, Gintaras Deikus, Eric Schadt, Robert Sebra, Ali Bashir, Rebecca M Truty, Christopher C Chang, Natali Gulbahce, Keyan Zhao, Srinka Ghosh, Fiona Hyland, Yutao Fu, Mark Chaisson, Jonathan Trow, Chunlin Xiao, Stephen T Sherry, Alexander W Zaranek, Madeleine Ball, Jason Bobe, Preston Estep, George M Church, Patrick Marks, Sofia Kyriazopoulou-Panagiotopoulou, Grace Zheng, Michael Schnall-Levin, Heather S Ordonez, Patrice A Mudivarti, Kristina Giorda, Marc Salit, Genome in a Bottle Consortium, Extensive sequencing of seven human genomes to characterize benchmark reference materials, Scientific Data 3, Article number: 160025 (2016) doi:10.1038/sdata.2016.25. http://www.nature.com/articles/sdata201625
- Justin M Zook, David Catoe, Jennifer McDaniel, Lindsay Vang, Noah Spies, Arend Sidow, Ziming Weng, Yuling Liu, Chris Mason, Noah Alexander, Dhruva Chandramohan, Elizabeth Henaff, Feng Chen, Erich Jaeger, Ali Moshrefi, Khoa Pham, William Stedman, Tiffany Liang, Michael Saghbini, Zeljko Dzakula, Alex Hastie, Han Cao, Gintaras Deikus, Eric Schadt, Robert Sebra, Ali Bashir, Rebecca M Truty, Christopher C Chang, Natali Gulbahce, Keyan Zhao, Srinka Ghosh, Fiona Hyland, Yutao Fu, Mark Chaisson, Jonathan Trow, Chunlin Xiao, Stephen T Sherry, Alexander W Zaranek, Madeleine Ball, Jason Bobe, Preston Estep, George M Church, Patrick Marks, Sofia Kyriazopoulou-Panagiotopoulou, Grace Zheng, Michael Schnall-Levin, Heather S Ordonez, Patrice A Mudivarti, Kristina Giorda, Marc Salit, Genome in a Bottle Consortium, Extensive sequencing of seven human genomes to characterize benchmark reference materials, http://biorxiv.org/content/early/2015/09/15/026468.
Publication describing the methods used by NIST and GIAB to form v2.18 of the high-confidence SNP, indel, and homozygous reference calls for NA12878:
- J.M. Zook, B. Chapman, J. Wang, D. Mittelman, O. Hofmann, W. Hide and M. Salit. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls, Nature Biotechnology Published online Feb. 16, 2014. doi:10.1038/nbt.2835. PMID: 24531798