dib-lab · SensibleSalmon · Aug 4, 2015 · Aug 7, 2015 · Aug 7, 2015 · Aug 7, 2015
diff --git a/CITATION b/CITATION
@@ -1,13 +1,14 @@
 .. vim: set filetype=rst
 
 .. If you update this file then you may need to update the citations in
-   scripts/galaxy/macro.xml and khmer/khmer_args.py as well
+   khmer/khmer_args.py as well
 
+*********
 Citations
----------
+*********
 
 Software Citation
-^^^^^^^^^^^^^^^^^
+=================
 
 If you use the khmer software, you must cite:
 
@@ -38,10 +39,11 @@ To see a quick summary of papers for a given script just run it without using
 any command line arguments.
 
 Graph partitioning and/or compressible graph representation
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+===========================================================
 
-The load-graph.py, partition-graph.py, find-knots.py, load-graph.py,
-and partition-graph.py scripts are part of the compressible graph
+The :program:`load-graph.py`,
+:program:`partition-graph.py`, and :program:`find-knots.py`
+scripts are part of the compressible graph
 representation and partitioning algorithms described in:
 
    Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT.
@@ -84,10 +86,10 @@ representation and partitioning algorithms described in:
   }
 
 Digital normalization
-^^^^^^^^^^^^^^^^^^^^^
+=====================
 
-The normalize-by-median.py and count-median.py scripts are part of
-the digital normalization algorithm, described in:
+The :program:`normalize-by-median.py` and :program:`count-median.py` scripts
+are part of the digital normalization algorithm, described in:
 
    A Reference-Free Algorithm for Computational Normalization of
    Shotgun Sequencing Data
@@ -108,9 +110,9 @@ the digital normalization algorithm, described in:
   }
 
 Efficient k-mer error trimming
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+==============================
 
-The script trim-low-abund.py is described in:
+The script :program:`trim-low-abund.py` is described in:
 
    Crossing the streams: a framework for streaming analysis of short DNA
    sequencing reads
@@ -121,17 +123,19 @@ The script trim-low-abund.py is described in:
 
   @unpublished{semistream,
       author = "Qingpeng Zhang and Sherine Awad and C. Titus Brown",
-      title = "Crossing the streams: a framework for streaming analysis of short DNA sequencing reads",
+      title = "Crossing the streams: a framework for streaming analysis of
+          short DNA sequencing reads",
       year = "2015",
       eprint = "PeerJ Preprints 3:e1100",
       url = "https://dx.doi.org/10.7287/peerj.preprints.890v1"
   }
 
 K-mer counting
-^^^^^^^^^^^^^^
+==============
 
-The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts
-implement the probabilistic k-mer counting described in:
+The :program:`abundance-dist.py`, :program:`filter-abund.py`, and
+:program:`load-into-counting.py` scripts implement the probabilistic k-mer
+counting described in:
 
    These Are Not the K-mers You Are Looking For: Efficient Online K-mer
    Counting Using a Probabilistic Data Structure
@@ -179,7 +183,7 @@ implement the probabilistic k-mer counting described in:
   }
 
 FASTA and FASTQ reading
-^^^^^^^^^^^^^^^^^^^^^^^
+=======================
 
 Several scripts use the SeqAn library for FASTQ and FASTA reading as described
 in:

diff --git a/Makefile b/Makefile
@@ -4,7 +4,7 @@
 #  and documentation
 # make coverage-report to check coverage of the python scripts by the tests
 
-CPPSOURCES=$(wildcard lib/*.cc lib/*.hh khmer/_khmer.cc)
+CPPSOURCES=$(wildcard lib/*.cc lib/*.hh khmer/_khmer.cc) setup.py
 PYSOURCES=$(wildcard khmer/*.py scripts/*.py)
 SOURCES=$(PYSOURCES) $(CPPSOURCES) setup.py
 DEVPKGS=pep8==1.5.7 diff_cover autopep8 pylint coverage gcovr nose pep257 \
@@ -80,22 +80,26 @@ clean: FORCE
 	rm -f diff-cover.html
 
 debug: FORCE
-	export CFLAGS="-pg -fprofile-arcs"; python setup.py build_ext --debug \
+	export CFLAGS="-pg -fprofile-arcs -D_GLIBCXX_DEBUG_PEDANTIC \
+		-D_GLIBCXX_DEBUG"; python setup.py build_ext --debug \
 		--inplace
 
 ## doc         : render documentation in HTML
 doc: build/sphinx/html/index.html
 
-build/sphinx/html/index.html: $(SOURCES) $(wildcard doc/*.txt) doc/conf.py all
+build/sphinx/html/index.html: $(SOURCES) $(wildcard doc/*.rst) doc/conf.py all
 	./setup.py build_sphinx --fresh-env
 	@echo ''
 	@echo '--> docs in build/sphinx/html <--'
 	@echo ''
 
 ## pdf         : render documentation as a PDF file
+# packages needed include: texlive-latex-recommended,
+# texlive-fonts-recommended, texlive-latex-extra
 pdf: build/sphinx/latex/khmer.pdf
 
-build/sphinx/latex/khmer.pdf: $(SOURCES) doc/conf.py $(wildcard doc/*.txt)
+build/sphinx/latex/khmer.pdf: $(SOURCES) doc/conf.py $(wildcard doc/*.rst) \
+	$(wildcard doc/user/*.rst) $(wildcard doc/dev/*.rst)
 	./setup.py build_sphinx --fresh-env --builder latex
 	cd build/sphinx/latex && ${MAKE} all-pdf
 	@echo ''

diff --git a/doc/conf.py b/doc/conf.py
@@ -165,7 +165,7 @@
 #html_additional_pages = {}
 
 # If false, no module index is generated.
-#html_use_modindex = True
+html_use_modindex = False
 
 # If false, no index is generated.
 #html_use_index = True
@@ -227,4 +227,4 @@
 #latex_appendices = []
 
 # If false, no module index is generated.
-#latex_use_modindex = True
+latex_use_modindex = False
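
A note on the two ``conf.py`` settings changed above: Sphinx has marked ``html_use_modindex`` and ``latex_use_modindex`` as deprecated since version 1.0, in favor of ``html_domain_indices`` and ``latex_domain_indices``. The fragment below is a minimal sketch of the equivalent configuration, assuming the documentation is built with Sphinx 1.0 or later (the Sphinx version khmer targets is not stated in this diff, so this is an assumption).

```python
# Sketch for doc/conf.py: newer spellings of the two options set in this diff.
# Assumption: Sphinx >= 1.0, where the *_use_modindex options are deprecated.

# If False, no domain indices (including the Python module index) are
# generated for the HTML output.
html_domain_indices = False

# Same switch for the LaTeX/PDF output.
latex_domain_indices = False
```

Either spelling disables the module index; the newer names simply avoid relying on deprecated options.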
diff --git a/doc/contributors.rst b/doc/contributors.rst
@@ -1,16 +1,17 @@
 .. vim: set filetype=rst
 
-=================================
+*********************************
 Contributors and Acknowledgements
-=================================
+*********************************
 
-khmer is a product of the GED lab at Michigan State University,
+khmer is a product of the Lab for Data Intensive Biology at the University of
+California, Davis (the successor to the GED lab at Michigan State University),
 
-   http://ged.msu.edu/
+   http://ivory.idyll.org/lab/
 
 ---
 
-C. Titus Brown <ctb@msu.edu> wrote the initial ktable and hashtable
+C. Titus Brown <titus@idyll.org> wrote the initial ktable and hashtable
 implementations, as well as hashbits and counting_hash.
 
 Jason Pell implemented many of the C++ k-mer filtering functions.
@@ -28,6 +29,6 @@ Eric McDonald thoroughly revised many aspects of the code base, made
 much of the codebase thread safe, and otherwise improved performance
 dramatically.
 
-Michael R. Crusoe is the new maintainer of khmer.
+Michael R. Crusoe took over maintainership in June 2013.
 
-MRC 2014-05-07
+Last updated by MRC on 2015-07-31
diff --git a/doc/dev/coding-guidelines-and-review.rst b/doc/dev/coding-guidelines-and-review.rst
@@ -4,7 +4,12 @@ Coding guidelines and code review checklist
 This document is for anyone who want to contribute code to the khmer
 project, and describes our coding standards and code review checklist.
 
-----
+C++ standards
+-------------
+
+Any feature in C++11 is fine to use. Specifically, we support features found in
+GCC 4.8.2. See https://github.com/dib-lab/khmer/issues/598 for an in-depth
+discussion.
 
 Coding standards
 ----------------

diff --git a/doc/dev/getting-started.rst b/doc/dev/getting-started.rst
@@ -153,6 +153,16 @@ One-time Preparation
        sudo brew install cppcheck
 
 
+#. ccache installation:
+
+   Debian and Ubuntu Linux distro users can install ``ccache`` to speed up
+   their compile times::
+
+       sudo apt-get install ccache
+       echo 'export PATH="/usr/lib/ccache:$PATH" # enable ccache' >> ~/.bashrc
+       export PATH="/usr/lib/ccache:$PATH"
+
+
 Building khmer and running the tests
 ------------------------------------
 

diff --git a/doc/index.rst b/doc/index.rst
@@ -1,15 +1,26 @@
-.. khmer documentation master file, created by
-   sphinx-quickstart on Wed Aug  4 10:20:23 2010.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
+.. vim: set filetype=rst
 
+#######################################
 khmer -- k-mer counting & filtering FTW
-=======================================
+#######################################
+
+:Authors: Michael R. Crusoe, ACharbonneau, James A. Stapleton, Sherine
+        Awad, Elmar Bucher, Adam Caldwell, Reed Cartwright, Bede Constantinides,
+        Peter Dave Hello, Kevin D. Murray, Greg Edvenson, Hussien F. Alameldin,
+        Scott Fay, Jacob Fenton, Thomas Fenzl, Jordan Fish, Leonor 
+        Garcia-Gutierrez, Phillip Garland, Jonathan Gluck, Iván González, Sarah 
+        Guermond, Jiarong Guo, Aditi Gupta, Andreas Härpfer, Adina Howe,
+        Alex Hyer, Luiz Irber, Alexander Johan Nederbragt, Rhys Kidd, David Lin,
+        Justin Lippi, Heather L. Wiencko, Tamer Mansour, Pamela McA'Nulty, Eric 
+        McDonald, Jessica Mizzi, Kevin Murray, Kaben Nanlohy, Humberto 
+        Ortiz-Zuazaga, Jeramia Ory, Jason Pell, Charles Pepe-Ranney, Rodney 
+        Picett, Ryan R. Boyce, Joshua R. Herr, Joshua R.
+        Nahum, Erich Schwarz, Camille Scott, Josiah Seaman, Scott Sievert, Jared
+        Simpson, James Spencer, Ramakrishnan Srinivasan, Daniel Standage, Joe 
+        Stein, Susan Steinman, Benjamin Taylor, Will Trimble,
+        Connor T. Skennerton, Michael Wright, Brian Wyss, Qingpeng Zhang, en 
+        zyme, C. Titus Brown
 
-:Authors: Michael R. Crusoe, Greg Edvenson, Jordan Fish, Adina Howe,
-          Luiz Irber, Eric McDonald, Joshua Nahum, Kaben Nanlohy, Humberto
-          Ortiz-Zuazaga, Jason Pell, Jared Simpson, Camille Scott, Ramakrishnan
-          Rajaram Srinivasan, Qingpeng Zhang, and C. Titus Brown
 
 :Contact: khmer-project@idyll.org
 :GitHub: https://github.com/dib-lab/khmer
@@ -18,7 +29,7 @@ khmer -- k-mer counting & filtering FTW
 
 
 khmer is a library and suite of command line tools for working with
-DNA sequence.  It is primarily aimed at short-read sequencing data
+DNA sequences.  It is primarily aimed at short-read sequencing data
 such as that produced by the Illumina platform.  khmer takes a k-mer-centric
 approach to sequence analysis, hence the name.
 
@@ -34,7 +45,8 @@ the following URLs:
 
     * Announcements: http://lists.idyll.org/listinfo/khmer-announce
 
-The archives for the khmer list are available at: http://lists.idyll.org/pipermail/khmer/
+The archives for the khmer mailing list are available at: 
+http://lists.idyll.org/pipermail/khmer/
 
 khmer development has largely been supported by AFRI Competitive Grant
 no.  `2010-65205-20361
@@ -44,8 +56,6 @@ Institute of the National Institutes of Health under Award Number
 `R01HG007513 <http://ged.msu.edu/downloads/2012-bigdata-nsf.pdf>`__ through
 May 2016, both to C. Titus Brown.
 
-Contents:
-
 .. toctree::
    :maxdepth: 1
 

diff --git a/doc/introduction.rst b/doc/introduction.rst
@@ -1,8 +1,8 @@
 .. vim: set filetype=rst
 
-=====================
+*********************
 Introduction to khmer
-=====================
+*********************
 
 Introduction
 ============
@@ -11,7 +11,7 @@ khmer is a library and toolkit for doing k-mer-based dataset analysis and
 transformations.  Our focus in developing it has been on scaling assembly of 
 metagenomes and mRNA.
 
-khmer can be used for a number of transformations, include inexact 
+khmer can be used for a number of transformations, including inexact 
 transformations (abundance filtering and error trimming) and exact 
 transformations (graph-size filtering, to throw away disconnected reads; and 
 partitioning, to split reads into disjoint sets).  Of these, only partitioning 
@@ -34,16 +34,16 @@ will never incorrectly report a k-mer as being absent when it *is* present.
 This one-sided error makes the Bloom filter very useful for certain kinds of 
 operations.
 
-khmer is also independent of K, and currently works for K <= 32.  We will be 
-integrating code for up to K=64 soon.
+khmer is also independent of a specific k-size (K), and currently works for 
+K <= 32.  We will be integrating code for K <= 64 soon.
 
 khmer is implemented in C++ with a Python wrapper, which is what all of the 
 scripts use.
 
-Some important documentation for khmer is provided on the Web sites for 
+Documentation for khmer is provided on the Web sites for 
 `khmer-protocols <http://khmer-protocols.readthedocs.org>`__ and `khmer-recipes 
 <http://khmer-recipes.readthedocs.org>`__. khmer-protocols provides detailed 
-protocols for using khmer to analyze either a transcriptome or a metagenome; 
+protocols for using khmer to analyze either a transcriptome or a metagenome. 
 khmer-recipes provides individual recipes for using khmer in a variety of 
 sequence-oriented tasks such as extracting reads by coverage, estimating a 
 genome or metagenome size from unassembled reads, and error-trimming reads via 
@@ -71,7 +71,7 @@ immediately useful for a few different operations, including:
 
  - optimizing assemblies on various parameters;
 
- - converting FASTA to FASTQ;
+ - converting FASTQ to FASTA;
 
 and a few other random functions.
 
@@ -94,6 +94,8 @@ Copyright and license
 =====================
 
 Portions of khmer are Copyright California Institute of Technology,
-where the exact counting code was first developed; the remainder is
-Copyright Michigan State University.  The code is freely available for
+where the exact counting code was first developed. All other code developed
+through 2014 is copyright Michigan State University. Code developed after
+2014, through 2015, is copyright the University of California, Davis.
+All the code is freely available for
 use and re-use under the BSD License.
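
The introduction text touched by this diff describes the Bloom filter's one-sided error: a k-mer that was inserted is never reported as absent, although an absent k-mer may occasionally be reported as present. The sketch below illustrates that property with a toy filter. It is not khmer's implementation; the class name ``TinyBloomFilter``, the helper ``kmers``, the SHA-256 based hashing, and the table size are all hypothetical choices made for this example.

```python
import hashlib


class TinyBloomFilter:
    """Toy Bloom filter over k-mers (illustrative only, not khmer's code)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the example simple at the cost of memory.
        self.bits = bytearray(num_bits)

    def _positions(self, kmer):
        # Derive num_hashes table positions by salting a SHA-256 digest.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos] = 1

    def __contains__(self, kmer):
        # Present only if every position is set; bits are never cleared,
        # so an inserted k-mer can never be reported as absent.
        return all(self.bits[pos] for pos in self._positions(kmer))


def kmers(sequence, k):
    """Yield every substring of length k from sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


bloom = TinyBloomFilter()
read = "ATGGCGTACGTTAGC"
for kmer in kmers(read, 5):
    bloom.add(kmer)

# Every inserted k-mer is found again: no false negatives.
all_present = all(kmer in bloom for kmer in kmers(read, 5))
```

A lookup for a k-mer that was never inserted usually returns ``False``, but it can return ``True`` when other k-mers happen to have set the same positions; that is the false-positive side of the trade-off.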
diff --git a/doc/user/biblio.rst b/doc/user/biblio.rst
@@ -3,15 +3,28 @@
 An incomplete bibliography of papers using khmer
 ================================================
 
+
+Biological uses outside of the group
+------------------------------------
+
+http://www.ncbi.nlm.nih.gov/sites/myncbi/1ruvipqAmaMkN/collections/48107393/public/
+
+Tools building on khmer concepts
+--------------------------------
+
+http://www.ncbi.nlm.nih.gov/sites/myncbi/1ruvipqAmaMkN/collections/48101567/public/
+
+Papers in collaboration with our group
+--------------------------------------
+
+http://www.ncbi.nlm.nih.gov/sites/myncbi/1ruvipqAmaMkN/collections/48107445/public/
+
 Digital normalization
 ---------------------
 
 Multiple Single-Cell Genomes Provide Insight into Functions of
 Uncultured Deltaproteobacteria in the Human Oral Cavity.  Campbell et
-al., PLoS One, 2013, doi:10.1371/journal.pone.0059361.  [ `paper link <http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0059361>`__ ]
+al., PLoS One, 2013, doi:10.1371/journal.pone.0059361.  [ `paper link
+<http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0059361>`__ ]
+
 
-Insights into archaeal evolution and symbiosis from the genomes of a
-nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool,
-Yellowstone National Park.  Podar et al., Biology Direct, 2013
-doi:10.1186/1745-6150-8-9.
-[ `paper link <http://www.biology-direct.com/content/8/1/9/abstract>`__ ]