Processing recommendations and pipelines #2049

Merged
merged 2 commits into from
Jan 18, 2017
2 changes: 2 additions & 0 deletions qiita_pet/support_files/doc/source/faq.rst
@@ -26,6 +26,8 @@ example go
the "Upload instructions"
`here <https://www.google.com/url?q=https%3A%2F%2Fvamps.mbl.edu%2Fmobe_workshop%2Fwiki%2Findex.php%2FMain_Page&sa=D&sntz=1&usg=AFQjCNE4PTOKIvFNlWtHmJyLLy11mfzF8A>`__.

.. _example_study_processing_workflow:

Example study processing workflow
---------------------------------

1 change: 1 addition & 0 deletions qiita_pet/support_files/doc/source/index.rst
@@ -29,3 +29,4 @@ following documents:
dev/index.rst
faq.rst
resources.rst
processing-recommendations.rst
65 changes: 65 additions & 0 deletions qiita_pet/support_files/doc/source/processing-recommendations.rst
@@ -0,0 +1,65 @@
Processing recommendations
==========================

Currently, Qiita supports the processing :sup:`(*)` of raw data from:

#. Target gene barcoded sequencing
#. Shotgun sequencing

Note that the available processing options were chosen mainly so that we can perform meta-analyses, that is, combine different studies, even ones that used different wet lab techniques or
sequencing technologies. Remember to check the :ref:`example_study_processing_workflow` before continuing.

For more information about meta-analysis, examples and things to consider:

- `"Tiny microbes, enormous impacts: what matters in gut microbiome studies?" <https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1086-x>`_
- `"Meta-analyses of studies of the human microbiota" <http://genome.cshlp.org/content/23/10/1704.short>`_.
- `"A Guide to Enterotypes across the Human Body: Meta-Analysis of Microbial Community Structures in Human Microbiome Datasets" <http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002863>`_.
- `"Dynamic changes in short- and long-term bacterial composition following fecal microbiota transplantation for recurrent Clostridium difficile infection" <http://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-015-0070-0>`_.

:sup:`(*)` Remember that you can also upload BIOM tables for plotting; that is not covered here because this document only deals with raw data.

Target gene barcoded sequencing
-------------------------------

For this you can start with either raw (not yet demultiplexed) data or per_sample_FASTQ files; see :ref:`example_study_processing_workflow`. Either way, you will need to
run "Split libraries and QC", which uses the QIIME 1.9.1 defaults. Once your demultiplexed and QCed artifact is created, you need to select which processing to perform.
There are two main approaches to processing target gene data: sequence clustering and sequence cleanup.

Sequencing cleanup (preferred)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this we use `deblur <https://github.com/biocore/deblur>`_. By default, two BIOM tables are generated: final.biom and final.only-16s.biom. The former is the full BIOM table, which can be used with any target gene and wet lab work;
the latter is restricted to those sequences that match Greengenes at 80% similarity, a very basic and naive filter. Each of those BIOM tables is accompanied by a FASTA file that contains
the representative sequences. The OTU IDs are the unique sequences themselves.

Note that deblur needs all sequences to be trimmed to the same length; thus the recommended pipeline is to trim everything to 150 bp and then deblur.
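
As a minimal sketch of the fixed-length requirement (this is not deblur's own code), the Python below trims every read to 150 bp. The function name ``trim_to_length``, the example reads, and the choice to drop reads shorter than the target length are assumptions made for illustration only.

```python
def trim_to_length(records, length=150):
    """Trim each read to `length` bases; drop reads that are shorter.

    `records` is an iterable of (read_id, sequence) pairs. Dropping short
    reads is an assumption of this sketch, not a documented Qiita setting.
    """
    trimmed = []
    for read_id, seq in records:
        if len(seq) < length:
            continue  # too short to reach the common length
        trimmed.append((read_id, seq[:length]))
    return trimmed


# Hypothetical reads, for illustration only.
reads = [
    ("sample1_0", "ACGT" * 40),   # 160 bp, trimmed to 150 bp
    ("sample1_1", "ACGT" * 30),   # 120 bp, dropped
    ("sample2_0", "TTGCA" * 30),  # 150 bp, kept unchanged
]
trimmed = trim_to_length(reads, length=150)
```

After this step every remaining read has the same length, which is the precondition described above.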

Sequence clustering
^^^^^^^^^^^^^^^^^^^

Here we use closed-reference OTU picking; for an explanation of the different picking methods see
`"Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences" <https://peerj.com/articles/545/>`_.
This generates a single BIOM table with the OTUs per sample. The OTU IDs are taken from the selected reference database.
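
Because the OTU IDs come from the reference database, tables picked against the same reference share one ID space, which is what makes combining studies straightforward. The sketch below illustrates that idea with plain dictionaries; ``merge_tables``, the ``ref_0001``-style IDs and the sample names are hypothetical, and a real analysis would work on BIOM tables rather than on nested dicts.

```python
from collections import defaultdict


def merge_tables(*tables):
    """Merge OTU tables shaped as {otu_id: {sample_id: count}}.

    Counts for the same OTU ID and sample ID are summed; OTUs seen in
    only one table are carried over unchanged.
    """
    merged = defaultdict(dict)
    for table in tables:
        for otu_id, samples in table.items():
            for sample_id, count in samples.items():
                merged[otu_id][sample_id] = merged[otu_id].get(sample_id, 0) + count
    return dict(merged)


# Hypothetical tables from two studies picked against the same reference.
study_a = {"ref_0001": {"A.s1": 10, "A.s2": 3}, "ref_0002": {"A.s1": 1}}
study_b = {"ref_0001": {"B.s1": 7}, "ref_0003": {"B.s1": 4}}

merged = merge_tables(study_a, study_b)
```

Here ``ref_0001`` ends up with counts from samples of both studies, since both tables use the same reference-derived ID for it.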

Currently, the available reference databases are Greengenes version 13_8-97, Silva 119 and Unite 7. Whether a phylogenetic tree is available depends on the reference you select.


Shotgun sequencing
------------------

Qiita currently has one shotgun metagenomics data analysis pipeline. Note that this is the initial processing pipeline and we will be adding more soon.

With that said, the current workflow is as follows:

#. Removal of adapter sequence and host contamination using `KneadData <https://bitbucket.org/biobakery/kneaddata/wiki/Home>`_.
#. Gene calling and pathway profiling using `HUMAnN2 <https://bitbucket.org/biobakery/humann2/wiki/Home>`_.

This workflow starts with per_sample_FASTQ files. We recommend only uploading sequences that have already been through QC and host/human
sequence removal. However, all sequence files are currently required to go through KneadData to ensure they are ready for
subsequent analyses. Currently, the KneadData command removes adapter sequences (choice of TruSeq3-PE-2 and NexteraPE-PE) and
sequences mapping to the human genome (additional host genomes will become available soon).

Next, the QC'd sequences will be compared against reference databases to determine the presence and abundance of protein-coding functional genes and
pathways using HUMAnN2. These are then summarized as BIOM tables, which can be used in subsequent analysis and visualization.
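
The two-step order described above can be sketched as a pair of command lines built in Python. This is only an illustration of the ordering, not how the Qiita plugin invokes the tools: the helper ``build_shotgun_commands``, the paths, and the assumed ``<name>_kneaddata.fastq`` output naming are all assumptions, and the flags shown are the basic ones from the tools' tutorials rather than Qiita's actual parameters. The commands are built, not executed.

```python
import os


def build_shotgun_commands(fastq_path, host_db, out_dir):
    """Return [kneaddata_cmd, humann2_cmd] as argument lists.

    Assumes KneadData writes its cleaned reads to
    <out_dir>/<input basename>_kneaddata.fastq (an assumption here).
    """
    kneaddata_cmd = [
        "kneaddata",
        "--input", fastq_path,
        "--reference-db", host_db,  # host (e.g. human) database to filter against
        "--output", out_dir,
    ]
    base = os.path.splitext(os.path.basename(fastq_path))[0]
    cleaned = os.path.join(out_dir, base + "_kneaddata.fastq")
    # HUMAnN2 profiles the cleaned reads produced by the first step.
    humann2_cmd = ["humann2", "--input", cleaned, "--output", out_dir]
    return [kneaddata_cmd, humann2_cmd]


# Hypothetical paths, for illustration only.
commands = build_shotgun_commands("reads/sample1.fastq", "human_db", "out")
```

Each list could be passed to ``subprocess.run`` in order, with the second command consuming the output of the first.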

For more information visit the `Shotgun Qiita Plugin GitHub page <https://github.com/qiita-spots/qp-shotgun>`_.
1 change: 1 addition & 0 deletions qiita_pet/templates/study_ajax/prep_summary.html
@@ -243,6 +243,7 @@
"<div class='col-md-12'>" +
"<h4><a class='btn btn-info' id='show-hide-btn' onclick='toggle_graphs();'>-</a><i> Files network</i></h4>" +
"<b>(Click nodes for more information, blue are jobs)</b>" +
"<br/>Check our data <a target='_blank' href='{% raw qiita_config.portal_dir %}/static/doc/html/processing-recommendations.html' onclick='return !window.open(this.href, \"Qiita processing recommendations\", \"width=800,height=500\")'>processing recommendations</a>." +
"</div>" +
"</div>" +
"<div class='row'><div class='col-md-12 graph' style='width:90%' id='graph-network-div'>" +
8 changes: 7 additions & 1 deletion qiita_pet/templates/study_ajax/processing_artifact.html
@@ -401,7 +401,13 @@ <h4>Processing {{name}} (ID: {{artifact_id}})</h4>

<div class="row">
<div class="col-md-12">
<h4><i> Processing workflow</i> <button class="btn btn-primary btn-sm" onclick="run_workflow();" id='run-btn' disabled><span class="glyphicon glyphicon-play"></span> Run</button></h4>
<h4>
<i>Processing workflow</i>
<button class="btn btn-primary btn-sm" onclick="run_workflow();" id='run-btn' disabled><span class="glyphicon glyphicon-play"></span> Run</button>
</h4>
<h5>
Wondering what to select? Check our data <a target='_blank' onclick="return !window.open(this.href, 'Qiita processing recommendations', 'width=800,height=500')" href='{% raw qiita_config.portal_dir %}/static/doc/html/processing-recommendations.html'>processing recommendations</a>.
</h5>
</div>
</div>
