|
| 1 | +Processing recommendations |
| 2 | +========================== |
| 3 | + |
| 4 | +Currently, Qiita supports the processing :sup:`(*)` of raw data from: |
| 5 | + |
| 6 | +#. Target gene barcoded sequencing |
| 7 | +#. Shotgun sequencing |
| 8 | + |
| 9 | +Note that the available processing options were chosen mainly to enable meta-analyses, that is, combining different studies, even ones that used different wet lab techniques or |
| 10 | +sequencing technologies. Remember to check the :ref:`example_study_processing_workflow` before continuing. |
| 11 | + |
| 12 | +For more information about meta-analysis, examples and things to consider: |
| 13 | + |
| 14 | +- `"Tiny microbes, enormous impacts: what matters in gut microbiome studies?" <https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1086-x>`_ |
| 15 | +- `"Meta-analyses of studies of the human microbiota" <http://genome.cshlp.org/content/23/10/1704.short>`_. |
| 16 | +- `"A Guide to Enterotypes across the Human Body: Meta-Analysis of Microbial Community Structures in Human Microbiome Datasets" <http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002863>`_. |
| 17 | +- `"Dynamic changes in short- and long-term bacterial composition following fecal microbiota transplantation for recurrent Clostridium difficile infection" <http://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-015-0070-0>`_. |
| 18 | + |
| 19 | +:sup:`(*)` Remember that you can also upload BIOM tables for plotting; they are not covered here because this section only deals with raw data. |
| 20 | + |
| 21 | +Target gene barcoded sequencing |
| 22 | +------------------------------- |
| 23 | + |
| 24 | +For this you can start with raw, non-demultiplexed data or with per_sample_FASTQ; see :ref:`example_study_processing_workflow`. Either way, you will need to run |
| 25 | +"Split libraries and QC", which uses the QIIME 1.9.1 defaults. Once your demultiplexed and QCed artifact is created, you need to select which processing to perform. |
| 26 | +There are two main methodologies for processing target gene data: sequence clustering and sequence cleanup. |
| 27 | + |
| 28 | +Sequencing cleanup (preferred) |
| 29 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 30 | + |
| 31 | +For this we use `deblur <https://github.com/biocore/deblur>`_. By default, two BIOM tables are generated: final.biom and final.only-16s.biom. The former is the full BIOM table, which can be used with any target gene and wet lab protocol; |
| 32 | +the latter is restricted to those sequences that match Greengenes at 80% similarity, a very basic and naive filter. Each of those BIOM tables is accompanied by a FASTA file that contains |
| 33 | +the representative sequences. Each OTU ID is the unique sequence itself. |
| 34 | + |
| 35 | +Note that deblur needs all sequences to be trimmed to the same length, so the recommended pipeline is to trim everything to 150 bp and then run deblur. |
| 36 | + |
| 37 | +Sequencing clustering |
| 38 | +^^^^^^^^^^^^^^^^^^^^^ |
| 39 | + |
| 40 | +Here we use closed-reference picking; for an explanation of the different picking methods see |
| 41 | +`"Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences" <https://peerj.com/articles/545/>`_. |
| 42 | +This generates a single BIOM table with the OTU counts per sample. The OTU IDs are taken from the selected reference database. |
| 43 | + |
| 44 | +Currently, we have the reference databases: Greengenes version 13_8-97, Silva 119 and Unite 7. Whether a phylogenetic tree is available depends on the reference you select. |
| 45 | + |
| 46 | + |
| 47 | +Shotgun sequencing |
| 48 | +------------------ |
| 49 | + |
| 50 | +Here you need to start with per_sample_FASTQ. We recommend uploading only FASTQ files that have already been QC-ed and had adaptor and human sequences removed. However, a step for |
| 51 | +this preprocessing is available in Qiita via `KneadData <https://bitbucket.org/biobakery/kneaddata/wiki/Home>`_. |
| 52 | + |
| 53 | +The recommended processing steps are: |
| 54 | + |
| 55 | +#. Remove adapters and human sequences from your files using KneadData. We currently have TruSeq3-PE-2 and NexteraPE-PE adaptor removal. |
| 56 | +#. Use `HUMAnN2 <https://bitbucket.org/biobakery/humann2/wiki/Home>`_ to generate BIOM tables. |
| 57 | + |
| 58 | +For more information visit the `Shotgun Qiita Plugin GitHub page <https://github.com/qiita-spots/qp-shotgun>`_. |
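
The deblur section above asks for all sequences to be trimmed to the same length (150 bp is the recommendation). As a rough illustration of what that trimming means, here is a minimal sketch in plain Python. It is not the Qiita or deblur implementation: the `trim_reads` helper and the `(header, sequence, quality)` tuple layout are hypothetical, and dropping reads shorter than the target length is an assumption of this sketch.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory FASTQ record: (header, sequence, quality string).
FastqRecord = Tuple[str, str, str]


def trim_reads(records: Iterable[FastqRecord],
               length: int = 150) -> Iterator[FastqRecord]:
    """Yield reads cut to exactly `length` bases.

    Assumption of this sketch: reads shorter than `length` are dropped,
    so that every read that remains has the same length.
    """
    for header, seq, qual in records:
        if len(seq) < length:
            continue  # too short to reach the common length
        # Trim sequence and quality together so they stay aligned.
        yield header, seq[:length], qual[:length]


if __name__ == "__main__":
    reads = [
        ("@read1", "A" * 200, "I" * 200),  # longer than 150: trimmed
        ("@read2", "C" * 100, "I" * 100),  # shorter than 150: dropped
    ]
    for header, seq, qual in trim_reads(reads):
        print(header, len(seq), len(qual))
```

In Qiita itself the trimming is done by selecting the corresponding processing step on the demultiplexed artifact; the sketch only shows why every sequence has the same length afterwards.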