@@ -69,6 +69,73 @@ public, both in Qiita and the permanent repository, Figure 2.
study listing page.
+ Qiita allows for complex study designs
+ --------------------------------------
+
+ As seen in Figure 1, studies are the main source of data for Qiita. A study can
+ contain a single set of samples or multiple sets, each of which can have
+ different preparations.
+
+ The traditional study design includes a single sample information file and a
+ single preparation information file. However, as technology improves, study
+ designs become more complex: a study with a defined set of collected samples
+ can have subsets prepared in different ways to answer different questions. For
+ example, let's imagine a study looking at how different `microbial communities
+ change during mammalian corpse decomposition
+ <https://www.ncbi.nlm.nih.gov/pubmed/26657285>`__. In that case, your full
+ study design is to collect a set of samples, which you will then process with
+ 16S, 18S and ITS primers. This will result in 1 sample information file and 3
+ preparation information files; `see it in Qiita
+ <https://qiita.ucsd.edu/study/description/10141>`__.
+
+ Now, let's imagine a more complex example, a study with 100 samples in which:
+
+ 1. All of the samples were prepped for 16S and sequenced in two separate
+    MiSeq runs
+ 2. 50 of the samples were prepped for 18S and ITS, and sequenced in a single
+    MiSeq run
+ 3. 50 of the samples were prepped for WGS and sequenced on a single
+    HiSeq run
+ 4. 30 of the samples have metabolomic profiles
+
+ To represent this project in Qiita, you will need to create a single
+ study with a single sample information file that contains all 100 of the
+ samples. Separately, you will need to create a prep information file for
+ each preparation, describing the corresponding samples. All raw data
+ uploaded will need to correspond to a specific preparation (prep) information
+ file. For instance, the data sets described above would require the following
+ data and prep information:
+
+ 1. All of the samples prepped for 16S and sequenced in two separate
+    MiSeq runs
+
+    a) 1 prep information file describing the two MiSeq runs, in which all
+       100 samples are represented (use a run\_prefix column to differentiate
+       between the two MiSeq runs, more on metadata below)
+    b) the 4-6 fastq raw data files without demultiplexing (i.e., the
+       forward, reverse (optional), and barcodes files for each run)
+
+ 2. 50 of the samples prepped for 18S and ITS, and sequenced in a single
+    MiSeq run
+
+    a) 2 prep information files, one describing the 18S and the other
+       describing the ITS preparations
+    b) the 2-3 fastq raw data files (forward, reverse (optional), and
+       barcodes)
+
+ 3. 50 of the samples prepped for WGS and sequenced on a single HiSeq run
+
+    a) 1 prep information file describing how the samples were multiplexed
+    b) the 2-3 fastq raw data files (forward, reverse (optional), and
+       barcodes)
+    c) NOTE: We currently do not have a processing pipeline for WGS but
+       should soon.
+
+ 4. 30 of the samples with metabolomic profiles
+
+    a) 1 prep information file
+    b) the raw data file(s) from the metabolomic characterization
+    c) NOTE: We currently do not have a processing pipeline for metabolomics
+       but should soon.
+
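The layout described above can be sketched in a few lines of Python. This is only an illustrative model, not Qiita's API or file format: the sample IDs, the ``preps`` dictionary, the ``check_preps`` helper, the choice of which samples belong to each subset, and the way samples are split between the two MiSeq runs are all assumptions made for the example. It shows one sample list shared by every preparation, with each prep information file covering a subset of it.

```python
# Hypothetical model of the example study; none of these names come from Qiita.

# One sample information file listing all 100 samples (IDs are made up).
sample_ids = [f"S{i:03d}" for i in range(1, 101)]

# One entry per prep information file, mapped to the samples it describes.
# Which 50 or 30 samples end up in each subset is an arbitrary assumption.
preps = {
    "16S": sample_ids,                # all samples, two MiSeq runs
    "18S": sample_ids[:50],           # same 50 samples as ITS
    "ITS": sample_ids[:50],
    "WGS": sample_ids[50:],           # 50 samples, single HiSeq run
    "metabolomics": sample_ids[:30],  # 30 samples
}


def check_preps(sample_ids, preps):
    """Return the names of preps that list samples absent from the sample information."""
    known = set(sample_ids)
    return [name for name, ids in preps.items() if not set(ids) <= known]


# In the 16S prep, a run_prefix value per sample tells the two runs apart.
# Assigning the first 50 samples to run1 is an assumption for illustration.
run_prefix_16s = {
    sid: ("run1" if i < 50 else "run2") for i, sid in enumerate(sample_ids)
}

problems = check_preps(sample_ids, preps)
```

An empty ``problems`` list means every prep refers only to samples present in the single sample information file, which is the constraint the study design above relies on.
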
Portals
-------