@@ -69,6 +69,77 @@ public, both in Qiita and the permanent repository, Figure 2.
study listing page.

+ Qiita allows for complex study designs
+ --------------------------------------
+
+ As seen in Figure 1, studies are the main source of data for Qiita. A study
+ can contain a single set of samples or multiple sets, each of
+ which can have different preparations.
+
+ The traditional study design includes a single sample information file and a
+ single preparation information file. However, as technology improves, study
+ designs become more complex: a study with a defined set of collected samples
+ can have subsets prepared in different ways so that different questions can be
+ answered. For example, imagine a study looking at how `microbial communities
+ change during mammalian corpse decomposition
+ <https://www.ncbi.nlm.nih.gov/pubmed/26657285>`__; the full study design
+ is to collect a set of samples and then process them with 16S, 18S and
+ ITS primers. This results in 1 sample information file and 3 preparation
+ information files;
+ `see it in Qiita <https://qiita.ucsd.edu/study/description/10141>`__.
+
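The layout described above can be sketched in Python as one sample table plus three prep tables that all refer to the same sample names. This is only an illustration under stated assumptions: the sample names and the helper `missing_samples` are hypothetical and are not part of Qiita.

```python
# One sample information table: one entry per collected sample.
# The names below are hypothetical placeholders.
sample_names = {"corpse.1", "corpse.2", "corpse.3"}

# Three prep tables, one per primer set, each listing the samples it covers.
preps = {
    "16S": {"corpse.1", "corpse.2", "corpse.3"},
    "18S": {"corpse.1", "corpse.2", "corpse.3"},
    "ITS": {"corpse.1", "corpse.2", "corpse.3"},
}


def missing_samples(prep_samples, known_samples):
    """Return prep sample names that are absent from the sample table."""
    return sorted(set(prep_samples) - set(known_samples))


# Every prep may only reference samples that exist in the sample table.
for target, prep_samples in preps.items():
    assert missing_samples(prep_samples, sample_names) == [], target
```

The point of the sketch is the direction of the relationship: preps point at samples, so one sample table can back any number of preparations.
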
+ Now, let's imagine another, more complex example with 100 samples:
+
+ 1. All of the samples were prepped for 16S and sequenced in two separate
+    MiSeq runs
+
+ 2. 50 of the samples were prepped for 18S and ITS, and sequenced in a single
+    MiSeq run
+
+ 3. 50 of the samples were prepped for WGS and sequenced on a single
+    HiSeq run
+
+ 4. 30 of the samples have metabolomic profiles
+
+ To represent this project in Qiita, you will need to create a single
+ study with a single sample information file that contains all 100 of the
+ samples. Separately, you will need to create five prep information files that
+ describe the preparations for the corresponding samples (the 18S and ITS
+ preparations each need their own file). All raw data
+ uploaded must correspond to a specific preparation (prep) information
+ file. For instance, the data sets described above would require the following
+ data and prep information:
+
+ 1. All of the samples prepped for 16S and sequenced in two separate
+    MiSeq runs
+
+    a) 1 prep information file describing the two MiSeq runs, in which all
+       100 samples are represented (use a run\_prefix column to differentiate
+       between the two MiSeq runs; more on metadata below)
+    b) the 4-6 fastq raw data files without demultiplexing (i.e., the
+       forward, reverse (optional), and barcodes files for each run)
+
+ 2. 50 of the samples prepped for 18S and ITS, and sequenced in a single
+    MiSeq run
+
+    a) 2 prep information files, one describing the 18S and the other
+       describing the ITS preparations
+    b) the 2-3 fastq raw data files (forward, reverse (optional), and
+       barcodes)
+
+ 3. 50 of the samples prepped for WGS and sequenced on a single HiSeq run
+
+    a) 1 prep information file describing how the samples were multiplexed
+    b) the 2-3 fastq raw data files (forward, reverse (optional), and
+       barcodes)
+    c) NOTE: We currently do not have a processing pipeline for WGS but
+       should soon.
+
+ 4. 30 of the samples with metabolomic profiles
+
+    a) 1 prep information file and the raw data file(s) from the metabolomic
+       characterization
+    b) NOTE: We currently do not have a processing pipeline for metabolomics
+       but should soon.
+
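The file counts in the list above follow a simple rule: each sequencing run contributes forward reads and barcodes, plus optional reverse reads. A minimal Python sketch of that bookkeeping is shown here. The `DataSet` class and its field names are hypothetical, introduced only for illustration, and do not correspond to any Qiita API.

```python
from dataclasses import dataclass

TOTAL_SAMPLES = 100  # size of the single sample information file


@dataclass
class DataSet:
    label: str
    n_samples: int   # samples covered by this data set
    prep_files: int  # prep information files to create
    runs: int        # sequencing runs producing fastq files (0 if none)

    def fastq_file_range(self):
        """Return the fewest and most fastq files expected.

        Each run has forward reads and barcodes (2 files), plus
        optional reverse reads (3 files).
        """
        return (2 * self.runs, 3 * self.runs)


data_sets = [
    DataSet("16S, two MiSeq runs", 100, 1, 2),
    DataSet("18S and ITS, one MiSeq run", 50, 2, 1),
    DataSet("WGS, one HiSeq run", 50, 1, 1),
    # Metabolomic raw data are not fastq files, so no runs are counted.
    DataSet("metabolomics", 30, 1, 0),
]

for ds in data_sets:
    # Every data set must be a subset of the samples in the study.
    assert ds.n_samples <= TOTAL_SAMPLES
    low, high = ds.fastq_file_range()
    print(f"{ds.label}: {ds.prep_files} prep file(s), {low}-{high} fastq files")
```

Modelling runs separately from prep files reflects the 18S and ITS case, where two prep information files share the raw data of a single run.
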
Portals
-------