-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy path01-introduction.Rmd
210 lines (136 loc) · 9.71 KB
/
01-introduction.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
---
bibliography: references.bib
---
# Introduction
Information in living organism communicates along the Central Dogma in different scales from individual, population, community to ecosystem. Metabolomics (i.e., the profiling and quantification of metabolites) is a relatively new field of "omics" studies. Different from other omics studies, metabolomics always focused on small molecular (molecular weight below 1500 Da) with much lower mass than polypeptide with single or doubled charged ions. Here is a demo of the position of metabolomics in "omics" studies[\@b.dunn2011].
```{r metaintro, fig.show='hold', fig.cap='The complex interactions of functional levels in biological systems.',echo=FALSE,out.width='90%'}
knitr::include_graphics('images/metaintro.png')
```
Metabolomics studies always employ GC-MS[@theodoridis2012; @beale2018], GC\*GC-MS[@tian2016], LC-MS[@gika2014], LC-MS/MS[@begou2017], IM-MS[@levy2019], infrared ion spectroscopy[@martens2017] or NMR[\@b.dunn2011] to measure metabolites. For analytical methods, this review could be checked[@zhang2012e]. The overall technique progress of metabolomics (2012-2018) could be found here[@miggiels2019]. However, this workflow will only cover mass spectrometry based metabolomics or XC-MS based research.
## History
### History of Mass Spectrometry
Here is a historical commentary for mass spectrometry[@yatesiii2011]. In details, here is a summary:
- 1913, Sir Joseph John Thomson "Rays of Positive Electricity and Their Application to Chemical Analyses."
```{r history, fig.show='hold', fig.cap='Sir Joseph John Thomson "Rays of Positive Electricity and Their Application to Chemical Analyses."',echo=FALSE,out.width='90%'}
knitr::include_graphics('images/mshistory.jpg')
```
- Petroleum industry bring mass spectrometry from physics to chemistry
- The first commercial mass spectrometer is from Consolidated Engineering Corp to analysis simple gas mixtures from petroleum
- In World War II, U.S. use mass spectrometer to separate and enrich isotopes of uranium in Manhattan Project
- U.S. also use mass spectrometer for organic compounds during wartime and extend the application of mass spectrometer
- 1946, TOF, William E. Stephens
- 1970s, quadrupole mass analyzer
- 1970s, R. Graham Cooks developed mass-analyzed ion kinetic energy spectrometry, or MIKES to make MRM analysis for multi-stage mass sepctrometry
- 1980s, MALDI rescue TOF and mass spectrometry move into biological application
- 1990s, Orbitrap mass spectrometry
- 2010s, Aperture Coding mass spectrometry
### History of Metabolomcis
You could check this report[@baker2011]. According to this book section[@kusonmano2016]:
```{r history2, fig.show='hold', fig.cap='Metabolomics timeline during pre- and post-metabolomics era',echo=FALSE,out.width = '90%'}
knitr::include_graphics('images/metahistory.jpg')
```
- 2000-1500 BC some traditional Chinese doctors who began to evaluate the glucose level in urine of diabetic patients using ants
- 300 BC ancient Egypt and Greece that traditionally determine the urine taste to diagnose human diseases
- 1913 Joseph John Thomson and Francis William Aston mass spectrometry
- 1946 Felix Bloch and Edward Purcell Nuclear magnetic resonance
- late 1960s chromatographic separation technique
- 1971 Pauling's research team "Quantitative Analysis of Urine Vapor and Breath by Gas--Liquid Partition Chromatography"
- Willmitzer and his research team pioneer group in metabolomics which suggested the promotion of the metabolomics field and its potential applications from agriculture to medicine and other related areas in the biological sciences
- 2007 Human Metabolome Project consists of databases of approximately 2500 metabolites, 1200 drugs, and 3500 food components
- post-metabolomics era high-throughput analytical techniques
### Defination
Metabolomics is actually a comprehensive analysis with identification and quantification of both known and unknown compounds in an unbiased way. Metabolic fingerprinting is working on fast classification of samples based on metabolite data without quantifying or identification of the metabolites. Metabolite profiling always need a pre-defined metabolites list to be quantification[@madsen2010].
Meanwhile, targeted and untargeted metabolomics are also used in publications. For targeted metabolomics, the majority of the molecules within a biological pathway or a defined group of related metabolites are determined. Sometimes broad collection of known metabolites could also be referred as targeted analysis. Untargeted analysis detect all of possible metabolites unbiased in the samples of interest. A similar concept called non-targeted analysis/screen is actually describe the similar studies or workflow.
## Reviews and tutorials
Some nice reviews and tutorials related to this workflow could be found in those papers or directly online:
### Workflow
Those papers are recommended[@gonzalez-riano2020; @pezzatti2020; @liu2019; @barnes2016a; @cajka2016; @gika2014; @theodoridis2012; @lu2008; @fiehn2002] for general metabolomics related topics.
- For targeted metabolomics, you could check those reviews[@griffiths2010; @lu2008a; @weljie2006; @yuan2012; @zhou2016; @begou2017].
### Data analysis
You could firstly read those papers[@barnes2016; @kusonmano2016; @madsen2010; @uppal2016; @alonso2015] to get the concepts and issues for data analysis in metabolomics. Then this paper[@gromski2015] could be treated as a step-by-step tutorial. For GC-MS based metabolomics, check this paper[@rey-stolle2022].
- A guide could be used choose a inofrmatics software and tools for lipidomics[@ni2022].
- For annotation, this paper[@domingo-almenara2018] is a well organized review.
- For database used in metabolomics, you could check this review[@vinaixa2016].
- For metabolomics software, check this series of reviews for each year[@misra2016; @misra2017; @misra2018].
- For open sourced software, those reviews[@chang2021; @spicer2017; @dryden2017] could be a good start.
- For DIA or DDA metabolomics, check those papers[@fenaille2017; @bilbao2015].
Here is the slides for metabolomics data analysis workshop and I have made presentations twice in UWaterloo and UC Irvine.
- [Introduction](http://yufree.github.io/presentation/metabolomics/introduction#1)
- [Statistical Analysis](http://yufree.github.io/presentation/metabolomics/StatisticalAnalysis#1)
- [Batch Correction](http://yufree.github.io/presentation/metabolomics/BatchCorrection#1)
- [Annotation](http://yufree.github.io/presentation/metabolomics/Annotation#1)
### Application
- For environmental research related metabolomics or exposome, check those papers[@matich2019; @tang2020; @warth2017; @bundy2009].
- For toxicology, check this paper[@viant2019].
- Check this piece[@wishart2016] for drug discovery and precision medicine.
- For food chemistry, check this paper[@castro-puyana2017], this paper for livestock[@goldansaz2017] and those papers for nutrition[@allam-ndoul2016; @jones2012; @muller2020].
- For disease related metabolomics such as oncology[@spratlin2009], Cardiovascular[@cheng2017] . This paper[@kennedy2018] cover the metabolomics realted clinic research.
- For plant science, check those paper[@sumner2003; @jorge2016a; @hansen2018].
- For single cell metabolomics analysis, check here[@fessenden2016; @zenobi2013; @ali2019; @hansen2018].
- For gut microbiota, check here[@smirnov2016].
### Challenge
General challenge for metabolomics studies could be found here [@schymanski2017; @uppal2016; @schrimpe-rutledge2016; @wolfender2015].
- For reproducible research, check those papers [@du2022; @place2021; @verhoeven2020; @mangul2019; @wallach2018; @hites2018; @considine2017; @sarpe2017]. To match data from different LC system, [M2S](https://github.com/rjdossan/M2S) could be used[@climacopinto2022].
- Quantitative Metabolomics related issues could be found here[@kapoore2016b; @jorge2016a; @lv2022; @vitale2022].
- For quality control issues, check here[@dudzik2018; @siskos2017; @sumner2007; @place2021;@broeckling2023;@gonzalez-dominguez2024]. You might also try postcolumn infusion as a quality control tool[@gonzalez2022].
## Trends in Metabolomics
```{r rentrez, eval=F}
library(rentrez)
papers_by_year <- function(years, search_term){
return(sapply(years, function(y) entrez_search(db="pubmed",term=search_term, mindate=y, maxdate=y, retmax=0)$count))
}
years <- 2002:2022
total_papers <- papers_by_year(years, "")
omics <- c("genomics", "epigenomics", "metagenomic", "proteomics", "transcriptomics","metabolomics","exposomics")
trend_data <- sapply(omics, function(t) papers_by_year(years, t))
trend_props <- trend_data/total_papers
library(reshape)
library(ggplot2)
trend_df <- melt(data.frame(years, trend_data), id.vars="years")
p <- ggplot(trend_df, aes(years, value, colour=variable))
p + geom_line(size=1) + scale_y_log10("number of papers") + theme_bw()
```
## Workflow
```{r, echo = F}
library(DiagrammeR)
DiagrammeR::grViz("digraph workflow {
node [shape = box]
A [label = '@@1']
B [label = '@@2']
C [label = '@@3']
D [label = '@@4']
E [label = '@@5']
F [label = '@@6']
G [label = '@@7']
H [label = '@@8']
I [label = '@@9']
J [label = '@@10']
K [label = '@@11']
L [label = '@@12']
M [label = '@@13']
N [label = '@@14']
O [label = '@@15']
A -> B -> C -> D -> E -> F -> G -> H
H -> I
I -> J
H -> J -> K -> L
L -> M -> N
L -> O
}
[1]: 'raw data'
[2]: 'open source format'
[3]: 'DoE folder'
[4]: 'peaks list'
[5]: 'retention time correction'
[6]: 'peaks grouping'
[7]: 'peaks filling'
[8]: 'raw peaks'
[9]: 'data visulization'
[10]: 'batch effects correction'
[11]: 'corrected peaks'
[12]: 'annotation'
[13]: 'metabolomics pathway analysis'
[14]: 'omics analysis'
[15]: 'biomarkers discovery/diagnoise'
",width = 300)
```