
Normalization

(disclaimer: I'm not a fan of the term "normalization". "Standardization" might be a better one.)

This should serve as a first introductory text on normalization, and a review of the review papers that I've read. Some accompanying R vignettes that I've written are here.

Why Normalize?

Generically speaking, the aim of normalization (not just within mass spec data) is to convert measurements from different scales onto a common scale of some kind. Examples:

  • Converting measurements in Fahrenheit and Celsius into Kelvin (or to/from each other)
  • Internal calibration curves are constructed by dividing the measured concentrations and intensities of samples by those of standards.
  • If technical replicates from the same samples were processed on separate machines, and one machine is known to give more output than another by some unknown constant, then the difference between machines can be "normalized away" by centering their outputs around a mean of zero.

In general, we wish to preserve biological variability (that is, the "natural" variability inherent to almost any phenomenon being studied; e.g. gene expression levels, ion intensity levels, microarray signal intensity), and to remove all other kinds of unwanted variability (e.g. errors in pipetting, sample collection, lab hardware, etc.).

Dealing with known sources of technical variability is relatively easy, in the sense that one can just "do what feels right". Some methods are reviewed in [6] (see Table 2 for a review across different sample types, e.g. urine, cell extracts). Some examples of more difficult situations where intuition fails:

  1. It's difficult to differentiate between biological and technical variability after the fact.
  2. It's difficult to deal with situations where an unknown amount of technical variation comes from a known source and so cannot simply be normalized away (e.g. differing dilution amounts or dry weights between biofluids such as urine and blood).
  3. It's especially difficult to normalize away unknown sources of technical variability.

One useful paradigm/philosophy/principle of normalization is to apply a blunt-instrument approach which preserves only known sources of variation of interest, removing all other sources of variation. For instance, to remove variation between samples, we may subtract the sample mean from each sample's values, such that all sample means are centred to zero. If, for example, the samples come from different cell types with different cell sizes, which we would expect to yield different average levels of a given metabolite, but cell size is not the biological factor of interest, then we might choose to treat this between-cell-type variability as unwanted and remove it.
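For concreteness, here is a minimal R sketch of the sample-wise mean centring just described (the feature-by-sample layout and the toy data are assumptions, not any particular pipeline):

```r
# Minimal sketch: sample-wise mean centring of a feature-by-sample matrix.
# `x` is toy data: metabolites in rows, samples in columns (assumed layout).
set.seed(1)
x <- matrix(rlnorm(20 * 6, meanlog = 5), nrow = 20,
            dimnames = list(paste0("met", 1:20), paste0("s", 1:6)))

# Subtract each sample's (column) mean so that every sample is centred at zero.
x_centred <- sweep(x, 2, colMeans(x), FUN = "-")
round(colMeans(x_centred), 10)  # all numerically zero
```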

This approach works especially well for experiments with very few (less than half a dozen; ideally only one or two) variables of interest, because it becomes very easy to distinguish between variability of the phenomena under study, and (unwanted) error arising from execution of the experiment. Anything else that detracts from this paradigm becomes trickier to deal with.

Current Normalization Procedure

MA currently does a log2 transform, followed by sample-wise median normalization. This is similar to the method introduced in [1]. To reiterate (from [1]), the advantages of this procedure (sketched in code after the list below) are:

  • A log transform turns a multiplicative error term into an additive one (see [1], model 1). This works for microarray data, where variance increases with signal intensity. A log transform also compresses the range for metabolites that span a large range of values. However, this sometimes has the unintended side effect of increasing (in relative terms) the variance of metabolites that originally had a low variance. The vignette in the /vsn folder illustrates this for a particular LC dataset.
  • Sample-wise median normalization removes variation between samples (regardless of whether that variation is unwanted or not) by subtracting from each value the median of the respective sample, so that all sample medians are shifted to zero. This is used for microarray data to fit a linear model (see [1], model 1 or 2).
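A minimal sketch of the two steps above, assuming intensities are laid out as a metabolite-by-sample matrix (the toy data are made up):

```r
# Log2 transform followed by sample-wise median normalization, on toy data.
set.seed(2)
raw <- matrix(rlnorm(200 * 6, meanlog = 8), nrow = 200,
              dimnames = list(paste0("met", 1:200), paste0("s", 1:6)))

# Step 1: log2 transform (turns multiplicative error into additive error).
logged <- log2(raw)

# Step 2: sample-wise median normalization - subtract each sample's median
# so that all sample medians sit at zero.
normalized <- sweep(logged, 2, apply(logged, 2, median), FUN = "-")
apply(normalized, 2, median)  # all zero
```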

From the current literature, this procedure is as good a choice as any other. There is no consensus on what constitutes a universally superior method of normalization; vsn is best-in-class according to some metrics [2], but by a margin that frankly isn't very large, and different normalization methods can produce wildly varying results in downstream differential abundance analysis [1, 5].

The "data science" way around this is to throw all the normalization methods at the data, and take a consensus of some kind. This can be manually achieved with a web tool, normalyzerDE that does normalization and comparison of results; I haven't tried it out yet. A web tool named NOREVA/Metapre, based on [3], is also purportedly somewhere out there, but I never did find it (links within the publication are dead).

Other Normalization Procedures

Accuracy of different normalization methods, from [3]. Each number in each cell represents (true positive rate + true negative rate)/(true positive rate + false positive rate + true negative rate + false negative rate) = (true positive rate + true negative rate)/2, since the four rates sum to 2. The left heatmap represents the MTBLS28 dataset with positive-mode ESI, the right with negative-mode ESI. Normalization methods are on the vertical axis, and dataset subset sizes are on the horizontal axis. Overall, vsn looks like a strong contender.

vsn - It looks like vsn is a good contender out of all contemporary methods, being ranked as the best most of the time according to [2] and [3]. It's classed as a method that "adjust(s) biases among various metabolites to reduce heteroscedasticity" [3] (as opposed to a method that removes unwanted between-sample variation). Notably, for microarray data, feature means and feature variances are known to be positively correlated, which is not ideal given that a t-test requires the mean and variance to be independent; a variance-stabilizing transform that makes the mean and variance uncorrelated does help (see the meanSdPlot() function in the vsn package).
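A quick sketch of running vsn on a raw intensity matrix, assuming the Bioconductor vsn package is installed (the toy data are made up; for real data you would pass your own metabolite-by-sample matrix):

```r
# Apply vsn to raw (untransformed) intensities and inspect the mean-sd trend.
library(vsn)

set.seed(3)
raw <- matrix(rlnorm(500 * 6, meanlog = 8), nrow = 500,
              dimnames = list(paste0("met", 1:500), paste0("s", 1:6)))

fit <- justvsn(raw)   # glog-transformed, calibrated intensities (a matrix)
meanSdPlot(fit)       # if the running trend is roughly flat, the mean-variance
                      # dependence has largely been removed
```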

Quantile normalization - Forces every sample to share the same empirical intensity distribution (commonly the average of the sorted samples), so that all sample quantiles match exactly.
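A base-R sketch of quantile normalization on a feature-by-sample matrix (no package assumed; ties are broken arbitrarily, which is good enough for illustration):

```r
# Force every sample (column) to share the same empirical distribution,
# namely the row-wise mean of the column-sorted matrix.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  ref   <- rowMeans(apply(x, 2, sort))      # reference distribution
  out   <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(x)
  out
}

set.seed(4)
x  <- matrix(rlnorm(100 * 6, meanlog = 8), nrow = 100)
qn <- quantile_normalize(x)
apply(qn, 2, quantile)  # every sample now has identical quantiles
```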

Probabilistic Quotient Normalization (PQN) - Estimates a per-sample dilution factor as the median of the quotients between a sample's features and a reference spectrum (e.g. the median across samples), then divides the sample by that factor.
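A base-R sketch of PQN under the usual assumptions (feature-by-sample matrix; the median spectrum across samples as the reference):

```r
# Probabilistic quotient normalization: estimate a per-sample dilution factor
# and divide each sample by it.
pqn_normalize <- function(x) {
  ref       <- apply(x, 1, median)          # reference "spectrum" (per feature)
  quotients <- sweep(x, 1, ref, FUN = "/")  # each feature divided by its reference
  dilution  <- apply(quotients, 2, median)  # per-sample (column) dilution factor
  sweep(x, 2, dilution, FUN = "/")          # rescale each sample
}

set.seed(5)
x   <- matrix(rlnorm(100 * 6, meanlog = 8), nrow = 100)
pqn <- pqn_normalize(x)
```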

One advantage of removing the correlation between means and variances is that the t-test requires means and variances to be independent [4]. (You may notice that "uncorrelated" and "independent" are not the same thing, but by and large, statistical independence is too difficult to ascertain in practice.)

  • [1] There is no silver bullet – a guide to low-level data transforms and normalisation methods for microarray data. Kreil DP et al., Brief Bioinform. (2005)
  • [2] A systematic evaluation of normalization methods in quantitative label-free proteomics. Välikangas T et al., Brief Bioinform. (2018)
  • [3] Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis. Li B et al., Sci Rep (2016)
  • [4] The probable error of a mean. "Student", Biometrika (1908)
  • [5] Evaluation of statistical techniques to normalize mass spectrometry-based urinary metabolomics data. Cook et al., Journal of Pharmaceutical and Biomedical Analysis (Jan 2020)
  • [6] Sample normalization methods in quantitative metabolomics. Wu et al., Journal of Chromatography A (Jan 2016)