
Normalize GP input #42

Merged: 1 commit merged into master on Mar 2, 2020

Conversation

bielim
Contributor

@bielim bielim commented Feb 20, 2020

Normalize inputs to the GP training and prediction.

Normalization is done by centering the inputs (i.e., subtracting their mean) and multiplying them by the square root of the inverse of the input covariance.
Whether or not the GP is trained on normalized inputs can be specified with an optional input argument to GPObj ("normalized"), which defaults to true. If the GP has been trained on normalized inputs, the "predict" function automatically applies the same normalization when predicting on new inputs.

The idea is that normalization makes the GP hyperparameters less dependent on the specific problem; e.g., the length scales used for the default kernels can then be assumed to be reasonable defaults for many problems.
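For reference, the transformation described above (centering, then multiplying by the inverse matrix square root of the input covariance) is a standard whitening transform. A minimal standalone sketch in Python/NumPy (the package itself is Julia, and the names below are illustrative, not the GPEmulator.jl API):

```python
import numpy as np

def whiten(inputs):
    """Center the inputs and multiply by the inverse matrix square root
    of their covariance. Returns the whitened inputs plus the (mean,
    transform) needed to apply the same normalization to new points."""
    mean = inputs.mean(axis=0)
    cov = np.cov(inputs, rowvar=False)
    # inverse matrix square root via eigendecomposition of the
    # (symmetric, positive-definite) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    sqrt_inv_cov = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return (inputs - mean) @ sqrt_inv_cov, mean, sqrt_inv_cov

rng = np.random.default_rng(0)
# correlated, offset inputs (analogous to raw physical units)
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.5], [0.0, 1.0]]) + 5.0
Xw, mu, T = whiten(X)
print(np.allclose(Xw.mean(axis=0), 0.0))                  # True: centered
print(np.allclose(np.cov(Xw, rowvar=False), np.eye(2)))   # True: unit covariance
```

After whitening, the inputs have zero mean and identity covariance, so a kernel's length scales act on comparably scaled coordinates in every direction.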

@bielim bielim self-assigned this Feb 20, 2020
@bielim bielim added the enhancement New feature or request label Feb 20, 2020
@ali-ramadhan
Member

ali-ramadhan commented Feb 20, 2020

Normalization is done by centering the inputs (i.e., subtracting their mean) and multiplying them by the square root of the inverse of the input covariance.

Sorry if this is a stupid question but would this work no matter the units of the input or do you have to pick suitable units (potentially non-dimensional)?

Like if the inputs were near-surface temperature profiles in °C would it be bad that tropical data is always very positive while Arctic data can have negative numbers? I guess maybe you want your input data to have units of Kelvin or use potential temperature instead.

@odunbar
Collaborator

odunbar commented Feb 20, 2020

@ali-ramadhan: The transformation is applied internally, just as a means to aid the interpretation of parameters in the Gaussian Process. You don't have to do anything to your inputs (or units) before passing them to the GPObj or the predict function.

So if you give it a training pair of (3 °C, 100 mm), then when you want to predict at 3 °C it will still give you the answer 100 mm. Internally, however, it transforms the 3 °C -> X and then trains on (X, 100 mm).
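The point above can be demonstrated with a toy stand-in for the emulator (a hypothetical class, not the actual GPObj, and a nearest-neighbour lookup substitutes for the GP regression step): the caller only ever sees raw units, while the mean and transform learned at training time are reused internally at prediction time.

```python
import numpy as np

class NormalizingEmulator:
    """Toy stand-in showing that normalization is purely internal:
    the caller passes raw inputs, and the same (mean, transform)
    learned at training time is reapplied when predicting."""

    def __init__(self, X, y):
        self.mean = X.mean(axis=0)
        cov = np.atleast_2d(np.cov(X, rowvar=False))
        vals, vecs = np.linalg.eigh(cov)
        self.T = vecs @ np.diag(vals ** -0.5) @ vecs.T
        self.Xn = (X - self.mean) @ self.T   # "model" is trained on Xn
        self.y = y

    def _normalize(self, X):
        # same normalization applied to new prediction points
        return (np.atleast_2d(X) - self.mean) @ self.T

    def predict(self, Xnew):
        # nearest-neighbour lookup stands in for the GP regression
        Xn = self._normalize(Xnew)
        dists = np.linalg.norm(self.Xn[None, :, :] - Xn[:, None, :], axis=2)
        return self.y[np.argmin(dists, axis=1)]

X = np.array([[1.0, 10.0], [3.0, 25.0], [5.0, 18.0]])
y = np.array([50.0, 100.0, 150.0])
em = NormalizingEmulator(X, y)
print(em.predict(np.array([[3.0, 25.0]])))  # [100.] -- raw units in, raw output back
```

Predicting at a raw training input returns the associated training output unchanged, even though the model internally only ever saw whitened coordinates.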

@ali-ramadhan
Member

Thank you for the clarification @odunbar!

Collaborator

@odunbar odunbar left a comment


Hi Melanie, this all seems reasonable! It may just be me, but I've not seen the first.(X) and last.(X) functions before (e.g., line 200 in GPEmulator.jl). But if it's cleaner to keep the mu and sigma variables together, then I'm fine with this.

This hopefully conditions the inputs into a nicer space. In a later patch it is probably a good idea for us to also include conditioning on the outputs (e.g., an SVD) to decorrelate the GPs; we can discuss this later! Thanks for the work!

@charleskawczynski
Member

bors r+

@codecov

codecov bot commented Mar 2, 2020

Codecov Report

Merging #42 into master will increase coverage by 0.82%.
The diff coverage is 48.14%.


@@            Coverage Diff             @@
##           master      #42      +/-   ##
==========================================
+ Coverage   59.58%   60.41%   +0.82%     
==========================================
  Files          12       12              
  Lines         480      485       +5     
==========================================
+ Hits          286      293       +7     
+ Misses        194      192       -2
Impacted Files Coverage Δ
src/MCMC.jl 95.74% <100%> (+16.74%) ⬆️
src/GPEmulator.jl 41.66% <46.15%> (+0.85%) ⬆️
src/Utilities.jl 33.33% <0%> (-33.34%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f85852c...0d22d17.

Member

@charleskawczynski charleskawczynski left a comment


LGTM

@bors
Contributor

bors bot commented Mar 2, 2020

@bors bors bot merged commit 35d3feb into master Mar 2, 2020
@bors bors bot deleted the mb/normalize_gp_input branch March 2, 2020 21:39
Labels
enhancement New feature or request
4 participants