Modular emulator interface #120
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           master     #120      +/-   ##
==========================================
+ Coverage   90.41%   91.81%   +1.40%
==========================================
  Files           4        4
  Lines         386      391       +5
==========================================
+ Hits          349      359      +10
+ Misses         37       32       -5
```

Continue to review full report at Codecov.
Looks good to me.
bors r+
120: Modular emulator interface r=odunbar a=odunbar
Co-authored-by: odunbar <odunbar@caltech.edu>
Build failed:
bors r+
120: Modular emulator interface r=odunbar a=odunbar
Build failed:
Hi all -- the error message for this PR matches that discussed in issue #125, with a root cause of using a Julia 1.6 Manifest with CI scripts that use Julia 1.5. This PR updated 1.5 -> 1.6 in the Manifest and the CI configuration (Docs.yml and Test.yml).
bors try
try
Build failed:
try
Build failed:
try
Build failed:
Apologies for my confusion over which branch to add features to. As discussed with @odunbar:
* Fix path to examples/ci
* Suppress warnings from reused plot variables in plot_GP.jl
* Restore learn_noise.jl from /master
* Fix comparisons to nothing
* Update learn_noise example to use Emulator()
* Always optimize! GPs with noise_learn=false. Done because noise is explicitly added to the GP kernel when it's created; this is needed to reproduce existing behavior in /master.
* Fix reverse_standardize()
* Update examples to use Emulators
* Fix buildkite path to GaussianProcess example
* Fix adding top-level repo LOAD_PATH in GP examples
* Regenerate all Manifests under julia 1.6.5
* Explicit compatibility with julia 1.6.x
* Temporarily use julia 1.6.2 in buildkite
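For context on the noise_learn commit above, here is a minimal sketch of adding a learnable white-noise term to a GP kernel, assuming GaussianProcesses.jl's `SE` and `Noise` kernels (the data and kernel parameters are illustrative, not taken from the package):

```julia
using GaussianProcesses

# Toy 1-D data
x = collect(0.0:0.1:2π)
y = sin.(x) .+ 0.1 .* randn(length(x))

# Base kernel plus a white-noise kernel whose amplitude is learnable;
# this mirrors noise being added to the kernel when the GP is created,
# so optimize! fits the noise amplitude along with the other hyperparameters.
kern = SE(0.0, 0.0) + Noise(log(0.1))
gp = GP(x, y, MeanZero(), kern)
optimize!(gp)   # fits length scale, signal variance, and noise amplitude
```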
bors try
try
Build succeeded:
bors r+
Build succeeded:
130: [WIP] Modular Sampler interface r=tsj5 a=tsj5

This PR re-implements the Sample step of CES to use the [AbstractMCMC](https://github.com/TuringLang/AbstractMCMC.jl) interface used by [Turing.jl](https://turing.ml/dev/). It may be considered a sibling of PR #120.

**Motivation**

The rationale for doing this is as follows (most relevant reasons first):

1. We shouldn't reinvent the wheel here, as CES doesn't claim to innovate in the area of MCMC sampling (taken on its own).
   a. It's reasonable to assume a user of our package who has done MCMC in Julia is familiar with Turing and its interface: Turing.jl is a major part of the Julia ecosystem, playing the role [stan](https://mc-stan.org/) does for R.
   b. Extensibility via this interface is a design goal of Turing, so one may assume its developers have thought about how best to design an appropriate API (e.g. [thread](https://github.com/TuringLang/AbstractMCMC.jl/discussions/72)). The separation of concerns used by AbstractMCMC (bullet list below) is the logical way to split up the problem.
2. In practice (the examples), it streamlines code: `MCMCWrapper` objects can be reused, and constructor arguments are taken from the `Emulator` to avoid potential bugs from inconsistencies.
3. MCMCChains implements several diagnostics for MCMC convergence, in the form of [statistics](https://turinglang.github.io/MCMCChains.jl/dev/diagnostics/) and [plots](https://turinglang.github.io/MCMCChains.jl/dev/statsplots/).
4. AbstractMCMC implements thread- and process-parallel sampling of multiple chains. MCMC folklore is that one obtains the most robust estimates by running "a few" "medium-length" chains from different initial conditions, e.g. [Gilks et al. 1996](https://books.google.com/books/about/Markov_Chain_Monte_Carlo_in_Practice.html?id=ATimDAEACAAJ).
5. Interoperability with all of Turing.jl; this may be useful in the future if the package adds samplers or other features we'd like to use off-the-shelf.

**Implementation**

AbstractMCMC is described in the Turing docs [here](https://turing.ml/dev/docs/for-developers/interface) and [here](https://turing.ml/dev/docs/for-developers/how_turing_implements_abstractmcmc), although this is brief, and reading the [AdvancedMH](https://github.com/TuringLang/AdvancedMH.jl) source was more enlightening. In addition to AbstractMCMC, the PR uses the [AdvancedMH](https://github.com/TuringLang/AdvancedMH.jl) extensible implementation of Metropolis-Hastings, and [MCMCChains](https://github.com/TuringLang/MCMCChains.jl) for storing sampling runs and sampling from the posterior. Turing.jl itself isn't brought in as a dependency.

This PR splits the functionality of the existing `MCMC` class into three classes:

- An `AdvancedMH.DensityModel` child class which computes the log-likelihood. This simply wraps the Emulator instance.
- An `AdvancedMH.Proposal` child class which generates proposal moves for Metropolis-Hastings. This allows us to plug in different sampling algorithms, such as preconditioned Crank-Nicolson (PR #124; not done here).
- `MCMCChains.Chains`, a struct returned as the sampler output. By not storing the sampling results in the `MCMCWrapper` object, it can be reused to configure multiple MCMC runs.

The new `MCMCWrapper` class simply wraps the first two objects, performing the same standardization that was performed by the Emulator (to ensure consistency, this information is taken from the Emulator instance itself).

`sample_posterior!` is replaced by new methods for `sample`, which return instances of `MCMCChains.Chains`.

Co-authored-by: Thomas Jackson <tom.jackson314@gmail.com>
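For readers unfamiliar with this pattern, a minimal sketch of standard AdvancedMH usage follows. The Gaussian target density is a stand-in for the Emulator-based log-likelihood the PR wraps, and the proposal scale and parameter names are arbitrary:

```julia
using AdvancedMH, Distributions, MCMCChains, LinearAlgebra

# Target log-density (stand-in for the emulator log-likelihood)
logdensity(θ) = logpdf(MvNormal(zeros(2), Matrix(1.0I, 2, 2)), θ)

model = DensityModel(logdensity)                                # wraps the log-density
sampler = RWMH(MvNormal(zeros(2), 0.25 * Matrix(1.0I, 2, 2)))   # random-walk MH proposal

# Draw samples and bundle them into an MCMCChains.Chains,
# which provides the convergence diagnostics and plots mentioned above.
chain = sample(model, sampler, 10_000; param_names=["θ1", "θ2"], chain_type=Chains)
```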
## Purpose

Reduces the current Emulator interface's dependence on Gaussian Processes. Now the GP can be swapped for another statistical emulator.
## In the PR

- [x] New general `Emulator` class. This handles all the data manipulation, e.g. normalization, standardization, decorrelation (a rough sketch of the decorrelation step follows this list)
- [x] General interface functions for Emulator: `optimize_hyperparameters!`, `predict`
- [x] New `MachineLearningTool` type
- [x] Moved the Gaussian Processes into a `GaussianProcess <: MachineLearningTool` class
- [x] Example (e.g. `plot_GP`) to demonstrate the new interface
- [x] Unit tests
- [x] New doc strings
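The decorrelation step can be pictured with a few lines of linear algebra. This is an illustrative sketch only, not the package's internal code; `Γ` and `y` are made-up values:

```julia
using LinearAlgebra

# Decorrelate outputs y whose noise covariance is Γ.
# If Γ = V * Diagonal(S) * V', then Diagonal(S)^(-1/2) * V' * y has
# approximately identity noise covariance; dropping small singular
# values in S is what a parameter like truncate_svd controls.
Γ = [0.5 0.1; 0.1 0.3]                      # example obs_noise_cov
F = svd(Γ)
y = [1.0, 2.0]                              # example output vector
y_decorr = Diagonal(1 ./ sqrt.(F.S)) * F.Vt * y
```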
## Additional change

There seem to be ongoing issues with unit testing on Julia 1.5.4, so I have updated the Manifest, Docs.yml, and Test.yml to Julia 1.6.X.
## Changes to user experience

Ingredients:

```julia
gppackage = GPJL()
pred_type = YType()
GPkernel = ...
iopairs = PairedDataContainer(x_data, y_data)
```
### Old interface

Set up a `GaussianProcessEmulator` object:

```julia
gp = GaussianProcess(
    iopairs,
    gppackage;
    GPkernel=GPkernel,
    obs_noise_cov=nothing,
    normalized=false,
    noise_learn=true,
    truncate_svd=1.0,
    standardize=false,
    prediction_type=pred_type,
    norm_factor=nothing)
```

Then predict with it:

```julia
μ, σ² = GaussianProcessEmulator.predict(gp, new_inputs)
```
It is short, but it is inherently tied to the Gaussian process framework. It also hides details such as the training step, which we may wish to expose. The script below is more general, separating the parameters related to data processing from those specific to the ML tool.
### New interface

Set up a `GaussianProcess <: MachineLearningTool` object:

```julia
gp = GaussianProcess(
    gppackage;
    kernel=GPkernel,
    noise_learn=true,
    prediction_type=pred_type)
```

and then create the general emulator type using `gp`:

```julia
em = Emulator(
    gp,
    iopairs,
    obs_noise_cov=nothing,
    normalize_inputs=false,
    standardize_outputs=false,
    truncate_svd=1.0)
```

Train and predict:

```julia
Emulators.optimize_hyperparameters!(em)
μ, σ² = Emulators.predict(em, new_inputs)
```
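To make the workflow concrete, here is an end-to-end sketch with toy data. The import path, the `PairedDataContainer` orientation (input dimensions × samples), and the omission of `kernel` in favor of a default are assumptions for illustration, not part of the PR:

```julia
# Assumed import path; PairedDataContainer may live in a separate
# DataContainers module depending on the package version.
using CalibrateEmulateSample.Emulators

# Toy 1-D regression data, stored as (dims × samples) matrices
N = 50
x_data = reshape(collect(range(0.0, 2π, length=N)), 1, N)
y_data = sin.(x_data) .+ 0.1 .* randn(1, N)
iopairs = PairedDataContainer(x_data, y_data)

# Omitting `kernel` and relying on a default; see the snippets above
# for the full keyword lists.
gp = GaussianProcess(GPJL(); noise_learn=true, prediction_type=YType())
em = Emulator(gp, iopairs, normalize_inputs=true)
Emulators.optimize_hyperparameters!(em)

new_inputs = reshape([1.0, 2.0, 3.0], 1, 3)
μ, σ² = Emulators.predict(em, new_inputs)
```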
### Adding a new `MachineLearningTool`

Include a new file `NewTool.jl` at the top of `Emulator.jl`. In this file define:

1. `struct NewTool <: MachineLearningTool` with constructor `NewTool(...)`, to hold ML parameters and models
2. `function build_models!(NewTool, iopairs)` to build and store ML models; called in the Emulator constructor
3. `function optimize_hyperparameters!(NewTool)` to train the stored ML models; called by the method of the same name in Emulator
4. `function predict(NewTool, new_inputs)` to predict with the stored ML models; called by the method of the same name in Emulator
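To make the recipe concrete, here is a skeletal sketch of such a file. Everything other than the four required names is a hypothetical placeholder: the struct fields, the `fit_model`/`train_model!`/`predict_model` helpers, and the `get_inputs`/`get_outputs` accessors are illustrative assumptions, not part of the PR:

```julia
# NewTool.jl -- skeletal MachineLearningTool extension (hypothetical)

# 1. Container for ML parameters and models
struct NewTool <: MachineLearningTool
    models::Vector{Any}       # one model per output dimension
    hyperparameter::Float64   # whatever the tool needs
end
NewTool(; hyperparameter = 1.0) = NewTool(Any[], hyperparameter)

# 2. Called in the Emulator constructor: build and store the models
function build_models!(nt::NewTool, iopairs)
    inputs = get_inputs(iopairs)     # assumed PairedDataContainer accessor
    outputs = get_outputs(iopairs)
    for i in 1:size(outputs, 1)
        push!(nt.models, fit_model(inputs, outputs[i, :]))  # hypothetical helper
    end
end

# 3. Called by Emulators.optimize_hyperparameters!(em)
function optimize_hyperparameters!(nt::NewTool)
    foreach(train_model!, nt.models)  # hypothetical per-model training
end

# 4. Called by Emulators.predict(em, new_inputs); returns (mean, variance)
function predict(nt::NewTool, new_inputs)
    preds = [predict_model(m, new_inputs) for m in nt.models]  # hypothetical
    μ  = reduce(vcat, (p.mean' for p in preds))
    σ² = reduce(vcat, (p.var'  for p in preds))
    return μ, σ²
end
```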