This repository contains a collection of extensions to the popular
GPML toolbox for Gaussian process inference in MATLAB, available here:
http://www.gaussianprocess.org/gpml/code/matlab/doc/
We provide code for:
- Incorporating arbitrary hyperparameter priors into any inference method, allowing for MAP rather than MLE inference during hyperparameter learning.
- An extended API for mean and covariance functions for computing Hessians with respect to their hyperparameters.
- An extended API for inference methods to compute the (potentially approximate) Hessian of the negative log likelihood/posterior with respect to hyperparameters.
- Implementations of several new mean and covariance functions.
- A number of additional utilities, for example, for computing rank-one updates to quickly update a posterior when performing online GP regression.
We establish a new, simple API for specifying arbitrary hyperparameter priors. The API is:
[nlZ, dnlZ, HnlZ] = prior(hyperparameters)
where the input is
- hyperparameters: a GPML hyperparameter struct,
and the outputs are
- nlZ: the negative of the log prior evaluated at the given hyperparameters,
- dnlZ: a struct containing the gradient of the negative log prior evaluated at the given hyperparameters,
- HnlZ: (optional) a struct containing the Hessian of the negative log prior evaluated at the given hyperparameters.
The dnlZ struct is specified in the same way as is typical for GPML
(for example, as the second output of gp.m in training mode), and,
if needed, the Hessian is specified as described in the section on
Hessians below.
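To make the calling convention concrete, here is a minimal sketch of a function that follows the two-output form of this API. The name standard_normal_prior is hypothetical and not part of this repository; it places an independent standard normal prior on every entry of every field of the hyperparameter struct. The optional Hessian output is omitted because its struct layout is documented in hessians.m rather than here.

```matlab
function [nlZ, dnlZ] = standard_normal_prior(hyperparameters)
% Hypothetical illustration (not part of this repository): an independent
% N(0, 1) prior on every hyperparameter in a GPML-style struct.

  nlZ  = 0;
  dnlZ = hyperparameters;            % gradient struct mirrors the input layout

  fields = fieldnames(hyperparameters);
  for k = 1:numel(fields)
    theta = hyperparameters.(fields{k})(:);

    % negative log density of N(0, I) evaluated at theta
    nlZ = nlZ + 0.5 * (theta' * theta) + 0.5 * numel(theta) * log(2 * pi);

    % the gradient of 0.5 * theta^2 with respect to theta is theta itself
    dnlZ.(fields{k}) = reshape(theta, size(hyperparameters.(fields{k})));
  end
end
```

Because dnlZ has the same fields and shapes as the input struct, it can be combined field by field with the gradient returned by a GPML inference method.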
We provide an implementation of a flexible family of such priors in
independent_prior.m. This implements a meta-prior of the form
$p(\theta) = \prod_i p(\theta_i)$,
where we have placed independent priors on each hyperparameter $\theta_i$. Several elementwise priors are provided, including:
- normal priors (see gaussian_prior)
- uniform priors (see uniform_prior)
- Laplace priors (see laplace_prior)
- improper constant priors (see constant_prior)
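Independence has a convenient consequence for the three outputs of the prior API. Writing $\theta$ for the vector of all hyperparameters and $p_i$ for the elementwise prior on $\theta_i$ (notation assumed here for illustration), the negative log prior is a sum, its gradient is elementwise, and its Hessian is diagonal:

```latex
\begin{align}
  -\log p(\theta)
    &= -\sum_i \log p_i(\theta_i), \\
  \frac{\partial}{\partial \theta_i} \bigl[ -\log p(\theta) \bigr]
    &= -\frac{\mathrm{d}}{\mathrm{d} \theta_i} \log p_i(\theta_i), \\
  \frac{\partial^2}{\partial \theta_i \, \partial \theta_j} \bigl[ -\log p(\theta) \bigr]
    &= 0 \quad \text{for } i \neq j.
\end{align}
```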
Once a hyperparameter prior is specified, a special meta-inference
method, inference_with_prior, allows the user to incorporate the
prior into any GPML inference method. Apart from extra
inputs specifying the inference method and prior, the API of
inference_with_prior is identical to the standard GPML inference
method API, except that the negative log likelihood nlZ, its
gradient dnlZ, and, optionally, its Hessian HnlZ are replaced
with the equivalent expressions for the negative (unnormalized) log
posterior.
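The "unnormalized" qualifier follows directly from Bayes' rule. With $\theta$ denoting the hyperparameters and $(X, y)$ the training data (symbols assumed here for illustration), the negative log posterior differs from the sum of the negative log likelihood and the negative log prior only by a term that does not depend on $\theta$:

```latex
\begin{equation}
  -\log p(\theta \mid X, y)
    = \underbrace{-\log p(y \mid X, \theta)}_{\text{negative log likelihood}}
      \; \underbrace{{} - \log p(\theta)}_{\text{negative log prior}}
      \; + \log p(y \mid X).
\end{equation}
```

Since $\log p(y \mid X)$ is constant in $\theta$, it can be dropped: the gradient and Hessian of the remaining two terms simply add, and minimizing their sum yields the MAP hyperparameters.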
Here is a demonstration of incorporating a hyperparameter prior into a GPML model:
inference_method = @infExact;
mean_function = {@meanConst};
covariance_function = {@covSEiso};
% initial hyperparameters
offset = 1;
length_scale = 1;
output_scale = 1;
noise_std = 0.05;
hyperparameters.mean = offset;
hyperparameters.cov = log([length_scale; output_scale]);
hyperparameters.lik = log(noise_std);
% add normal priors to each hyperparameter
priors.mean = {get_prior(@gaussian_prior, 0, 1)};
priors.cov = {get_prior(@gaussian_prior, 0, 1), ...
              get_prior(@gaussian_prior, 0, 1)};
priors.lik = {get_prior(@gaussian_prior, log(0.01), 1)};
% add prior to inference method
prior = get_prior(@independent_prior, priors);
inference_method = add_prior_to_inference_method(inference_method, prior);
% find MAP hyperparameters
% x and y are the training inputs and targets, assumed to be defined already
map_hyperparameters = minimize(hyperparameters, @gp, 50, inference_method, ...
                               mean_function, covariance_function, [], x, y);
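The demonstration assumes that training inputs x and targets y already exist in the workspace. As a minimal sketch, the following generates a small synthetic one-dimensional dataset that could be created beforehand. The sine function, the sample size, and the input range are arbitrary choices for illustration; the noise level is chosen to match the initial noise_std above.

```matlab
% Synthetic training data (illustrative assumption, not from the repository).
n = 50;                               % number of training points
x = linspace(-3, 3, n)';              % n-by-1 column of inputs
y = sin(x) + 0.05 * randn(n, 1);      % noisy observations, noise std 0.05
```

GPML expects one observation per row, so both x and y are column vectors here.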
We establish a simple extension to the GPML API for mean and covariance functions, allowing us to compute their Hessians with respect to their hyperparameters.
To compute the second (mixed) partial derivative of the mean function with respect to the pair of hyperparameters with indices (i, j),
the interface is:
mu = mean_function(hyperparameters, x, i, j)
which differs from the interface for computing gradients only by the
additional input argument j. Several mean function implementations
compliant with this extended interface are provided:
- zero_mean: a drop-in replacement for meanZero
- constant_mean: a drop-in replacement for meanConst
- linear_mean: a drop-in replacement for meanLinear
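To illustrate how the extra argument changes the dispatch, here is a minimal sketch of a constant mean written against this extended interface. The function name sketch_constant_mean is hypothetical, and the sketch assumes the GPML 3.x convention in which a call with fewer than two arguments reports the number of hyperparameters as a string; it is not the repository's constant_mean implementation.

```matlab
function result = sketch_constant_mean(hyperparameters, x, i, j)
% Hypothetical illustration of the extended mean function interface.
% hyperparameters is the mean hyperparameter vector; here a single constant c.

  if (nargin < 2)
    result = '1';                     % number of hyperparameters (GPML 3.x style)
    return;
  end

  n = size(x, 1);

  if (nargin == 2)
    % mean evaluated at each row of x
    result = hyperparameters(1) * ones(n, 1);
  elseif (nargin == 3)
    % first derivative with respect to hyperparameter i: d c / d c = 1
    result = ones(n, 1) * (i == 1);
  else
    % second (mixed) derivative with respect to (i, j): always zero,
    % because the mean is linear in its only hyperparameter
    result = zeros(n, 1);
  end
end
```

A mean that is linear in its hyperparameters always has a zero Hessian, which is why only the argument count needs to be checked in the final branch.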
Similarly, to compute the second (mixed) partial derivative of the covariance function with respect to the pair of hyperparameters with indices (i, j),
the interface is:
K = covariance_function(hyperparameters, x, z, i, j)
Several covariance function implementations compliant with this extended interface are provided:
- isotropic_sqdexp_covariance: a drop-in replacement for covSEiso
- ard_sqdexp_covariance: a drop-in replacement for covSEard
- factor_sqdexp_covariance: an implementation of a squared exponential "factor" covariance, where an isotropic squared exponential is applied to data after a linear map to a lower-dimensional space.
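As a worked example of what such second derivatives look like, consider the isotropic squared exponential, assuming the covSEiso parameterization from GPML with hyperparameters $\theta_1 = \log \ell$ and $\theta_2 = \log \sigma_f$. The symbols $r$ (distance between the two inputs) and $u$ are shorthand introduced here for illustration.

```latex
\begin{align}
  K &= \sigma_f^2 \exp\!\left( -\frac{r^2}{2 \ell^2} \right),
  \qquad u = \frac{r^2}{\ell^2}, \\
  \frac{\partial K}{\partial \theta_1} &= u K,
  \qquad
  \frac{\partial K}{\partial \theta_2} = 2 K, \\
  \frac{\partial^2 K}{\partial \theta_1^2} &= u (u - 2) K,
  \qquad
  \frac{\partial^2 K}{\partial \theta_1 \, \partial \theta_2} = 2 u K,
  \qquad
  \frac{\partial^2 K}{\partial \theta_2^2} = 4 K.
\end{align}
```

The second line uses $\partial u / \partial \theta_1 = -2u$, which gives $\partial (uK) / \partial \theta_1 = u^2 K - 2 u K$.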
These Hessians can ultimately be used to compute the Hessian of the log likelihood with respect to the hyperparameters. In particular, we provide:
- exact_inference: a drop-in replacement for infExact
- laplace_inference: a drop-in replacement for infLaplace
Both support the extended inference method API:
[posterior, nlZ, dnlZ, HnlZ] = ...
    inference_method(hyperparameters, mean_function, ...
                     covariance_function, likelihood, x, y);
The last output, HnlZ, is a struct describing the Hessian of the
negative log likelihood with respect to the hyperparameters, including
"off-block-diagonal" terms such as mean/covariance,
mean/likelihood, and covariance/likelihood hyperparameter pairs. See
hessians.m for a description of this struct.
A number of additional files are included, providing additional functionality. These include:
- New mean functions:
  - step_mean: a simple "step" changepoint mean.
  - discrete_mean/fixed_discrete_mean: free-form mean vectors for discrete data; discrete_mean treats the entries of this vector as hyperparameters, enabling learning.
- New covariance functions:
  - discrete_covariance/fixed_discrete_covariance: free-form covariance matrices for discrete data; discrete_covariance treats the entries of this matrix as hyperparameters, enabling learning. A log-Cholesky parameterization is used, allowing for unconstrained optimization.
  - scaled_covariance: a meta-covariance for modeling functions of the form $g(x) = a(x) f(x)$, where $f \sim \mathcal{GP}(\mu, K)$ and $a$ is a fixed function. $a$ is specified as a GPML mean function, and $K$ is specified as a GPML covariance function. scaled_covariance computes the gradient of the covariance with respect to the parameters of both $a$ and $K$.
- Rank-one updates of GPML posterior structs: update_posterior allows the user to update an existing GPML posterior struct (for regression with Gaussian observation noise) given a single new observation. This can significantly decrease the total time needed to perform sequential online GP regression.
- Computing likelihoods assuming datasets are from independent draws from a joint GP prior: see gp_likelihood_independent for more information.
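The saving from rank-one posterior updates comes from extending an existing Cholesky factor instead of refactorizing from scratch. The following is a minimal, generic sketch of that idea and not the implementation of update_posterior; the function name sketch_extend_cholesky and its arguments are hypothetical. Given an upper-triangular R with R' * R = A, it returns the factor of A bordered by one new row and column.

```matlab
function R_new = sketch_extend_cholesky(R, b, c)
% Hypothetical illustration: extend an upper Cholesky factor by one
% row/column. On input, R' * R = A (n-by-n). On output,
% R_new' * R_new = [A, b; b', c], assuming that matrix is positive definite.
%
%   b: n-by-1 vector of cross terms between old points and the new point
%   c: scalar diagonal entry for the new point

  v = R' \ b;                         % triangular solve, O(n^2)
  d = sqrt(c - v' * v);               % new diagonal entry of the factor

  R_new = [R,                      v; ...
           zeros(1, size(R, 2)),   d];
end
```

The triangular solve costs O(n^2) operations, whereas recomputing the factorization costs O(n^3), which is where the speedup for sequential online regression comes from.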