Latent component Gaussian process (LCGP)

Implementation of latent component Gaussian process (LCGP). LCGP handles the emulation of multivariate stochastic simulation outputs.

Reference

The development of this work is described fully in the following work, cited as:

@phdthesis{chan2023thesis,
  author  = "Moses Y.-H. Chan",
  title   = "High-Dimensional Gaussian Process Methods for Uncertainty Quantification",
  school  = "Northwestern University",
  year    = "2023",
}

List of Contents:

Installation
Basic Usage

Installation

The implementation of LCGP can be installed through

pip install lcgp

Note on LBFGS optimizer:

It is strongly recommended that PyTorch-LBFGS is installed to fully utilize this implementation. Installation guide on PyTorch-LBFGS can be found on its repository.

Test suite

A list of basic tests is provided for user to verify that LCGP is installed correctly. Execute the follow code within the root directory:

pip install pytest pytest-cov
pytest

Basic usage

What most of us need:

import numpy as np
from lcgp import LCGP
from lcgp import evaluation  # optional evaluation module

# Generate fifty 2-dimensional input and 4-dimensional output
x = np.random.randn(50, 2)
y = np.random.randn(4, 50)

# Define LCGP model
model = LCGP(y=y, x=x)

# Estimate error covariance and hyperparameters
model.fit()

# Prediction
p = model.predict(x0=x)  # mean and variance
rmse = evaluation.rmse(y, p[0].numpy())
dss = evaluation.dss(y, p[0].numpy(), p[1].numpy(), use_diag=True)
print('Root mean squared error: {:.3E}'.format(rmse))
print('Dawid-Sebastiani score: {:.3f}'.format(dss))

# Access parameters
print(model)

Specifying number of latent components

There are two ways to specify the number of latent components by passing one of the following arguments in initializing an LCGP instance:

q = 5: Five latent components will be used. q must be less than or equal to the output dimension.
var_threshold = 0.99: Include q latent components such that 99% of the output variance are explained, using a singular value decomposition.

Note: Only one of the options should be provided at a time.

model_q = LCGP(y=y, x=x, q=5)
model_var = LCGP(y=y, x=x, var_threshold=0.99)

Specifying diagonal error groupings

If errors of multiple output dimensions are expected to be similar, the error variances can be grouped in estimation.

For example, the 6-dimensional output is split into two groups: the first two have low errors and the remaining four have high errors.

import numpy as np

x = np.linspace(0, 1, 100)
y = np.row_stack((
    np.sin(x), np.cos(x), np.tan(x),
    np.sin(x/2), np.cos(x/2), np.tan(x/2)
))

y[:2] += np.random.normal(2, 1e-3, size=(2, 100))
y[2:] += np.random.normal(-2, 1e-1, size=(4, 100))

Then, LCGP can be defined with the argument diag_error_structure as a list of output dimensions to group. The following code groups the first 2 and the remaining 4 output dimensions.

model_diag = LCGP(y=y, x=x, diag_error_structure=[2, 4])

By default, LCGP assigns a separate error variance to each dimension, equivalent to

model_diag = LCGP(y=y, x=x, diag_error_structure=[1]*6)

Define LCGP using different submethod

Three submethods are implemented under LCGP:

Full posterior (full)
ELBO (elbo)
Profile likelihood (proflik)

Under circumstances where the simulation outputs are stochastic, the full posterior approach should perform the best. If the simulation outputs are deterministic, the profile likelihood method should suffice.

LCGP_models = []
submethods = ['full', 'elbo', 'proflik']
for submethod in submethods:
    model = LCGP(y=y, x=x, submethod=submethod)
    LCGP_models.append(model)

Standardization choices

LCGP standardizes the simulation output by each dimension to facilitate hyperparameter training. The two choices are implemented through robust_mean = True or robust_mean = False.

robust_mean = False: The empirical mean and standard deviation are used.
robust_mean = True: The empirical median and median absolute error are used.

model = LCGP(y=y, x=x, robust_mean=False)

Name		Name	Last commit message	Last commit date
Latest commit History 411 Commits
.github/workflows		.github/workflows
docs		docs
experiments		experiments
illustration-examples		illustration-examples
lcgp		lcgp
real-data-example		real-data-example
reference_code		reference_code
submethod-illustration		submethod-illustration
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.readthedocs.yaml		.readthedocs.yaml
CITATION.cff		CITATION.cff
LICENSE.txt		LICENSE.txt
MANIFEST.in		MANIFEST.in
Makefile		Makefile
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Latent component Gaussian process (LCGP)

Reference

Installation

Test suite

Basic usage

What most of us need:

Specifying number of latent components

Specifying diagonal error groupings

Define LCGP using different submethod

Standardization choices

About

Releases 3

Packages

Languages

License

mosesyhc/LCGP

Folders and files

Latest commit

History

Repository files navigation

Latent component Gaussian process (LCGP)

Reference

Installation

Test suite

Basic usage

What most of us need:

Specifying number of latent components

Specifying diagonal error groupings

Define LCGP using different submethod

Standardization choices

About

Resources

License

Stars

Watchers

Forks

Releases 3

Packages 0

Languages

Packages