| 1 | -# Fast Simulation of Hyperplane-Truncated Multivatiate Normal Distributions
| 1 | +# htnorm
| 2 |
| 3 | -An implementation of a fast and exact simulation algorithm for a multivariate normal distribution truncated on the intersection of a set of hyperplanes.
| 3 | +This repo provides a C implementation of a fast and exact sampler for a
| 4 | +multivariate normal distribution (MVN) truncated on a hyperplane, as described [here][1].
| 5 | + |
| 6 | +This repo implements the following from the paper:
| 7 | + |
| 8 | +- Efficient sampling from an MVN truncated on a hyperplane:
| 9 | + |
| 10 | +  |
| 11 | + |
| 12 | +- Efficient sampling from an MVN with a structured precision matrix:
| 13 | + |
| 14 | +  |
| 15 | + |
| 16 | +- Efficient sampling from an MVN with a structured precision and mean:
| 17 | + |
| 18 | +  |
| 19 | + |
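For reference, the first of the three targets listed above can be stated in the notation used later in this README (`G`, `r`, `mean`, `cov`, with sizes `k1` and `k2` as in the Python example). This is a restatement under that assumed notation, with μ standing for `mean` and Σ for `cov`; it is not a quotation from the paper:

```latex
x \sim \mathcal{N}(\mu, \Sigma)
\quad \text{restricted to} \quad
\{\, x \in \mathbb{R}^{k_1} : G x = r \,\},
\qquad G \in \mathbb{R}^{k_2 \times k_1},\; r \in \mathbb{R}^{k_2}.
```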
| 20 | +The algorithms implemented have the following practical applications:
| 21 | +- Topic models when unknown parameters can be interpreted as fractions
| 22 | +- Admixture models
| 23 | +- Discrete graphical models
| 24 | +- Sampling from the posterior distribution of an Intrinsic Conditional Autoregressive ([ICAR][8]) prior
| 25 | +- Sampling from the posterior conditional distributions of various Bayesian regression problems
| 26 | + |
| 27 | + |
| 28 | +## Dependencies |
| 29 | + |
| 30 | +- A C compiler that supports the C99 standard or later.
| 31 | +- An installation of BLAS and LAPACK that exposes its C interface via the headers `<cblas.h>` and `<lapacke.h>`
| 32 | +(e.g. OpenBLAS).
| 33 | + |
| 34 | + |
| 35 | +## Usage |
| 36 | + |
| 37 | +Building a shared library of `htnorm` can be done with the following: |
| 38 | +```bash |
| 39 | +$ cd src/ |
| 40 | +# optionally set the path to the CBLAS and LAPACKE headers using the INCLUDE_DIRS environment variable
| 41 | +$ export INCLUDE_DIRS="some/path/to/headers"
| 42 | +# optionally set the path to the shared library of your BLAS installation
| 43 | +$ export LIBS_DIR="some/path/to/library/"
| 44 | +# optionally set the linker flag for your BLAS installation (e.g. -lopenblas)
| 45 | +$ export LIBS="<flag here>"
| 46 | +$ make lib |
| 47 | +``` |
| 48 | +Afterwards, the shared library can be found in the `lib/` directory at the project root,
| 49 | +and it can be linked dynamically via `-lhtnorm`.
| 50 | + |
| 51 | +The public API exposes the samplers through the following function declarations:
| 52 | +- `int htn_hyperplane_truncated_mvn(rng_t* rng, const ht_config_t* conf, double* out)` |
| 53 | +- `int htn_structured_precision_mvn(rng_t* rng, const sp_config_t* conf, double* out)` |
| 54 | + |
| 55 | +The details of the parameters are documented in the header file ["htnorm.h"][4].
| 56 | + |
| 57 | +Random number generation is done using the [PCG64][2] or [Xoroshiro128plus][3] bit generators.
| 58 | +The API also allows the use of a custom generator; the details are documented in the header file
| 59 | +["rng.h"][5].
| 60 | + |
| 61 | +## Examples |
| 62 | +```C |
| 63 | +#include "htnorm.h" |
| 64 | + |
| 65 | +int main () |
| 66 | +{ |
| 67 | + ... |
| 68 | + |
| 69 | + // instantiate a random number generator |
| 70 | + rng_t* rng = rng_new_pcg64(); |
| 71 | + ht_config_t config; |
| 72 | + config.g = ...; // G matrix |
| 73 | + config.gnrow = ...; // number of rows of G |
| 74 | + config.gncol = ...; // number of columns of G |
| 75 | + config.r = ...; // r array
| 76 | + config.mean = ...; // mean array
| 77 | + config.cov = ...; // the covariance matrix
| 78 | + config.diag = ...; // whether covariance is diagonal
| 79 | + |
| 80 | + double* samples = ...; // array to store the samples |
| 81 | + // now call the sampler |
| 82 | + int res_info = htn_hyperplane_truncated_mvn(rng, &config, samples); |
| 83 | + |
| 84 | + // res_info contains a number that indicates whether sampling failed or not. |
| 85 | + |
| 86 | + ... |
| 87 | + |
| 88 | + // finally free the RNG pointer at some point |
| 89 | + rng_free(rng); |
| 90 | + |
| 91 | + ... |
| 92 | + return 0; |
| 93 | +} |
| 94 | +``` |
| 95 | + |
| 96 | +## Python API |
| 97 | + |
| 98 | +A high-level Python interface to the library is also provided. Installing it from
| 99 | +source requires an installation of [poetry][7] and the following shell commands:
| 100 | + |
| 101 | +```bash |
| 102 | +$ git clone https://github.com/zoj613/htnorm.git |
| 103 | +$ cd htnorm/ |
| 104 | +$ poetry install |
| 105 | +# add htnorm to python's path |
| 106 | +$ export PYTHONPATH=$PWD:$PYTHONPATH |
| 107 | +``` |
| 108 | +Below is an example of how to use htnorm in Python to sample from a multivariate
| 109 | +Gaussian truncated on the hyperplane `sum(x) = 0` (i.e. the sampled values are constrained to sum to zero):
| 110 | + |
| 111 | +```python |
| 112 | +from pyhtnorm import HTNGenerator |
| 113 | +import numpy as np |
| 114 | + |
| 115 | +rng = HTNGenerator() |
| 116 | + |
| 117 | +# generate example input |
| 118 | +k1 = 1000 |
| 119 | +k2 = 1 |
| 120 | +npy_rng = np.random.default_rng() |
| 121 | +temp = npy_rng.random((k1, k1))
| 122 | +cov = temp @ temp.T + np.diag(npy_rng.random(k1))
| 123 | +G = np.ones((k2, k1))
| 124 | +r = np.zeros(k2)
| 125 | +mean = npy_rng.random(k1) |
| 126 | + |
| 127 | +samples = rng.hyperplane_truncated_mvnorm(mean, cov, G, r) |
| 128 | +# verify if sampled values sum to zero |
| 129 | +print(sum(samples)) |
| 130 | + |
| 131 | +# alternatively one can pass an array to store the results in |
| 132 | +out = np.empty(k1) |
| 133 | +rng.hyperplane_truncated_mvnorm(mean, cov, G, r, out=out) |
| 134 | +# verify |
| 135 | +print(out.sum()) |
| 136 | +``` |
| 137 | + |
| 138 | +For more details about the parameters of `HTNGenerator` and its methods,
| 139 | +see the docstrings via Python's `help` function.
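To make the hyperplane constraint concrete, here is a minimal NumPy-only sketch of the sample-then-project idea: draw an unconstrained `y ~ N(mean, cov)` and shift it onto `{x : G x = r}` with a covariance-weighted correction. The function name `sample_hyperplane_truncated_mvn` is hypothetical and not part of `pyhtnorm`, and it is an assumption that the library's C code follows the same steps; the sketch is only meant as a slow reference for checking the output of the faster sampler on small problems.

```python
import numpy as np


def sample_hyperplane_truncated_mvn(mean, cov, G, r, rng):
    """Draw one sample from N(mean, cov) restricted to {x : G @ x = r}.

    Hypothetical reference helper, not part of pyhtnorm.
    """
    # Step 1: draw an unconstrained sample y ~ N(mean, cov).
    y = rng.multivariate_normal(mean, cov)
    # Step 2: shift y onto the hyperplane with the correction
    #   x = y + cov G^T (G cov G^T)^{-1} (r - G y).
    cov_gt = cov @ G.T                              # shape (k1, k2)
    alpha = np.linalg.solve(G @ cov_gt, r - G @ y)  # shape (k2,)
    return y + cov_gt @ alpha


rng = np.random.default_rng(12345)
k1, k2 = 50, 1
temp = rng.random((k1, k1))
cov = temp @ temp.T + np.diag(rng.random(k1))  # symmetric positive definite
G = np.ones((k2, k1))
r = np.zeros(k2)
mean = rng.random(k1)

x = sample_hyperplane_truncated_mvn(mean, cov, G, r, rng)
# Check that the constraint G @ x = r holds up to floating-point error.
print(np.allclose(G @ x, r))
```

The sketch solves a `k2 x k2` linear system instead of forming an explicit inverse, which stays cheap when the number of constraints is small relative to the dimension.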
| 140 | + |
| 141 | +The Python API also exposes the `HTNGenerator` class as a Cython extension type
| 142 | +that can be "cimported" in a Cython script.
| 143 | + |
| 144 | + |
| 145 | +## TODO |
| 146 | + |
| 147 | +- Add an `R` interface to the library. |
| 148 | + |
| 149 | + |
| 150 | +## Licensing |
| 151 | + |
| 152 | +`htnorm` is free software made available under the BSD-3 License. For details |
| 153 | +see the [LICENSE][6] file. |
| 154 | + |
| 155 | + |
| 156 | +## References |
| 157 | +- Cong, Yulai; Chen, Bo; Zhou, Mingyuan. Fast Simulation of Hyperplane-Truncated |
| 158 | + Multivariate Normal Distributions. Bayesian Anal. 12 (2017), no. 4, 1017--1037. |
| 159 | + doi:10.1214/17-BA1052. |
| 160 | +- Bhattacharya, A., Chakraborty, A., and Mallick, B. K. (2016). |
| 161 | + “Fast sampling with Gaussian scale mixture priors in high-dimensional regression.” |
| 162 | + Biometrika, 103(4):985. |
| 163 | + |
| 164 | + |
| 165 | +[1]: https://projecteuclid.org/euclid.ba/1488337478 |
| 166 | +[2]: https://www.pcg-random.org/ |
| 167 | +[3]: https://en.wikipedia.org/wiki/Xoroshiro128%2B |
| 168 | +[4]: https://github.com/zoj613/htnorm/blob/main/include/htnorm.h |
| 169 | +[5]: https://github.com/zoj613/htnorm/blob/main/include/rng.h |
| 170 | +[6]: https://github.com/zoj613/htnorm/blob/main/LICENSE |
| 171 | +[7]: https://python-poetry.org/docs/pyproject/ |
| 172 | +[8]: https://www.sciencedirect.com/science/article/abs/pii/S1877584517301600 |