Laplace approximation sampler #3146
This is not a duplicate of #2522, which is about specific functional forms. This one is black-box.
The relevant RStan code we will follow is:

```r
if (draws > 0 && is.matrix(R <- try(chol(-H)))) {
  K <- ncol(R)
  R_inv <- backsolve(R, diag(K))
  Z <- matrix(rnorm(K * draws), K, draws)
  theta_tilde <- t(theta + R_inv %*% Z)
}
```

Converting this to mathematical notation, first suppose that the mode is $\hat{\theta}$ and the Hessian there is $H$, so that the Cholesky factorization satisfies $R^{\top} R = -H$. Then we can generate each draw $m$ by taking $z^{(m)} \sim \textrm{normal}(0, \textrm{I})$ and defining $\theta^{(m)} = \hat{\theta} + R^{-1} z^{(m)}$, which has covariance $R^{-1} R^{-\top} = (-H)^{-1}$. We then just need to inverse transform each $\theta^{(m)}$ back to the constrained scale.
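The sampling step can be sketched in NumPy (a hypothetical translation of the RStan snippet above, not the actual service code; the mode `theta_hat`, Hessian `H`, and function name are illustrative):

```python
import numpy as np

def laplace_draws(theta_hat, H, num_draws, seed=0):
    """Draw from the Laplace approximation normal(theta_hat, (-H)^{-1}).

    theta_hat: mode of the log density on the unconstrained scale
    H: Hessian of the log density at the mode (must be negative definite)
    """
    rng = np.random.default_rng(seed)
    K = len(theta_hat)
    # Upper-triangular R with R' R = -H, mirroring R's chol(-H)
    R = np.linalg.cholesky(-H).T
    # Mirrors backsolve(R, diag(K)): R_inv = R^{-1}
    R_inv = np.linalg.solve(R, np.eye(K))
    # One column of standard normals per draw, mirroring rnorm(K * draws)
    Z = rng.standard_normal((K, num_draws))
    # Each column theta_hat + R^{-1} z has covariance R^{-1} R^{-T} = (-H)^{-1}
    return (theta_hat[:, None] + R_inv @ Z).T

# Example: 2-D quadratic log density with mode at (1, -2)
H = -np.array([[2.0, 0.3], [0.3, 1.0]])
draws = laplace_draws(np.array([1.0, -2.0]), H, num_draws=20000)
```

Each row of `draws` is one (unconstrained) posterior draw; the constraining transform would still need to be applied to each row.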
I missed this earlier.
Currently, RStan is doing the wrong thing, namely not using the Jacobian adjustment. See the illustration of the RStan approach versus the correct Laplace approximation in https://avehtari.github.io/casestudies/Jacobian/jacobian.html

Optimization is not using the Jacobian, and thus the estimate computed by optimization is not useful (see the notebook linked above). We want the mode of the Jacobian-adjusted target in the unconstrained space. Furthermore, it would be useful to have the option to get the computed Hessian instead of samples.
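As a toy illustration of why the adjustment matters (my own sketch, not taken from the linked notebook): for a positive-constrained parameter with a Gamma(3, 2) target, the constrained-space mode and the mode of the Jacobian-adjusted target on the log scale are different points.

```python
import numpy as np

# Toy target: Gamma(a, b) log density for theta > 0 (up to a constant)
a, b = 3.0, 2.0
log_p = lambda th: (a - 1) * np.log(th) - b * th

# Grid search stands in for the optimizer
theta = np.linspace(1e-3, 5.0, 500001)
constrained_mode = theta[np.argmax(log_p(theta))]           # -> (a - 1) / b = 1.0

# Unconstrained parameterization u = log(theta); the Jacobian adjustment
# adds log|d theta / d u| = u. The adjusted mode maps back to a / b.
u = np.linspace(-6.0, 2.0, 500001)
adjusted_mode = np.exp(u[np.argmax(log_p(np.exp(u)) + u)])  # -> a / b = 1.5
```

A Laplace approximation to the posterior should be centered at the adjusted mode in the unconstrained space, not at the constrained-space mode.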
Thanks, @avehtari. You're in luck: I just made a PR to add the Jacobian-adjusted optimizer, so we can get proper Laplace approximations over the Bayesian posterior. This should all be in the next release of CmdStan and hopefully CmdStanPy and CmdStanR. We will then be able to get the MLE plus standard-error approximation without the Jacobian, and the MAP plus approximate posterior in the Bayesian case.
Summary:

Provide a service function for sampling from a Laplace approximation. This will allow `cmdstanr` to reproduce the Laplace approximation functionality of `RStan`.

Description:

The following signature should work. It requires an estimate `theta` of the mode (for example, as computed by optimization). The computation will follow RStan. The Hessian will be computed using central finite differences of the model functor. It cannot use autodiff, because not all of our functions have second-order derivatives.
- central finite differences Hessian functional: `stan::math::finite_diff_hessian_auto` in `stan/math/rev/functor/finite_diff_hessian_auto.hpp`
- model functor: `stan::model::model_functional` in `stan/model/model_functional.hpp`
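For intuition, the central-finite-differences-of-gradients idea behind `finite_diff_hessian_auto` can be sketched in NumPy (a simplified stand-in; in Stan the gradients come from reverse-mode autodiff, and the step size and function name here are illustrative):

```python
import numpy as np

def finite_diff_hessian(grad, x, h=1e-5):
    """Approximate the Hessian of f at x from its gradient function,
    using central finite differences: 2N gradient evaluations in N dims."""
    N = len(x)
    H = np.empty((N, N))
    for i in range(N):
        e = np.zeros(N)
        e[i] = h
        # Column i: (grad(x + h e_i) - grad(x - h e_i)) / (2h)
        H[:, i] = (grad(x + e) - grad(x - e)) / (2 * h)
    # Symmetrize to remove finite-difference asymmetry
    return 0.5 * (H + H.T)

# Example: f(x) = -0.5 x' A x has gradient -A x and exact Hessian -A
A = np.array([[2.0, 0.3], [0.3, 1.0]])
H = finite_diff_hessian(lambda x: -A @ x, np.array([0.5, -1.0]))
```

The $2N$ gradient evaluations here are the source of the $\mathcal{O}(N^2)$-at-best Hessian cost discussed below.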
The algorithm has a high constant overhead, requiring a one-time cost of $\mathcal{O}(N^3)$ in $N$ dimensions to compute the Cholesky factor of the negative inverse Hessian. It also requires $2N$ gradient calculations as the basis of the finite differences, which scales at best as $\mathcal{O}(N^2)$ and is worse for models whose gradient calculation is super-linear in dimension.
The algorithm also has a high per-draw overhead, requiring $N$ standard normal pseudorandom numbers and $\mathcal{O}(N^2)$ operations per draw (to multiply by the Cholesky factor).
For $M$ draws, the total cost is proportional to $\mathcal{O}(N^3 + M \cdot N^2)$. It generates output of size $\mathcal{O}(M \cdot N)$.
Current Version:
v2.30.0