Improve tuning by skipping the first samples + add new experimental tuning method #5004
Conversation
@@ -342,6 +360,8 @@ def __init__(

    def add_sample(self, x, weight):
        x = np.asarray(x)
        if weight != 1:
            raise ValueError("weight is unused and broken")
raise ValueError("weight is unused and broken") | |
raise ValueError("Setting weight != 1 is not supported.") |
Or maybe we should just remove it altogether.
done
pymc3/tests/test_sampling.py (outdated)
@@ -87,7 +87,7 @@ def test_sample(self):

    def test_sample_init(self):
        with self.model:
-           for init in ("advi", "advi_map", "map"):
+           for init in ("advi", "advi_map", "map", "jitter+adapt_diag_grad"):
Should we add all the others here too?
done
Needs a line in the release-notes.
I noticed something similar when debugging.
    def add_sample(self, x, weight):
        x = np.asarray(x)
        if weight != 1:
            raise ValueError("Setting weight != 1 is not supported.")
Suggested change:
-    def add_sample(self, x, weight):
-        x = np.asarray(x)
-        if weight != 1:
-            raise ValueError("Setting weight != 1 is not supported.")
+    def add_sample(self, x, weight=None):
+        if weight is not None:
+            warnings.warn(
+                "Setting weight is no longer supported and will raise an error in the future.",
+                DeprecationWarning,
+            )
+        x = np.asarray(x)
I think a hard break is fine here. This really was internal, unused and wrong
Then I would suggest removing the weight argument altogether
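For illustration only, removing the argument would reduce the method to something like the sketch below. The class name and docstring are hypothetical stand-ins, not the code that ended up in the PR; the elided body is whatever running-statistics update the real estimator performs.

```python
import numpy as np


class VarianceEstimatorSketch:
    """Hypothetical stand-in for the estimator class being discussed."""

    def add_sample(self, x):
        # No weight argument anymore; every draw counts equally.
        x = np.asarray(x)
        # ... update the running mean/variance with x, as before ...
```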
Left a couple of comments
Thanks @aseyboldt
In the current implementation, tuning of the mass matrix starts right away, but in most models the first couple of samples are only moving towards the typical set, so they carry no information about the posterior variance at all. In the worst case we learn a mass matrix that doesn't match the posterior, so that sampling during the first adaptation window is very slow (visible as a slowdown of sampling after step 100). Usually we recover from this, but it seems better to just skip those early samples during adaptation in the first place.
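To make the idea concrete, here is a minimal sketch, not the actual PyMC3 code: the class name, the `discard_window` parameter, and the Welford-style update are illustrative assumptions, but the mechanism (ignore the first draws of tuning before adapting) is the one described above.

```python
import numpy as np


class SkippingRunningVariance:
    """Running variance estimate that ignores the first few tuning draws."""

    def __init__(self, ndim, discard_window=50):
        self.discard_window = discard_window  # draws to ignore at the start
        self.n = 0          # number of draws actually used for adaptation
        self.seen = 0       # total draws offered, including skipped ones
        self.mean = np.zeros(ndim)
        self.m2 = np.zeros(ndim)

    def add_sample(self, x):
        x = np.asarray(x)
        self.seen += 1
        if self.seen <= self.discard_window:
            return  # still moving towards the typical set, don't adapt yet
        # Standard Welford update on the remaining draws.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def current_variance(self):
        if self.n < 2:
            return np.ones_like(self.mean)  # fall back to an identity mass matrix
        return self.m2 / (self.n - 1)
```

With something like this in place, the mass matrix produced by the first adaptation window is based only on draws taken after the chain has (hopefully) reached the typical set.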
In an example model by @ricardoV94 we can see this behavior clearly when we look at the distance between the currently used mass matrix and the final mass matrix (plot not reproduced here).
This PR also contains an experimental tuning implementation that uses both gradients and samples; it can be enabled with init="jitter+adapt_diag_grad". In tests on a few models this seems to be more stable than the purely sample-based tuning we use right now, but there are also a few cases where it performs worse. For normal posteriors it should converge to the same mass matrix as our current implementation (and much faster), but for non-normal posteriors the result can differ. Unfortunately I don't know of any way to tell which is better other than trying it on a large number of models. An example notebook can be found here:
https://gist.github.com/aseyboldt/7897fbddacacaa0c86efc917afe9ce3f
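For completeness, a hedged usage sketch of the new option: the toy model below is arbitrary and only the `init` string comes from this PR.

```python
import numpy as np
import pymc3 as pm

# Any model works the same way; this is just a placeholder.
data = np.random.randn(100)

with pm.Model():
    mu = pm.Normal("mu", 0.0, 10.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("obs", mu, sigma, observed=data)

    # The default initialization is unchanged; the experimental
    # gradient-based tuning has to be requested explicitly.
    trace = pm.sample(1000, tune=1000, init="jitter+adapt_diag_grad")
```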