Make Metropolis cope better with multiple dimensions #5823
Conversation
Force-pushed from a7ab0e4 to cec9b5c
Codecov Report
@@            Coverage Diff             @@
##             main    #5823      +/-   ##
==========================================
- Coverage   89.40%   89.38%   -0.02%
==========================================
  Files          74       74
  Lines       13769    13783      +14
==========================================
+ Hits        12310    12320      +10
- Misses       1459     1463       +4
Force-pushed from cec9b5c to 616cd90
Force-pushed from 616cd90 to 53fa596
# This used to be a strict zero equality!
assert np.isclose(Q_1_0.mean(axis=1), 0.0, atol=1e-4)
assert np.isclose(Q_2_1.mean(axis=1), 0.0, atol=1e-4)
I don't know who is maintaining the MLDA step method, but I had to change this strict equality... Is that okay?
Nobody is actively maintaining it. Also, didn't we want to move it to pymc-experimental?
I think it is still maintained, as it was ported to V4 by the original team.
pymc/step_methods/metropolis.py (Outdated)
return scale * 1.1

return scale
return np.exp(np.log(scale) + (acc_rate - 0.234))
@aloctavodia Is this correct?
Yes, also the initial guess for the scale should be something like this:

Lines 461 to 462 in 5703a9d
ndim = self.tempered_posterior.shape[1]
self.proposal_scales = np.full(self.draws, min(1, 2.38**2 / ndim))
We can make the scaling changes in a different PR.
Yeah, let's do it in a separate PR, also because of the MLDA complaining here
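For context, a minimal sketch of the two tuning rules under discussion. The band thresholds in `tune_switch` follow the existing `tune` helper in `pymc/step_methods/metropolis.py`; the function names and the `ndim = 5` example are illustrative only:

```python
import numpy as np

def tune_switch(scale, acc_rate):
    # Band-based rule: nudge the proposal scale up or down depending on
    # which acceptance-rate band the last tuning window landed in.
    if acc_rate < 0.001:
        return scale * 0.1   # reduce drastically
    elif acc_rate < 0.05:
        return scale * 0.5
    elif acc_rate < 0.2:
        return scale * 0.9
    elif acc_rate > 0.95:
        return scale * 10.0  # increase drastically
    elif acc_rate > 0.75:
        return scale * 2.0
    elif acc_rate > 0.5:
        return scale * 1.1
    return scale

def tune_smc_like(scale, acc_rate):
    # SMC-style rule from the diff above: move log(scale) proportionally
    # to the distance from the 0.234 target acceptance rate.
    return np.exp(np.log(scale) + (acc_rate - 0.234))

# SMC-like initial guess, following the quoted SMC lines:
ndim = 5  # example dimensionality
initial_scale = min(1.0, 2.38**2 / ndim)
```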
Force-pushed from 53fa596 to 0d8d49b
I am going to temporarily revert the last commit to see if the tests pass. The Metropolis related tests seem to be performing a bit worse with the proposed alternative tune rule (or it could very well just be a seeding thing).
Dirichlet.dist(a=np.ones(3) * (np.arange(20) + 1)[:, None], shape=(2, 20, 3)),
    ),
)
def test_elemwise_update(self, batched_dist):
I ran these new tests 10 times each, with the current new algorithm and the one proposed by @aloctavodia that we use in SMC. Things look sensible in both cases:
10 runs each with different seeds
Tuning method based on switch statement:
1. (0D Bin) max_rhat(mean=1.005, std=0.005), min_ess=(mean=327.4, std=29.9)
2. (1D Bin) max_rhat(mean=1.031, std=0.007), min_ess=(mean=115.5, std=20.9)
3. (2D Bin) max_rhat(mean=1.021, std=0.005), min_ess=(mean=180.0, std=22.5)
4. (2D Dir) max_rhat(mean=1.041, std=0.010), min_ess=(mean=91.2, std=24.7)
5. (3D Dir) max_rhat(mean=1.037, std=0.008), min_ess=(mean=110.4, std=24.4)
Tuning method based on distance from 0.234:
1. (0D Bin) max_rhat(mean=1.009, std=0.008), min_ess=(mean=231.3, std=45.6) # Worse
2. (1D Bin) max_rhat(mean=1.026, std=0.008), min_ess=(mean=206.9, std=21.5) # Better
3. (2D Bin) max_rhat(mean=1.026, std=0.005), min_ess=(mean=188.7, std=22.3)
4. (2D Dir) max_rhat(mean=1.043, std=0.012), min_ess=(mean=98.9, std=27.8)
5. (3D Dir) max_rhat(mean=1.039, std=0.009), min_ess=(mean=102.6, std=21.5)
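For reference, summaries like these can be computed with `arviz`; the `run_case` helper below is hypothetical and stands in for sampling one test case with a given seed:

```python
import arviz as az
import numpy as np

rhats, esss = [], []
for seed in range(10):
    idata = run_case(seed)  # hypothetical: returns an InferenceData for one test case
    rhats.append(float(az.rhat(idata).max().to_array().max()))
    esss.append(float(az.ess(idata).min().to_array().min()))
print(
    f"max_rhat(mean={np.mean(rhats):.3f}, std={np.std(rhats):.3f}), "
    f"min_ess=(mean={np.mean(esss):.1f}, std={np.std(esss):.1f})"
)
```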
But the MLDA complains a lot about the SMC-like tuning... especially this test, which fails even if I increase the number of draws: https://github.com/ricardoV94/pymc/blob/3a84db6b91035a7c7f68633bdbb18c5f11efd46f/pymc/tests/test_step.py#L1184
Maybe I am just misunderstanding what that test is supposed to check.
Looking at the standard deviations I'm not that impressed, tbh.
(Considering that the code is now more complicated...)
Or do you have another application that motivated this?
@michaelosthege this is not the change from now vs. before the PR; it's the change between this PR and an alternative procedure that @aloctavodia suggested during this PR.
Before this PR there were zero accepted transitions in all the new tests (except the first scalar case)
I updated the headings to make it obvious
Metropolis now updates each dimension sequentially and tunes a proposal scale parameter per dimension
Force-pushed from f27dc98 to f006eda
@aloctavodia I removed the new tune strategy. I'll open a new PR for that. Do you care to review again?
# is not safe for discrete multivariate distributions (looking at you Multinomial),
# due to high dependency among the support dimensions. For continuous multivariate
# distributions we assume they are being transformed in a way that makes each
# dimension semi-independent.
Suggested change:
# dimension semi-independent.
# Consequently we'd like to tune the scaling based on
# acceptance rate in each dimension independently.
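To see why the comment above singles out discrete multivariate distributions: changing a single entry of a Multinomial draw violates the sum-to-n constraint, so every intermediate state gets a -inf logp. A small sketch with made-up values, assuming the v4 `pm.logp` helper:

```python
import pymc as pm

# Hypothetical values: a Multinomial with n=10 trials over 3 categories.
d = pm.Multinomial.dist(n=10, p=[0.3, 0.3, 0.4])
print(pm.logp(d, [3, 3, 4]).eval())  # finite: counts sum to n=10
print(pm.logp(d, [5, 3, 4]).eval())  # -inf: counts sum to 12, constraint violated
```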
# remember initial settings before tuning so they can be reset
self._untuned_settings = dict(
    scaling=self.scaling, steps_until_tune=tune_interval, accepted=self.accepted
# Metropolis will try to handle one batched dimension at a time This, however,
Suggested change:
- # Metropolis will try to handle one batched dimension at a time This, however,
+ # Metropolis will try to handle one batched dimension at a time. This, however,
if self.elemwise_update:
    dims = int(sum(np.prod(ivs) for ivs in initial_values_shape))
- Should this be an internal attribute?
- Could you come up with a more informative name?

Right now (reading the diff top to bottom) I'm confused because this smells like CompoundStep, but it's only about elementwise tuning, right?
See my comment below. CompoundStep has nothing to do with this.
for i in self.enum_dims:
    q_temp[i] = q[i]
    accept_rate_i = self.delta_logp(q_temp, q0)
    q_temp_, accepted_i = metrop_select(accept_rate_i, q_temp, q0)
    q_temp[i] = q_temp_[i]
    self.accept_rate_iter[i] = accept_rate_i
    self.accepted_iter[i] = accepted_i
    self.accepted_sum[i] += accepted_i
q = q_temp
Is this doing what we usually do with CompoundStep? If not, maybe explain what the if/else blocks do in a code comment.
CompoundStep assigns one variable per step; here we are sampling one dimension within a variable (or within multiple variables) semi-independently.
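For intuition, a stripped-down NumPy sketch of this kind of per-dimension update, independent of the actual pymc internals:

```python
import numpy as np

rng = np.random.default_rng(123)

def elemwise_metropolis_sweep(q0, logp, scales):
    """One sweep that proposes and accepts/rejects each dimension in turn,
    so a single badly scaled dimension cannot veto the whole proposal."""
    q = q0.copy()
    accepted = np.zeros(q.size, dtype=bool)
    for i in range(q.size):
        q_prop = q.copy()
        q_prop[i] += rng.normal() * scales[i]  # perturb only dimension i
        if np.log(rng.uniform()) < logp(q_prop) - logp(q):
            q = q_prop
            accepted[i] = True
    return q, accepted

# Example usage: 10-dimensional standard normal target
logp = lambda x: -0.5 * np.sum(x**2)
q, accepted = elemwise_metropolis_sweep(np.zeros(10), logp, np.full(10, 2.38))
```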
Ugh, can't we do this in a for iteration with a break?
We can probably index; that should be faster for large arrays because all branches of np.where are evaluated by default.
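A sketch of what indexing-based tuning could look like (illustrative, not the merged code; the thresholds mirror the switch rule sketched earlier):

```python
import numpy as np

def tune_vectorized(scaling, acc_rate):
    # Each boolean mask selects the dimensions in one acceptance-rate band,
    # so unlike nested np.where we never materialize every branch over the
    # full array.
    scaling = scaling.copy()
    scaling[acc_rate < 0.001] *= 0.1
    scaling[(0.001 <= acc_rate) & (acc_rate < 0.05)] *= 0.5
    scaling[(0.05 <= acc_rate) & (acc_rate < 0.2)] *= 0.9
    scaling[acc_rate > 0.95] *= 10.0
    scaling[(0.75 < acc_rate) & (acc_rate <= 0.95)] *= 2.0
    scaling[(0.5 < acc_rate) & (acc_rate <= 0.75)] *= 1.1
    return scaling
```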
LGTM!
Metropolis now updates each dimension sequentially and tunes a proposal scale parameter per dimension
The performance of the Metropolis sampler with multiple dimensions was incredibly poor, as highlighted in the old closed issue #4223. After these changes, performance should be similar between the scalar and batched cases.
Some care had to be taken with multivariate discrete distributions, where we cannot safely update one dimension at a time (e.g., Multinomial), because any intermediate change would result in a -inf logp. We should probably implement a smarter sampler just for this distribution, but for the time being we avoid elemwise updates when multivariate discrete variables are assigned.
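As a rough illustration of the kind of model that motivated the change (shapes and sampler settings here are hypothetical, not taken from the PR's test suite):

```python
import pymc as pm

# A batched model that previously mixed very poorly under Metropolis
# (compare issue #4223): 20 Binomial dimensions updated by a single step.
with pm.Model():
    x = pm.Binomial("x", n=100, p=0.3, shape=(20,))
    idata = pm.sample(step=pm.Metropolis(), draws=2000, tune=2000, chains=4)

print(idata.posterior["x"].mean(("chain", "draw")))
```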