Change SMC metropolis kernel to independent metropolis kernel #4115
Conversation
pymc3/smc/smc.py
Outdated
@@ -341,6 +337,7 @@ def __init__(
        self.observations = self.sum_stat(observations)

    def posterior_to_function(self, posterior):
        model = self.model
Sorry, why is this added here if it's unused?
Good catch! This is a mistake, this is part of other ideas I was trying that did not make it into the PR.
@aloctavodia while we're here, just wanted to say that I'm studying your book and that it's really good!
Glad you like it, @MarcoGorelli. Hopefully, next year we will publish a new one together with Junpeng, Ravin, and Ari.
Awesome @aloctavodia - feel free to reach out to me if you want any help proof-reading
Codecov Report
@@ Coverage Diff @@
## master #4115 +/- ##
=======================================
Coverage 88.74% 88.75%
=======================================
Files 89 89
Lines 14040 14037 -3
=======================================
- Hits 12460 12458 -2
+ Misses 1580 1579 -1
pymc3/smc/smc.py
Outdated
@@ -181,7 +182,6 @@ def resample(self):
        self.likelihood_logp = self.likelihood_logp[resampling_indexes]
        self.posterior_logp = self.prior_logp + self.likelihood_logp * self.beta
        self.acc_per_chain = self.acc_per_chain[resampling_indexes]
remove.
pymc3/smc/smc.py
Outdated
@@ -64,8 +65,6 @@ def __init__(
        self.acc_rate = 1
        self.acc_per_chain = np.ones(self.draws)
remove
pymc3/smc/smc.py
Outdated
        dist = multivariate_normal(self.posterior.mean(axis=0), self.cov)

        for n_step in range(self.n_steps):
            proposal = floatX(self.posterior + proposals[n_step])
            proposal = floatX(dist.rvs(size=self.draws))
            proposal = proposal.reshape(len(proposal), -1)
            forward = dist.logpdf(proposal)
            backward = multivariate_normal(proposal.mean(axis=0), self.cov).logpdf(self.posterior)
I want to understand a bit better what the changes are doing here, so my summary is that:

1. It removes the tuning related to `self.scalings`, which originally tuned a per-chain scaling of the proposal so that each chain approaches a target acceptance rate of 0.234.
2. The new transition kernel uses global information from the current posterior, `mu = self.posterior.mean(axis=0)`, to build a MvNormal proposal, and it accounts for the transition probabilities q(x' | x) and q(x | x') in the acceptance ratio (notation from Wikipedia). For that, the "backward" term uses a new MvNormal with `mu = mean(x_prime)`.

It is independent MH, since the proposal q(x') does not depend on the current state x.
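As a sanity check of my own understanding, here is a minimal standalone sketch of one such mutation sweep (NumPy/SciPy only, not the PR code; `log_post`, the function name, and the acceptance bookkeeping are made-up stand-ins):

```python
import numpy as np
from scipy.stats import multivariate_normal

def independent_mh_sweep(posterior, log_post, cov):
    """One mutation sweep over all chains with a proposal fitted to the particle cloud.

    posterior: (draws, dim) array of particles; log_post: callable returning the
    (tempered) log posterior for each row. Both stand in for the PR's attributes.
    """
    draws = len(posterior)
    dist = multivariate_normal(posterior.mean(axis=0), cov)   # q(x'), independent of the current x
    proposal = dist.rvs(size=draws).reshape(draws, -1)        # propose for every chain at once
    forward = dist.logpdf(proposal)                           # log q(x')
    # "backward" term: a new MvNormal centred on mean(x_prime), as in the diff above
    backward = multivariate_normal(proposal.mean(axis=0), cov).logpdf(posterior)
    log_ratio = log_post(proposal) - log_post(posterior) + backward - forward
    accepted = np.log(np.random.uniform(size=draws)) < log_ratio
    posterior[accepted] = proposal[accepted]
    return posterior, accepted.mean()
```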
Right!
Cool - could you add some of these as inline comments?
Any reason why `dist = multivariate_normal(self.posterior.mean(axis=0), self.cov)` is not in the `for n_step ...` loop?
Intuitively, I thought of fixing the proposal before doing the productive run. It will also make it easier to use more complex proposal distributions in the future, as you compute the mean(s) and covariance(s) once per stage.
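Roughly like this (illustrative sketch only; `run_stage` and `step_fn` are placeholder names, not the PR's API):

```python
from scipy.stats import multivariate_normal

def run_stage(posterior, cov, n_steps, step_fn):
    # Fit the proposal to the particle cloud once, at the start of the stage...
    dist = multivariate_normal(posterior.mean(axis=0), cov)
    # ...then reuse it for every Metropolis sweep within the stage.
    for _ in range(n_steps):
        posterior = step_fn(posterior, dist)
    return posterior
```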
Please add some comments on the changes and also add a line to the release notes.
LGTM overall, although it is not obvious to me why this will work better.
I think this is just offering a generally better proposal distribution, as it better learns the geometry of the posterior. With the previous proposal, every point close to the borders will generate a lot of rejections (and will push the scaling to lower values), and I guess that as you move to higher dimensions the chance of being at a border increases. I still think a better proposal could be learned from the previous stage (suggestions are welcome!); I am not sure at this point which option is best, but a mixture of Gaussians could help, maybe even one with just a few components. I was mostly checking three things: the trace (with kind="rank_vlines"), the ESS, and R-hat. For example, with the previous sampler you start to see some problems for dimensions larger than ~15, whereas now it still works up to dimension ~60. I additionally checked the time, and it is a little bit slower, but not that much considering you now get reasonable answers :-)
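For reference, those checks can be reproduced roughly like this (hedged example; the toy model, dimension, and draw count are placeholders, not the benchmark used here):

```python
import arviz as az
import pymc3 as pm

with pm.Model() as model:
    pm.Normal("x", mu=0.0, sigma=1.0, shape=60)   # a moderately high-dimensional toy target
    trace = pm.sample_smc(2000)
    idata = az.from_pymc3(trace)

az.plot_trace(idata, kind="rank_vlines")          # rank plots to spot mixing problems
print(az.ess(idata))                              # effective sample size
print(az.rhat(idata))                             # R-hat convergence diagnostic
```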
Wrote an implementation in TFP https://colab.research.google.com/gist/junpenglao/637647b975e3616f5bd362a07859d1d8/smc-demo.ipynb; it seems better in general (except for a mixture distribution with a mode far away, which also makes sense).
@@ -35,7 +35,7 @@ def sample_smc(
     n_steps=25,
     start=None,
     tune_steps=True,
-    p_acc_rate=0.99,
+    p_acc_rate=0.85,
Just realized my simulation still uses .99 and it seems better than .85 - how confident are you about this, @aloctavodia?
Not sure at all. This is something to test and most likely change. Eyeballing it, 0.9 could be a good compromise. But for you, chains/steps are cheaper, right? So maybe you can keep 0.99.
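For example, if steps are cheap on your side you can keep the old target by passing it explicitly (toy model, not from this PR):

```python
import pymc3 as pm

with pm.Model():
    pm.Normal("x", mu=0.0, sigma=1.0)
    trace = pm.sample_smc(1000, p_acc_rate=0.99)  # override the new 0.85 default
```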
This seems to provide equal or better sampling for all the models I ran tests on (including ABC). The main advantages are for models with more than ~20 dimensions and for hierarchical models.