DEMetropolis: tune lambda instead of epsilon #3720

Closed
michaelosthege opened this issue Dec 10, 2019 · 6 comments · Fixed by #3743

Comments

@michaelosthege
Member

Our DEMetropolis tunes the scaling factor of the noise distribution:
https://github.com/pymc-devs/pymc3/blob/1c30a6f487afaeef73464a98320e35961b11873f/pymc3/step_methods/metropolis.py#L572-L581

My feeling these days is that tuning the noise distribution is a bit pointless after the first few iterations & could obscure the warmup, or even lead to slingshots if it overshoots.

Instead, we could tune the lambda parameter. Its optimal value depends on the (dimensionality of the) target density (ter Braak (2006)), so it should be a good candidate for tuning.
This approach is described in Nelson et al. (2013), section 4.1.2.
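
For readers unfamiliar with the two knobs, here is a minimal pure-Python sketch of a differential-evolution proposal. It is not the PyMC3 implementation: the function names, the uniform noise distribution, and the example values are assumptions made for illustration. `lamb` scales the difference between two other chains, and `scaling` (the epsilon above) scales the added noise.

```python
import math
import random


def default_lambda(ndim):
    """Rule-of-thumb jump factor from ter Braak (2006): 2.38 / sqrt(2 * ndim)."""
    return 2.38 / math.sqrt(2 * ndim)


def de_proposal(population, i, lamb, scaling, rng):
    """Propose a new point for chain i from the current population.

    proposal = x_i + lamb * (x_r1 - x_r2) + noise
    r1 and r2 are two distinct chains other than i. The noise is uniform
    in [-scaling, scaling] per dimension (an assumed noise form).
    """
    others = [j for j in range(len(population)) if j != i]
    r1, r2 = rng.sample(others, 2)
    x_i, x_r1, x_r2 = population[i], population[r1], population[r2]
    return [
        xi + lamb * (a - b) + rng.uniform(-scaling, scaling)
        for xi, a, b in zip(x_i, x_r1, x_r2)
    ]


rng = random.Random(0)
ndim = 50
# One more chain than dimensions, each started from a standard normal draw.
population = [[rng.gauss(0.0, 1.0) for _ in range(ndim)] for _ in range(ndim + 1)]
lamb = default_lambda(ndim)
proposal = de_proposal(population, 0, lamb, scaling=0.001, rng=rng)
print(round(lamb, 3))  # 0.238
```

Tuning epsilon only changes the small noise term, while tuning lambda changes how far each jump goes along the difference vector.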

michaelosthege added a commit to michaelosthege/pymc that referenced this issue Dec 17, 2019
@michaelosthege
Member Author

michaelosthege commented Dec 17, 2019

On my test problem (50-dim MvNormal) the tuning converges to the same rule of thumb that was used before (2.38 / sqrt(2*ndim)).
Here is the progression of the tuned parameter (3000 tuning iterations):
image

I also see no significant difference in the effective sample size...

I'm thinking of ditching tuning of scaling/lambda altogether. What do you think?
cc @junpenglao

@junpenglao
Member

Could you try on an ODE example?

@michaelosthege
Member Author

> Could you try on an ODE example?

I tried with Demetris benchmark example, but it was very slow & inefficient (--> noisy) while having just 2 dimensions.

@junpenglao
Member

Do you mean slower than no tuning?

@michaelosthege
Member Author

> Do you mean slower than no tuning?

No, the sampling was just slow/inefficient because it's an ODE. Also, DifferentialEquation computes sensitivities that DEMetropolis doesn't use.
So I'd have to wait very long for the benchmark results to give a significant answer.

I could also implement a kwarg like DEMetropolis(tune_par=x) with x in {None, 'epsilon', 'lambda'} where None is the default because tuning epsilon is a bit pointless & tuning lambda is not necessarily better than lambda = 2.38 / sqrt(2*ndim).
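
As a rough sketch of what such a switch could look like (the class `TunableDE`, the helper `rescale`, and the acceptance-rate thresholds are all hypothetical and not taken from PyMC3):

```python
VALID_TUNE = (None, "epsilon", "lambda")


def rescale(value, acc_rate):
    """Shrink the value when few proposals are accepted, grow it when many are.

    The thresholds are illustrative only.
    """
    if acc_rate < 0.2:
        return value * 0.9
    if acc_rate > 0.5:
        return value * 1.1
    return value


class TunableDE:
    """Hypothetical holder for the two hyperparameters of a DE sampler."""

    def __init__(self, ndim, tune_par=None, epsilon=0.001):
        if tune_par not in VALID_TUNE:
            raise ValueError(f"tune_par must be one of {VALID_TUNE}, got {tune_par!r}")
        self.tune_par = tune_par
        self.epsilon = epsilon
        # Start lambda at the rule-of-thumb value.
        self.lamb = 2.38 / (2 * ndim) ** 0.5

    def update(self, acc_rate):
        """Apply one tuning step to whichever parameter was selected, if any."""
        if self.tune_par == "epsilon":
            self.epsilon = rescale(self.epsilon, acc_rate)
        elif self.tune_par == "lambda":
            self.lamb = rescale(self.lamb, acc_rate)


sampler = TunableDE(ndim=50, tune_par="lambda")
sampler.update(acc_rate=0.1)  # low acceptance, so lambda shrinks by 10%
```

With `tune_par=None`, `update` leaves both parameters untouched, which matches the proposed default of no tuning.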

Maybe the result of my testing is simply that DEMetropolis doesn't need hyperparameter tuning. (Needs warmup/burnin though.)

@junpenglao
Member

I see. Thanks! Feel free to close.

michaelosthege added a commit to michaelosthege/pymc that referenced this issue Dec 19, 2019
+ tune argument now one of None,scaling,lambda
+ support for tuning lambda (closes pymc-devs#3720)
+ added test to check checking of tune setting
+ both scaling and lambda are recorded in the sampler stats