Diffusive Gibbs Sampler #744
Can you explain how this algorithm is different from the auxiliary perspective of MALA in https://rss.onlinelibrary.wiley.com/doi/full/10.1111/rssb.12269? Edit: Gaussian convolution and then Langevin "denoising" is exactly aMALA in my books, so where's the difference?
Maybe given your comment I should underline that I have no personal interest in the paper. I was not familiar with the reference you provided, but the method does look similar. At first glance aMALA doesn't seem to have the multilevel noise schedule of Diffusive Gibbs, and the initialization of the denoising step (eq. 14 of DiGS) seems to be different. Are the contraction idea and the corresponding variance of eq. 10 also in aMALA? I had a look at your code implementation in marginal_latent_gaussian.py but it wasn't obvious.
The marginal latent Gaussian is the counterpart for Gaussian priors, but I really think it's related: the auxiliary target is the same one, and then it looks like it's just a bunch of MALA steps with increasing step size. This is not unprecedented in the literature (although people typically take step sizes coming from Chebyshev polynomials, for which there is theory). I am not against having underperforming or academically not-so-novel samplers in the library, mind you; I'm mostly thinking we may want to think carefully about the components and implement those, rather than the special instance that DiGS offers.
I don't see how this is true. The schedule modifies the proximal version of the score function (eq. 12), not directly the step size. Putting aside academic novelty and which paper gets credited, I think this is a simple yet effective sampler.
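For what it's worth, with the auxiliary model $\tilde{x} \sim \mathcal{N}(\alpha x, \sigma^2 I)$ used by DiGS, the denoising step targets $p(x \mid \tilde{x}) \propto p(x)\,\mathcal{N}(\tilde{x}; \alpha x, \sigma^2 I)$, whose score is (reconstructed from that model, not quoted from the paper's eq. 12)

$$\nabla_x \log p(x \mid \tilde{x}) = \nabla_x \log p(x) + \frac{\alpha}{\sigma^2}\left(\tilde{x} - \alpha x\right),$$

so the $(\alpha, \sigma)$ schedule enters through this conditional score rather than through the Langevin step size.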
Ok yes I get you. It wasn't clear to me which part of the alg. you were referring to. Also I'm really not sure the choice of balancing they have is the best one.
I agree, but the idea of dilation/contraction of the space with $\alpha$ is interesting, especially in a multimodal setting.
Is there already something in place for dealing with auxiliary variable samplers? On top of what you propose I think this could be a nice addition.
No, there isn't yet, I don't think, but it's a relatively good and easy addition, though we may want to be a bit careful about the design. So my suggestion today is to implement the algo in the sampling book using components from the main library and see if you are blocked by anything. If you are, then it means we need to refactor. If you are not, let's see how DiGS picks up before putting it in the core library.
@junpenglao, any comment on that?
Presentation of the new sampler
DiGS is an auxiliary-variable MCMC method where the auxiliary variable $\tilde{x}$ is a noisy version of the original variable $x$. DiGS enhances mixing and helps escape local modes by alternately sampling from the distributions $p(\tilde{x}|x)$, which introduces noise via Gaussian convolution, and $p(x|\tilde{x})$, which denoises the sample back to the original space using a score-based update (e.g. a Langevin diffusion).
https://arxiv.org/abs/2402.03008
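A minimal sketch of one such noise/denoise sweep in JAX, assuming the noising model $\tilde{x} \sim \mathcal{N}(\alpha x, \sigma^2 I)$ above. The denoising step is approximated here with a few unadjusted Langevin steps (the paper uses a Metropolis-adjusted update and a specific initialization, eq. 14), and all names and default values are illustrative rather than the reference implementation:

```python
import jax
import jax.numpy as jnp


def digs_sweep(rng_key, x, log_density, alpha, sigma, step_size=1e-3, n_denoise=20):
    """One noise/denoise Gibbs sweep (illustrative sketch, not the reference code)."""
    key_noise, key_denoise = jax.random.split(rng_key)

    # 1) Noising step: sample the auxiliary variable x_tilde ~ N(alpha * x, sigma^2 I).
    x_tilde = alpha * x + sigma * jax.random.normal(key_noise, x.shape)

    # 2) Denoising step: target p(x | x_tilde) ∝ p(x) N(x_tilde; alpha * x, sigma^2 I).
    def cond_logdensity(y):
        return log_density(y) - jnp.sum((x_tilde - alpha * y) ** 2) / (2.0 * sigma**2)

    cond_score = jax.grad(cond_logdensity)

    def langevin_step(y, key):
        # Unadjusted Langevin update on the conditional log-density.
        noise = jax.random.normal(key, y.shape)
        y = y + step_size * cond_score(y) + jnp.sqrt(2.0 * step_size) * noise
        return y, None

    # Initialise the denoising chain from the rescaled auxiliary variable
    # (a simple choice; the paper's initialisation may differ).
    y0 = x_tilde / alpha
    y_final, _ = jax.lax.scan(langevin_step, y0, jax.random.split(key_denoise, n_denoise))
    return y_final
```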
I had very good results with it in small to medium dimensions. It really helps with escaping local modes. A very powerful usage is as a proposal in an SMC-based procedure, where it helps move samples back into depleted zones. In high dimensions, the acceptance ratio of the MH step becomes the tricky part.
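As an illustration of the SMC-style usage, one could vmap the `digs_sweep` sketch above over a population of particles; the toy target and all constants below are purely illustrative and not Blackjax API:

```python
# Toy bimodal target, purely illustrative.
def log_density(x):
    return jnp.logaddexp(
        -0.5 * jnp.sum((x - 3.0) ** 2),
        -0.5 * jnp.sum((x + 3.0) ** 2),
    )

particles = jax.random.normal(jax.random.PRNGKey(1), (256, 2))
keys = jax.random.split(jax.random.PRNGKey(0), particles.shape[0])

# One DiGS move applied independently to every particle (reuses digs_sweep above).
move = jax.vmap(lambda k, x: digs_sweep(k, x, log_density, alpha=0.9, sigma=0.5))
particles = move(keys, particles)
```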
If you think it is a sensible addition to Blackjax I'll be happy to contribute.
How does it compare to other algorithms in blackjax?
The number of denoising steps is flexible, so it can be computationally efficient. The algorithm is conceptually simple but applicable to a wide class of problems.
Where does it fit in blackjax?
As an MCMC kernel per se or as an SMC proposal.
Are you willing to open a PR?
Yes - I have a version that I used for my research and would be happy to contribute it to Blackjax