
Add ArviZ integration #542

Closed
sethaxen opened this issue Sep 22, 2021 · 16 comments · Fixed by #607
Labels
enhancement New feature or request hackathon

Comments

@sethaxen
Contributor

sethaxen commented Sep 22, 2021

As suggested by @Meteore, the ArviZ package has a large number of MCMC diagnostics, statistics, and visualizations. See for example the gallery. It provides diagnostics/plots to PyMC3 but is PPL-agnostic.

It would be good to include here a converter to an arviz.InferenceData, which would automatically allow users to apply these diagnostics.
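The conversion described above mainly amounts to reshaping sbi's flat sample array into the `(chain, draw, *shape)` layout that ArviZ expects. A minimal sketch of that layout (array sizes and the variable name `theta` are illustrative, not from sbi; the commented line shows where `arviz.from_dict` would come in):

```python
import numpy as np

# Hypothetical: flat MCMC draws from an sbi posterior, all chains concatenated.
num_chains, num_draws, dim = 4, 250, 2
rng = np.random.default_rng(0)
flat = rng.normal(size=(num_chains * num_draws, dim))

# ArviZ's from_dict expects arrays shaped (chain, draw, *param_dims).
posterior_draws = flat.reshape(num_chains, num_draws, dim)

# With arviz installed, this would build an InferenceData object:
# idata = az.from_dict(posterior={"theta": posterior_draws})
print(posterior_draws.shape)
```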

@sethaxen
Contributor Author

Given that ArviZ already has a converter from Pyro, this would probably be really easy to do: https://arviz-devs.github.io/arviz/api/generated/arviz.from_pyro.html

@alvorithm alvorithm added the enhancement New feature or request label Sep 22, 2021
@michaeldeistler
Contributor

I agree, it would be great to support ArviZ.

@sethaxen
Contributor Author

Okay! I'm happy to open a PR.

@michaeldeistler
Contributor

Great, thank you!

@sethaxen
Contributor Author

LikelihoodBasedPosterior.sample just returns the samples without any of the sample statistics (e.g. divergences and log probability). Several questions:

  1. Can MCMC be run with multiple chains in parallel?
  2. Is it possible to have the fitted pyro MCMC object returned to the user? This would give the user access to the sampling statistics (and potentially allow them to resume sampling, but I don't know if this is a Pyro-supported feature).
  3. If so, does sbi handle transformations of constrained parameters to an unconstrained space itself, or does it rely on pyro to do that? The latter would be in a sense more convenient, because then IIUC pyro's MCMC object would already return draws in the constrained space.

@michaeldeistler
Contributor

michaeldeistler commented Sep 22, 2021

  1. Yes. `mcmc_parameters={"num_chains": 10}` will run multiple chains. If you use slice sampling (NumPy-based), the chains can also be vectorized with `mcmc_method="slice_np_vectorized"` together with `mcmc_parameters={"num_chains": 10}`
  2. That would in principle be possible. I would prefer not returning it by default, just to avoid such a major API change (so maybe let's add a flag return_sampler: bool = False to the .sample() method?) Alternatively, we could also save the sampler as an attribute of the posterior class (i.e. self.sampler(...) here).
  3. Unfortunately, pyro does not handle these transformations itself when a potential_fn() is used; it can only infer the transform if a model() is provided. At least I could not figure out how one would do it with a potential_fn(). We do the transforms ourselves here. I had created an issue on pyro about this here.

Hope this helps!
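The two configurations in point 1 can be written out as plain keyword arguments. This is only an illustration of the call shape described above; the keyword names follow the sbi API as quoted in this thread, and the commented `.sample()` call is hypothetical:

```python
# Serial slice sampling: chains run one after another.
serial_cfg = dict(
    mcmc_method="slice_np",
    mcmc_parameters={"num_chains": 10},
)

# Vectorized slice sampling: all chains advance in lockstep.
vectorized_cfg = dict(
    mcmc_method="slice_np_vectorized",
    mcmc_parameters={"num_chains": 10},
)

# With sbi installed, one of these would be spread into the sample call:
# samples = posterior.sample((1000,), x=observation, **vectorized_cfg)
print(serial_cfg["mcmc_method"], vectorized_cfg["mcmc_method"])
```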

@sethaxen
Contributor Author

sethaxen commented Sep 22, 2021

1. Yes. `mcmc_parameters={"num_chains": 10}` will run multiple chains.

perfect!

2. That would in principle be possible. I would prefer not returning it by default, just to avoid such a major API change (so maybe let's add a flag `return_sampler: bool = False` to the `.sample()` method?) Alternatively, we could also save the sampler as an attribute of the posterior class (i.e. `self.sampler(...)` [here](https://github.com/mackelab/sbi/blob/65b9873d2cab0a1954c0203833a99f8140dbba99/sbi/inference/posteriors/base_posterior.py#L612)).

I agree, the current API should be kept. The latter option is nice (storing the MCMC object as self.sampler). Pyro's MCMC also takes a thinning parameter https://num.pyro.ai/en/stable/mcmc.html#numpyro.infer.mcmc.MCMC. Is there a reason sbi does the thinning itself?

3. Unfortunately, pyro does not handle these transformations itself when a `potential_fn()` is used, it can only infer the transform if a `model()` is provided. At least I could not figure out how one would do it with a `potential_fn()`. We do the transforms ourselves [here](https://github.com/mackelab/sbi/blob/65b9873d2cab0a1954c0203833a99f8140dbba99/sbi/inference/posteriors/likelihood_based_posterior.py#L187).

Hm, that is unfortunate. I need to look more carefully at pyro.

Hope this helps!

yes, very helpful! Thanks! I'm learning both sbi and pyro at the same time, so there will likely be more questions.
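For reference, the thinning being discussed is just keeping every k-th draw; the docs linked above are numpyro's, whose `MCMC` accepts this directly as a `thinning` argument, whereas doing it by hand looks like this (the array here is a stand-in for real MCMC draws):

```python
import numpy as np

# Stand-in for 100 correlated MCMC draws of a 1-D parameter.
draws = np.arange(100).reshape(100, 1)

# Manual thinning: keep every `thin`-th draw to reduce autocorrelation.
thin = 5
thinned = draws[::thin]

print(len(thinned))  # 20
```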

@michaeldeistler
Contributor

michaeldeistler commented Sep 22, 2021

  1. Nope, there's no reason not to use pyro's thinning here. Feel free to use it.
  2. Yeah, I still think this is a bug but I don't know pyro well either (which is why I stopped bothering them about this ;) ).

@sethaxen
Contributor Author

sethaxen commented Sep 22, 2021

  2. Yeah, I still think this is a bug but I don't know pyro well either (which is why I stopped bothering them about this ;) ).

Perhaps this could be worked around with a helper function for constructing a Distribution from the LikelihoodBasedPosterior object. Then sbi would define a Pyro model instead of using potential_fn(), passing the transform defined by sbi.

A fringe benefit of this is that in principle one could use the fitted posterior as a prior in a Pyro model for Bayesian updating.
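A rough sketch of that helper, under the assumption that the posterior exposes a log-prob and a sampling function (the class name and constructor arguments are invented for illustration; a standard normal stands in for the fitted sbi posterior in the toy check):

```python
import torch
from torch.distributions import Distribution, constraints


class PosteriorAsDistribution(Distribution):
    """Hypothetical wrapper exposing an sbi posterior's log-prob and
    sampler as a torch Distribution, so pyro could use it in a model()
    (or as a prior for Bayesian updating) instead of a bare potential_fn."""

    arg_constraints = {}
    support = constraints.real

    def __init__(self, log_prob_fn, sample_fn, event_dim):
        self._log_prob_fn = log_prob_fn
        self._sample_fn = sample_fn
        super().__init__(event_shape=torch.Size([event_dim]), validate_args=False)

    def log_prob(self, value):
        return self._log_prob_fn(value)

    def sample(self, sample_shape=torch.Size()):
        return self._sample_fn(sample_shape)


# Toy check: a 2-D standard normal standing in for the fitted posterior.
ref = torch.distributions.Normal(0.0, 1.0)
dist = PosteriorAsDistribution(
    log_prob_fn=lambda v: ref.log_prob(v).sum(-1),
    sample_fn=lambda shape: ref.sample(shape + torch.Size([2])),
    event_dim=2,
)
print(float(dist.log_prob(torch.zeros(2))))
```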

@michaeldeistler
Contributor

michaeldeistler commented Sep 22, 2021

Yes, I think this could have worked. We added the transforms well after relying on pyro's potential_fn, so there's definitely some technical debt here. Your suggestion sounds very reasonable; we might want to do this in the future.

@sethaxen
Contributor Author

Alright, in the interest of picking the low-hanging fruit first, my proposal is to:

  1. pass the thinning keyword to MCMC
  2. make the sampler a stored attribute of LikelihoodBasedPosterior (at least there; it may make sense for other classes too)

That should be sufficient for allowing users to use ArviZ to diagnose model problems in the unconstrained space. Then later we could potentially do something like #542 (comment) so that the users get the sampler in the constrained space.
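Proposal 2 boils down to a one-line change plus an attribute. A minimal sketch of the idea, with invented names throughout (`run_mcmc` stands in for the real pyro/NumPy sampling code, and the `az.from_pyro` line shows the intended downstream use):

```python
class LikelihoodBasedPosteriorSketch:
    """Minimal sketch of proposal 2: keep the fitted MCMC object around
    on the posterior so users can hand it to ArviZ afterwards."""

    def __init__(self):
        self.sampler = None  # populated after the first .sample() call

    def sample(self, num_samples, run_mcmc):
        mcmc = run_mcmc(num_samples)  # hypothetical: returns a fitted sampler
        self.sampler = mcmc           # stored for later diagnostics
        return mcmc["samples"]


posterior = LikelihoodBasedPosteriorSketch()
draws = posterior.sample(10, lambda n: {"samples": list(range(n))})
# With arviz and a real pyro sampler:
# idata = az.from_pyro(posterior.sampler)
print(len(draws), posterior.sampler is not None)
```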

@michaeldeistler
Contributor

Sounds good to me. Regarding 2:
If you make it an attribute here, i.e. in the BasePosterior, then all methods will inherit the attribute (which I think is desirable). I'd also set the numpy-based samplers as an attribute, i.e. here rename posterior_sampler to self.sampler

@sethaxen
Contributor Author

I'd also set the numpy-based samplers as an attribute, i.e. here rename posterior_sampler to self.sampler

What are your thoughts on this particular case, where there's not just one sampler but a vector of samplers (one per chain)? I see several ways of handling this:

  1. have self.sampler return either a sampler or vector of samplers (this case)
  2. have self.samplers instead, which would return a vector of length one for all samplers except this one (downside: the vector has length one everywhere except this one case, where it has one entry per chain)
  3. introduce something like SliceSamplerSerial that has the same interface as SliceSamplerVectorized but internally loops over the chains and calls SliceSampler. Then this code would use SliceSamplerSerial.

To me (3) seems cleanest. What do you think?

@michaeldeistler
Contributor

michaeldeistler commented Sep 24, 2021

Just to make sure that I understand proposition 3 correctly:

you would move this loop into the new class SliceSamplerVectorized, right? I like the idea because it would simplify this entire if-else-case.

@sethaxen
Contributor Author

you would move this loop into the new class SliceSamplerVectorized, right? I like the idea because it would simplify this entire if-else-case.

SliceSamplerVectorized already exists, and it would be misleading for a class with that name to loop internally. My thought was to add a SliceSamplerSerial that would behave like SliceSamplerVectorized but use a loop instead. One easy way to do this would be to add something like SliceSamplerMultiChainBase that implements the methods shared by the two classes, and have both SliceSamplerSerial and SliceSamplerVectorized subclass it.

Alternatively, one could have a SliceSamplerMultiChain that has the linked if-else statement and deprecate SliceSamplerVectorized, but if the latter class is part of the API, this is not ideal.
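The class layout in the first option might look roughly like this. All names beyond SliceSamplerSerial and the shared base are illustrative, and the trivial per-chain "sampler" only demonstrates the interface, not slice sampling itself:

```python
import numpy as np


class SliceSamplerMultiChainBase:
    """Hypothetical shared base: both subclasses expose
    run(...) -> array of shape (num_chains, num_samples, dim)."""

    def __init__(self, log_prob_fn, init_params):
        self.log_prob_fn = log_prob_fn
        self.init_params = np.atleast_2d(init_params)  # (num_chains, dim)
        self.num_chains, self.dim = self.init_params.shape


class SliceSamplerSerial(SliceSamplerMultiChainBase):
    def run(self, num_samples, single_chain_sampler):
        # Loop over chains, delegating each to a one-chain sampler
        # (sbi's existing SliceSampler would fill this role).
        chains = [
            single_chain_sampler(self.log_prob_fn, x0, num_samples)
            for x0 in self.init_params
        ]
        return np.stack(chains)  # (num_chains, num_samples, dim)


# Toy single-chain "sampler" that just repeats the initial point:
toy = lambda log_prob, x0, n: np.tile(x0, (n, 1))
sampler = SliceSamplerSerial(lambda x: 0.0, np.zeros((3, 2)))
print(sampler.run(5, toy).shape)  # (3, 5, 2)
```

A vectorized subclass would implement the same `run` signature but advance all chains in lockstep, which is what makes the shared base worthwhile.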

@michaeldeistler
Contributor

michaeldeistler commented Sep 27, 2021

Sorryyyy, I meant SliceSamplerSerial, not SliceSamplerVectorized. So yeah, I completely agree with your original suggestion.
