
Bug in sampling Binomial with non-scalar shape #4223

Closed
bridgeland opened this issue Nov 11, 2020 · 2 comments

Comments

@bridgeland

When a Binomial RV without observations has a vector shape, it samples incorrectly, sampling mostly zeros no matter the values of n and p. It works fine for scalar shapes.

Example:

import pymc3 as pm 
import numpy as np

with pm.Model() as model:
    pm.Binomial('scalar_binomial', n=1, p=0.25)
    pm.Binomial('arrayed_binomial', n=1, p=0.25, shape=10)
    pm.Binomial(
        'arrayed_binomial_alt', n=np.ones(10), 
        p=np.full(10, 0.25), shape=10)
    
    trace = pm.sample(draws=600, chains=2, tune=500)
    summary = pm.summary(
        trace, 
        var_names=[
            'scalar_binomial', 'arrayed_binomial', 
            'arrayed_binomial_alt'])
summary 

The three RVs are identical binomials, except that the first RV scalar_binomial is a scalar, the second RV arrayed_binomial has shape 10 with scalar arguments, and the third RV arrayed_binomial_alt has shape 10 with (identical) arrayed arguments. Note also that nothing in this model is constrained by observations.
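As a quick sanity check (not part of the original report), all three RVs share the same distribution, so their summary statistics should agree. The expected moments can be computed by hand:

```python
import math

# All three RVs above are Binomial(n=1, p=0.25), so a correct sampler
# should recover roughly these moments in the summary table.
n, p = 1, 0.25
mean = n * p                     # 0.25
sd = math.sqrt(n * p * (1 - p))  # ~0.433
print(mean, sd)
```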

Sampling issues warnings about all-NaN slices:

[screenshot: RuntimeWarning about all-NaN slices during sampling]

The results are peculiar:

[screenshot: summary table for the binomial model]

scalar_binomial is as expected. arrayed_binomial sees no non-zero samples in this execution, and hence means of 0.0; other executions do see a few non-zero samples. arrayed_binomial_alt sees no non-zero samples, in this execution or in any other. Everything is nothing.

The same model structure with normals instead of binomials produces expected results:

import pymc3 as pm 
import numpy as np

with pm.Model() as model:
    pm.Normal('scalar_normal', mu=1, sigma=0.25)
    pm.Normal('arrayed_normal', mu=1, sigma=0.25, shape=10)
    pm.Normal(
        'arrayed_normal_alt', mu=np.ones(10), 
        sigma=np.full(10, 0.25), shape=10)
    
    trace = pm.sample(
        draws=600, chains=2, tune=500, 
        step=pm.Metropolis())      # Metropolis so the sampling is identical to above
    summary = pm.summary(
        trace, 
        var_names=[
            'scalar_normal', 'arrayed_normal', 'arrayed_normal_alt'])
summary

Normal results:

[screenshot: summary table for the normal model]

Versions and main components

  • PyMC3 Version: 3.9.3
  • Theano Version: 1.0.5
  • Python Version: 3.8.5
  • Operating system: macOS 10.15.7
  • How did you install PyMC3: conda
@ckrapu
Contributor

ckrapu commented Nov 18, 2020

Perhaps somewhat surprisingly, this is not a bug but expected behavior for MCMC using the Metropolis proposal. The short version is that nearly all of the proposals for the vector version are being rejected.

The longer explanation: arrayed_binomial starts at the distribution's mode, so any proposal that flips some of the zeros to ones leads to a lower likelihood and is therefore unlikely to be accepted. I suspect you may need 10k+ Metropolis proposals to get a few successful samples here. If you change the probability from 0.25 to 0.5, you will see that new samples are drawn successfully and the sampler statistics become meaningful. The effect is most pronounced when the proposal is multidimensional, which is why the scalar case is unaffected.
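To put numbers on this (a back-of-the-envelope sketch, assuming a symmetric proposal): starting from the all-zeros mode, flipping k zeros to ones multiplies the likelihood by (p/(1-p))^k, which for p = 0.25 is (1/3)^k, and that ratio is also the Metropolis acceptance probability for the move:

```python
p = 0.25
odds = p / (1 - p)  # likelihood ratio for flipping a single 0 -> 1 at the mode

# Acceptance probability for a symmetric proposal flipping k zeros to ones:
# it shrinks geometrically with the number of coordinates flipped.
for k in range(1, 6):
    print(f"flip {k} coordinate(s): accept with prob {odds ** k:.4f}")
```

With p = 0.5 the odds ratio is 1, so every such move is accepted, matching the observation above.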

The normal model gives different results because it is continuous, and hence we can use NUTS. NUTS avoids the random-walk behavior (i.e. randomly flipping 0s and 1s) of the vanilla Metropolis algorithm that was used in the binomial case.

@bridgeland
Author

Thanks. Understood.

This shortcoming of Metropolis suggests that sample_posterior_predictive() (and also fast_sample_posterior_predictive()) cannot be used with models that have arrays of binomials.

A nit: the normal model provides different results despite also using Metropolis; see the comment in the original model above. I suspect the normal case works because, for a continuous distribution, many proposals have only slightly smaller likelihood and hence are still likely to be accepted.
