Non-parent parameter not found in posterior when using fixed Data prior #850

Closed
ivanistheone opened this issue Oct 18, 2024 · 2 comments · Fixed by #851

@ivanistheone

Hi all. I ran into an issue similar to #750, where a variable required for posterior predictive sampling of the response variable is not included in the inference data object.

I'm trying to fit a Gaussian model with a known, fixed standard deviation sigma=15 and a custom prior norm(100,40) on the mean. This is for educational purposes, to show the simplest possible model. I found a way to add sigma as a constant by setting its prior to bmb.Prior("Data", value=15). The complete code example is:

# toy dataset
import pandas as pd
iqs = [ 82.6, 105.5,  96.7,  84.0, 127.2,  98.8,  94.3]
df = pd.DataFrame({"iq":iqs})


# Gaussian model with known standard deviation sigma=15 and norm(100,40) prior on mean
import bambi as bmb
priors = {
    "Intercept": bmb.Prior("Normal", mu=100, sigma=40),
    "sigma": bmb.Prior("Data", value=15),
}
mod = bmb.Model("iq ~ 1",
                priors=priors,
                family="gaussian",
                link="identity",
                data=df)
mod
#      Formula: iq ~ 1
#       Family: gaussian
#         Link: mu = identity
# Observations: 7
#       Priors: 
#   target = mu
#       Common-level effects
#           Intercept ~ Normal(mu: 100.0, sigma: 40.0)
#       Auxiliary parameters
#           sigma ~ Data(value: 15.0)

idata = mod.fit()
# WORKS OK

Here sigma is not included in vars_to_sample, but the sigma info is preserved in idata under constant_data:

list(idata["constant_data"].keys())
# ['sigma']

If I then try to sample the response variable, I get this error:

mod.predict(idata, kind="response")

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 mod.predict(idata, kind="response")

File [.../bambi/models.py:877], in Model.predict(self, idata, kind, data, inplace, include_group_specific, sample_new_groups)
    874 required_kwargs = {"model": self, "posterior": idata.posterior}
    875 optional_kwargs = {"data": data}
--> 877 posterior_predictive = self.family.posterior_predictive(
    878     **required_kwargs, **optional_kwargs
    879 )
    880 posterior_predictive = posterior_predictive.to_dataset(name=response_aliased_name)
    882 if "posterior_predictive" in idata:

File [...bambi/families/family.py#line=148), in Family.posterior_predictive(self, model, posterior, **kwargs)
    147 response_dist = get_response_dist(model.family)
    148 response_term = model.response_component.term
--> 149 kwargs, coords = self._make_dist_kwargs_and_coords(model, posterior, **kwargs)
    151 # Handle constrained responses
    152 if response_term.is_constrained:
    153     # Bounds are scalars, we can safely pick them from the first row

File [... bambi/families/family.py:256], in Family._make_dist_kwargs_and_coords(self, model, posterior, **kwargs)
    254         kwargs[param] = np.asarray(component.prior)
    255     else:
--> 256         raise ValueError(
    257             "Non-parent parameter not found in posterior."
    258             "This error shouldn't have happened!"
    259         )
    261 # Determine the array with largest number of dimensions
    262 ndims_max = max(x.ndim for x in kwargs.values())

ValueError: Non-parent parameter not found in posterior.This error shouldn't have happened!

Is there some way to make _make_dist_kwargs_and_coords look for sigma value in the constant_data?

Am I doing something wrong or unexpected by setting the sigma prior with bmb.Prior("Data", value=15)? I'd be happy to use another approach.

For context, this is with pymc.__version__ == '5.17.0' and bmb.__version__ == '0.14.0' on macOS.
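
While predict fails, the response draws can be simulated by hand, since in this model each response is just Normal(Intercept, 15). The following is only a minimal sketch of that idea, not Bambi's API: the helper name manual_response_draws is hypothetical, and the hard-coded intercept_draws list is a stand-in for real posterior draws (which, assuming the posterior variable is named "Intercept", could be taken from idata.posterior["Intercept"].values.ravel()).

```python
import random


def manual_response_draws(intercept_draws, sigma, n_obs, seed=0):
    """Hypothetical helper: for each posterior draw of the mean,
    simulate n_obs responses from Normal(mu, sigma)."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for _ in range(n_obs)] for mu in intercept_draws]


# Stand-in values for posterior draws of the mean (assumption, not real output).
intercept_draws = [95.0, 98.5, 101.2, 99.7]

# One row of 7 simulated IQ scores per posterior draw, with sigma fixed at 15.
draws = manual_response_draws(intercept_draws, sigma=15.0, n_obs=7)
print(len(draws), len(draws[0]))
```

Each row of draws plays the role of one posterior predictive sample of the 7 observations.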

@tomicapretto
Collaborator

@ivanistheone thanks for reporting the issue. There are two things going on here.

The first is that if you want to set a parameter to a constant value, you should simply pass the constant value, not a Prior that calls pm.Data under the hood (although I have to say that was a good hack! I had not thought of it). So you should do:

import pandas as pd
import bambi as bmb


iqs = [ 82.6, 105.5,  96.7,  84.0, 127.2,  98.8,  94.3]
df = pd.DataFrame({"iq":iqs})

priors = {
    "Intercept": bmb.Prior("Normal", mu=100, sigma=40),
    "sigma": 15,
}

mod = bmb.Model(
    "iq ~ 1",
    priors=priors,
    family="gaussian",
    link="identity",
    data=df
)

idata = mod.fit()
mod.predict(idata, kind="response")

The second is that this still doesn't work, but for a different reason. I'm fixing that right now and will update you when it's on main.

@ivanistheone
Author

I can confirm the above code (with sigma as float) works now using the Bambi version on main.

Thanks for looking into it and fixing it!
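
As a sanity check on the sampled results, this particular model is conjugate: with a known sigma and a normal prior on the mean, both the posterior of the mean and the posterior predictive distribution are normal and can be computed in closed form. The sketch below applies the standard normal-normal update to the toy data from this issue; the variable names are illustrative only, and nothing here calls Bambi.

```python
import math

# Toy data and fixed settings from the issue.
iqs = [82.6, 105.5, 96.7, 84.0, 127.2, 98.8, 94.3]
sigma = 15.0                      # known standard deviation of the response
prior_mu, prior_sd = 100.0, 40.0  # Normal(100, 40) prior on the mean

n = len(iqs)
ybar = sum(iqs) / n

# Precisions (inverse variances) add in the normal-normal update.
prior_prec = 1.0 / prior_sd**2
data_prec = n / sigma**2
post_var = 1.0 / (prior_prec + data_prec)

# Posterior mean is the precision-weighted average of prior mean and sample mean.
post_mu = post_var * (prior_prec * prior_mu + data_prec * ybar)
post_sd = math.sqrt(post_var)

# A new observation adds the observation noise to the posterior uncertainty.
pred_sd = math.sqrt(post_var + sigma**2)

print(f"posterior of mean: Normal({post_mu:.2f}, {post_sd:.2f})")
print(f"posterior predictive: Normal({post_mu:.2f}, {pred_sd:.2f})")
```

The mean and standard deviation of the Intercept draws in idata.posterior, and of the draws produced by predict, should land close to these values up to Monte Carlo error.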
