
ValueError: Non-parent parameter not found in posterior #750

Closed
amoulin55 opened this issue Nov 8, 2023 · 27 comments · Fixed by #764
@amoulin55

amoulin55 commented Nov 8, 2023

Hi - I am trying to use a Truncated Normal as my likelihood, see simple example below.

My issue is that the model.predict(trace, kind="pps") returns an error:

ValueError: Non-parent parameter not found in posterior.This error shouldn't have happened!

Is it something to do with the 'lower' parameter set as a constant? Thank you for your help.
Armelle


likelihood = bmb.Likelihood("TruncatedNormal", params=["mu", "sigma", "lower"], parent="mu")
links = {"mu": "identity"}
zip_family = bmb.Family("zip", likelihood, links)
priors = {"sigma": bmb.Prior("HalfStudentT", nu=4, sigma=4.48), "lower": bmb.Prior("ConstantData", value=0.18)}

def glm_mcmc_inference_custom(df, family, priors):
    model = bmb.Model("y ~ x", df, family=family, priors=priors)

    trace = model.fit(
        draws=1000,
        tune=500,
        discard_tuned_samples=True,
        chains=4,
        progressbar=True,
        idata_kwargs={"log_likelihood": True})
    return trace, model
@tomicapretto
Collaborator

Hi @amoulin55, thanks for opening the issue!

Do you have a reproducible example? I just need

  • A dataset I can use to recreate the problem
  • The version of Bambi and PyMC that you're using.

Thanks!

@amoulin55
Author

amoulin55 commented Nov 9, 2023

Hi, thank you for the quick follow-up. Below are some details:

I unfortunately cannot provide the dataset, but this pet example shows the same error:

import numpy as np
import pandas as pd
import bambi as bmb
import pymc as pm
import arviz as az

n_obs = 1000
x_obs = np.random.uniform(size=n_obs)
y_mean = x_obs - 0.5
y_distr = pm.TruncatedNormal.dist(mu=y_mean, sigma=0.03, lower=0)
y_obs = pm.draw(y_distr)
df_model = pd.DataFrame({"x": x_obs, "y": y_obs})

Versions: bambi 0.13.0, pymc 5.9.1

@GStechschulte
Collaborator

Hey @amoulin55, thanks for the simulated data and code for the model. I am able to reproduce the error (although I had to change the value of the pm.ConstantData prior to 0.0 in order to get the model to sample). I am still working through where the bug is coming from.

@tomicapretto
Collaborator

What's happening is that you're passing data as a prior. It results in Bambi looking for some value in the inference data object when it shouldn't look there. You can write it as a custom family with a custom likelihood. It feels a little bit hacky, but it's supported.

class CustomTruncatedNormal(pm.TruncatedNormal):
    def __call__(self, *args, **kwargs):
        # Set whatever value you want for `lower=0`
        return pm.TruncatedNormal(*args, **kwargs, lower=0)

    @classmethod
    def dist(cls, *args, **kwargs):
        return pm.TruncatedNormal.dist(*args, **kwargs, lower=0)

likelihood = bmb.Likelihood(
    "TruncatedNormal", params=["mu", "sigma"], parent="mu", dist=CustomTruncatedNormal
)
links = {"mu": "identity"}
family = bmb.Family("truncated_normal", likelihood, links)
priors = {"sigma": bmb.Prior("HalfStudentT", nu=4, sigma=4.48)}

model = bmb.Model("y ~ x", df_model, family=family, priors=priors)
trace = model.fit()
model.predict(trace, kind="pps")

@tomicapretto
Collaborator

In addition, since #752, we don't need custom families anymore for truncated models. In your case, simply do

bmb.Model("truncated(y, 0.18) ~ x", ...)

and that would be it.

Have a look at the example in the PR if you want to learn more.
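For intuition about what the bound in truncated(y, 0.18) implies for the response, drawing from a normal truncated below at 0.18 can be sketched with scipy (a rough illustration only; scipy is an assumption here, not what Bambi uses internally):

```python
import numpy as np
from scipy.stats import truncnorm

# A Normal(mu=0.5, sigma=0.3) truncated below at 0.18. truncnorm takes its
# bounds on the standardized scale: a = (lower - loc) / scale.
mu, sigma, lower = 0.5, 0.3, 0.18
a = (lower - mu) / sigma
draws = truncnorm.rvs(a, np.inf, loc=mu, scale=sigma, size=1000, random_state=42)

# Every draw respects the truncation bound.
print(draws.min() >= lower)  # True
```

This bound on the observed response is exactly what the truncated() wrapper declares in the model formula.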

@amoulin55
Author

Thank you both, very much appreciated. I will try it out on my 'real' data and confirm.

@amoulin55
Author

Hi - I tried the simple bmb.Model("truncated(y, 0) ~ ......", ...) and it returned the error KeyError: 'truncated'.

Then I tried to run your example from #752, exactly as-is, and it returns the same error. I am new in this space, so I could very well be missing something. I am still using bambi 0.13.0.

Thank you again for your guidance.

@tomicapretto
Collaborator

You need to install Bambi from the main branch, as I haven't made a release yet:

pip install git+https://github.com/bambinos/bambi.git

@amoulin55
Author

Many thanks for your guidance. I first played with your example using only a lower bound at -5 (because my data distribution of interest has only a lower bound); see yt (observed) plotted below.

model = bmb.Model("truncated(y, -5) ~ x", df, priors=priors)
trace = model.fit(
    draws=1000,
    tune=500,
    discard_tuned_samples=True,
    chains=4,
    progressbar=True,
    idata_kwargs={"log_likelihood": True})

But it seems that the posterior predictive vs observed results are then flipped, see below.

[image: posterior predictive vs. observed distributions]

@tomicapretto
Collaborator

Why do you say the results are flipped? The observed is cut at "-5" because you never observe values below "-5" because of the truncation process. However, the model can still predict values below "-5". Am I missing something?

@amoulin55
Copy link
Author

Indeed, I did not think it through. Thanks again for the new feature. I will try truncated StudentT next.

@amoulin55
Author

Following up as I am still having a couple of questions:

  1. When I try a truncated Student-t, I get the warning /opt/conda/lib/python3.10/site-packages/pytensor/tensor/rewriting/elemwise.py:700: UserWarning: Optimization Warning: The Op betainc does not provide a C implementation. As well as being potentially slow, this also disables loop fusion.
     It is also very slow: an estimated 8 hours and still not done, versus a couple of minutes for a Gaussian. This seems comparable to the PyMC issue "How to speed up model fitting of CustomDist?" pymc-devs/pymc#6661.

  2. Reflecting on your earlier comment: I thought the posterior predictive would sample from a truncated distribution, hence the posterior predictive mean would be truncated. This was the point of the exercise, to force a lower limit. But from your comment, there is a flaw somewhere in my thinking. If you have the time, an explanation drawing on your expertise in Bayesian modelling would be very helpful, or a reference that I could learn from.

For your information, this is what I get as the posterior predictive mean distribution with my real data, comparing a plain Gaussian with a truncated Gaussian likelihood.
The regular Gaussian posterior predictive mean fits the observed data well, but I was trying to force positive values with a truncated likelihood.
The truncated Gaussian one (lower=0, upper=100) from your function just looks odd (by the way, I had similar posterior y_mean results when I was trying to build it with the PyMC TruncatedNormal, so my guess is that it is not your function itself, but the concept).
HOWEVER, the ELPD shows the truncated model having better performance, which adds to my confusion.

[image: posterior predictive mean, plain Gaussian likelihood]

[image: posterior predictive mean, truncated Gaussian likelihood]

Thank you again.

@GStechschulte GStechschulte reopened this Nov 16, 2023
@zwelitunyiswa

I had a similar issue using the truncated response variable function. I was getting response values outside of the bounds that I had set. Maybe I am not understanding the concept either.

@tomicapretto
Collaborator

The type of truncation I'm talking about here is the one mentioned in this example https://www.pymc.io/projects/examples/en/latest/generalized_linear_models/GLM-truncated-censored-regression.html

"Truncation is a type of missing data problem where you are simply unaware of any data where the outcome variable falls outside of a certain set of bounds." This does not mean that values outside those bounds are impossible (i.e. that the density function outside the bounds is zero). It means you can't observe those values, but those values can of course exist, so the posterior predictive distribution is not bounded.

If you want the other kind of truncation, where you just restrict the boundary of a random variable, then that's another thing.
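The missing-data reading of truncation can be simulated in a few lines of numpy (a sketch of the concept, not Bambi internals): the latent process produces values everywhere, and truncated observation simply never records anything below the bound.

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(loc=0.0, scale=1.0, size=10_000)  # the underlying process

bound = -0.5
observed = latent[latent >= bound]  # truncation: values below the bound are never seen

# The observed sample is bounded, but the generating process is not, which is
# why posterior predictive draws of the latent variable can fall below the bound.
print(latent.min() < bound, observed.min() >= bound)  # True True
```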

About the first point, you see the warning because there's some special function that doesn't have a C implementation thus PyTensor can't perform all the optimizations it would perform in other situations. It's not a bug, it's just that there's a missing feature.

@zwelitunyiswa

zwelitunyiswa commented Nov 17, 2023 via email

@amoulin55
Author

Hi @tomicapretto,
Many thanks for your answer, and I am sorry if I was unclear to start with about what I was after.

Similar ask as @zwelitunyiswa: would you have some guidance on how to set a bound on y?

And thanks for the note on the missing feature.

@DanielRobertNicoud

> What's happening is that you're passing data as a prior. [...] You can write it as a custom family with a custom likelihood. Feels a little bit hacky but it's supported.

Hi @tomicapretto. Still, when fitting a model with a prior set to a constant value, as @amoulin55 did in their OP, it is unfortunate that the (constant-valued) samples do not appear in the posterior, as it makes applying predict very clunky. Having the samples of the fixed hyperparameter appear would make applying the usual pipeline much easier for very little effort. If we do not want to call it a bug, it's at least a quite nice feature that is missing. If you direct me to the appropriate scripts in the package managing this, I'll be happy to try to add it.

@tomicapretto
Collaborator

@amoulin55 I'm double checking this because I think I got confused too. Will write again when I have an answer!

@tomicapretto
Collaborator

@amoulin55 @zwelitunyiswa

I've been thinking about this and I've discussed it with others too. When we have a truncated distribution in PyMC we can get two kinds of predictions

  1. The predictions of the underlying random variable without the truncation process. This is helpful if we want to draw conclusions about those values that we don't observe because of the missing data problem. And this is what Bambi is doing now.
  2. The predictions of the truncated random variable. It gives non-zero probability only to the values within the boundaries that we define. Bambi does not support this for now, but we will support it. I'm now thinking about what the interface should look like. I will post updates here.
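The two prediction types can be contrasted with scipy (an illustration of the distinction, not the Bambi implementation): the untruncated normal puts mass below the bound, while the truncated version gives it zero probability.

```python
import numpy as np
from scipy.stats import norm, truncnorm

mu, sigma, lower = 0.0, 1.0, 0.0
a = (lower - mu) / sigma  # truncnorm bounds are on the standardized scale

# 1. Predictions of the underlying (untruncated) variable: mass below the bound.
p_below_latent = norm.cdf(lower, loc=mu, scale=sigma)

# 2. Predictions of the truncated variable: no mass below the bound.
p_below_truncated = truncnorm.cdf(lower, a, np.inf, loc=mu, scale=sigma)

print(p_below_latent, p_below_truncated)  # 0.5 vs 0.0
```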

@zwelitunyiswa

@tomicapretto Thanks for spending time thinking about this. Your explanation makes complete sense. I, like @amoulin55, thought that we were doing #2. My use case is healthcare research where I am trying to predict the area of a wound at a certain body part. The prediction of the truncated random variable cannot be bigger than the body part, e.g. the foot.

@zwelitunyiswa

zwelitunyiswa commented Nov 18, 2023

Maybe we can have a "constrained" function that follows the truncated syntax -> constrained(y, lb=2, ub=8).

@DanielRobertNicoud

@tomicapretto @zwelitunyiswa @amoulin55 Great clarification, thanks! Imo a good solution would be to change the name of what is currently truncated to censored for #1 and use truncated for #2 (if this doesn't cause too many backwards compatibility issues).

@tomicapretto
Collaborator

@DanielRobertNicoud I think censored is a pretty established term as it refers to the missing data mechanism, while truncation is a bit more ambiguous because it can refer to the missing data mechanism or the fact that you want a certain distribution with a truncated domain.

@zwelitunyiswa thanks for proposing constrained, I think it's a good alternative. I'll think a bit more about pros and cons.

@tomicapretto
Collaborator

@zwelitunyiswa @DanielRobertNicoud we now have a constrained transformation, implemented in #764.
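The behaviour the constrained transformation targets (a response distribution whose domain is actually restricted) can be mimicked with rejection sampling in numpy; the bounds below are just illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)
lb, ub = 0.0, 100.0

# Draws from an unconstrained predictive distribution...
draws = rng.normal(loc=50.0, scale=40.0, size=10_000)

# ...kept only when they fall inside the bounds, as a constrained
# (domain-restricted) distribution guarantees by construction.
constrained_draws = draws[(draws >= lb) & (draws <= ub)]

print(constrained_draws.min() >= lb and constrained_draws.max() <= ub)  # True
```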

@zwelitunyiswa

zwelitunyiswa commented Nov 29, 2023 via email

@tomicapretto
Collaborator

@zwelitunyiswa

> Question: what if I have constrained DVs at certain levels? So for example I have a model like size ~ body_location, and body_location has like 5 levels, would it be able to constrain by level?

Do you mean that the bounds of the constrained distribution depend on the level of a categorical predictor? If that's what you mean, the answer is not for now as it only supports scalar values for lb and ub. The reason behind this approach is that it's tricky to decide where to take bounds from when generating predictions for new observations.

@zwelitunyiswa

@tomicapretto Yup, that's exactly right. Right now I constrain at the maximum possible value across all anatomical sites, which have varying max values. Thanks to you, that makes my model behave much better. Doing it at varying levels would be icing on the cake, but I totally get that it is not trivial and a very narrow use case.
