Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial attempt to add support for the Laplace family #524

Merged
merged 6 commits into from
Jun 8, 2022

Conversation

markgoodhead
Copy link
Contributor

This is to address #523

Note this is a WIP as I attempt to learn more of the bambi internals.

I'm not very familiar with the codebase but I attempted to copy the format of the Normal family and modify it accordingly. One thing I couldn't quite see is how this interacts with pymc - there's no "pm.Normal" calls which I was expecting to modify to the Laplace equivalent. Please let me know if there's something I'm missing and I'll attempt to fix this up.

mark added 2 commits June 7, 2022 11:46
@tomicapretto
Copy link
Collaborator

tomicapretto commented Jun 7, 2022

@markgoodhead thanks for opening this PR!

Initially this looks good! The only issue I would address is the name of the scale parameter in the Laplace distribution. In some places you're using b while in others you're using scale. From the docs I see the correct name is b.

Edit it would be good to have a short example in this PR to see this new family in action :)

Edit 2: From your commit messages I see you previously had "b" and then you used "scale" for consistency with NumPy. Here we have to be consistent with PyMC, because is the engine we pass distributions and parameters to.

bambi/defaults/defaults.py Outdated Show resolved Hide resolved
bambi/families/likelihood.py Outdated Show resolved Hide resolved
bambi/families/univariate.py Outdated Show resolved Hide resolved
@tomicapretto
Copy link
Collaborator

@markgoodhead for the missing pm.Normal call, you're right. Bambi implements a very general solution that avoids direct calls to pm.Normal, pm.Beta, etc. etc... It uses the name of the Likelihood function within the Family to select the appropiate PyMC distribution. Once that distribution is selected, it's a dist(...) call. All that happens in the following chunk

def build_response_distribution(self, kwargs):
# Get likelihood distribution
dist = get_distribution(self.family.likelihood.name)
# Handle some special cases
if isinstance(self.family, Beta):
# Beta distribution in PyMC uses alpha and beta, but we have mu and kappa.
# alpha = mu * kappa
# beta = (1 - mu) * kappa
alpha = kwargs["mu"] * kwargs["kappa"]
beta = (1 - kwargs["mu"]) * kwargs["kappa"]
return dist(self.name, alpha=alpha, beta=beta, observed=kwargs["observed"])
if isinstance(self.family, Binomial):
successes = kwargs["observed"][:, 0].squeeze()
trials = kwargs["observed"][:, 1].squeeze()
return dist(self.name, p=kwargs["p"], observed=successes, n=trials)
if isinstance(self.family, Gamma):
# Gamma distribution is specified using mu and sigma, but we request prior for alpha.
# We build sigma from mu and alpha.
sigma = kwargs["mu"] / (kwargs["alpha"] ** 0.5)
return dist(self.name, mu=kwargs["mu"], sigma=sigma, observed=kwargs["observed"])
if isinstance(self.family, Multinomial):
n = kwargs["observed"].sum(axis=1)
return dist(self.name, p=kwargs["p"], observed=kwargs["observed"], n=n)
return dist(self.name, **kwargs)

@markgoodhead
Copy link
Contributor Author

Thanks very much @tomicapretto ! I'll have a look at those changes now. I also spotted there's some interaction with the centered/non-centered functionality going on here for the Normal link case:

def build_distribution(self, dist, label, **kwargs):
"""Build and return a PyMC Distribution."""
dist = get_distribution(dist)
if "dims" in kwargs:
group_dim = [dim for dim in kwargs["dims"] if dim.endswith("__expr_dim")]
kwargs = {
k: self.expand_prior_args(k, v, label, dims=group_dim) for (k, v) in kwargs.items()
}
else:
kwargs = {k: self.expand_prior_args(k, v, label) for (k, v) in kwargs.items()}
if self.noncentered and has_hyperprior(kwargs):
sigma = kwargs["sigma"]
offset = pm.Normal(label + "_offset", mu=0, sigma=1, dims=kwargs["dims"])
return pm.Deterministic(label, offset * sigma, dims=kwargs["dims"])
return dist(label, **kwargs)

From my experiments before the Laplace link has similar trade-offs to the Normal in terms of when to use the centered/non-centered parameterisation so perhaps this would also need modifying to switch between Normal and Laplace to implement both the centered/non-centered versions of both?

@markgoodhead
Copy link
Contributor Author

@markgoodhead for the missing pm.Normal call, you're right. Bambi implements a very general solution that avoids direct calls to pm.Normal, pm.Beta, etc. etc... It uses the name of the Likelihood function within the Family to select the appropiate PyMC distribution. Once that distribution is selected, it's a dist(...) call. All that happens in the following chunk

def build_response_distribution(self, kwargs):
# Get likelihood distribution
dist = get_distribution(self.family.likelihood.name)
# Handle some special cases
if isinstance(self.family, Beta):
# Beta distribution in PyMC uses alpha and beta, but we have mu and kappa.
# alpha = mu * kappa
# beta = (1 - mu) * kappa
alpha = kwargs["mu"] * kwargs["kappa"]
beta = (1 - kwargs["mu"]) * kwargs["kappa"]
return dist(self.name, alpha=alpha, beta=beta, observed=kwargs["observed"])
if isinstance(self.family, Binomial):
successes = kwargs["observed"][:, 0].squeeze()
trials = kwargs["observed"][:, 1].squeeze()
return dist(self.name, p=kwargs["p"], observed=successes, n=trials)
if isinstance(self.family, Gamma):
# Gamma distribution is specified using mu and sigma, but we request prior for alpha.
# We build sigma from mu and alpha.
sigma = kwargs["mu"] / (kwargs["alpha"] ** 0.5)
return dist(self.name, mu=kwargs["mu"], sigma=sigma, observed=kwargs["observed"])
if isinstance(self.family, Multinomial):
n = kwargs["observed"].sum(axis=1)
return dist(self.name, p=kwargs["p"], observed=kwargs["observed"], n=n)
return dist(self.name, **kwargs)

On this point - it looks like Laplace won't need any special casing so I wouldn't need to do any changes here? In which case I think the outstanding changes would be a) the example you requested (I assume you mean a notebook in the examples folder?) and b) whatever changes you suggest to ensure compatibility with the self.noncentered parameter?

@tomicapretto
Copy link
Collaborator

Thanks very much @tomicapretto ! I'll have a look at those changes now. I also spotted there's some interaction with the centered/non-centered functionality going on here for the Normal link case:

def build_distribution(self, dist, label, **kwargs):
"""Build and return a PyMC Distribution."""
dist = get_distribution(dist)
if "dims" in kwargs:
group_dim = [dim for dim in kwargs["dims"] if dim.endswith("__expr_dim")]
kwargs = {
k: self.expand_prior_args(k, v, label, dims=group_dim) for (k, v) in kwargs.items()
}
else:
kwargs = {k: self.expand_prior_args(k, v, label) for (k, v) in kwargs.items()}
if self.noncentered and has_hyperprior(kwargs):
sigma = kwargs["sigma"]
offset = pm.Normal(label + "_offset", mu=0, sigma=1, dims=kwargs["dims"])
return pm.Deterministic(label, offset * sigma, dims=kwargs["dims"])
return dist(label, **kwargs)

From my experiments before the Laplace link has similar trade-offs to the Normal in terms of when to use the centered/non-centered parameterisation so perhaps this would also need modifying to switch between Normal and Laplace to implement both the centered/non-centered versions of both?

Good catch!

This is not an adjustment that happens only for the Normal family (i.e. using Normal likelihood function). This happens for all group-specific effects (i.e. the ones you get with 1 | group and variable | group syntax) no matter the family. So when you add the laplace family (i.e. Laplace likelihood), we will have this centered/non-centered parametrization for its group-specific effects as well.

@markgoodhead
Copy link
Contributor Author

Thanks very much @tomicapretto ! I'll have a look at those changes now. I also spotted there's some interaction with the centered/non-centered functionality going on here for the Normal link case:

def build_distribution(self, dist, label, **kwargs):
"""Build and return a PyMC Distribution."""
dist = get_distribution(dist)
if "dims" in kwargs:
group_dim = [dim for dim in kwargs["dims"] if dim.endswith("__expr_dim")]
kwargs = {
k: self.expand_prior_args(k, v, label, dims=group_dim) for (k, v) in kwargs.items()
}
else:
kwargs = {k: self.expand_prior_args(k, v, label) for (k, v) in kwargs.items()}
if self.noncentered and has_hyperprior(kwargs):
sigma = kwargs["sigma"]
offset = pm.Normal(label + "_offset", mu=0, sigma=1, dims=kwargs["dims"])
return pm.Deterministic(label, offset * sigma, dims=kwargs["dims"])
return dist(label, **kwargs)

From my experiments before the Laplace link has similar trade-offs to the Normal in terms of when to use the centered/non-centered parameterisation so perhaps this would also need modifying to switch between Normal and Laplace to implement both the centered/non-centered versions of both?

Good catch!

This is not an adjustment that happens only for the Normal family (i.e. using Normal likelihood function). This happens for all group-specific effects (i.e. the ones you get with 1 | group and variable | group syntax) no matter the family. So when you add the laplace family (i.e. Laplace likelihood), we will have this centered/non-centered parametrization for its group-specific effects as well.

Ah yes of course - my mistake!

In that case I think the only change outstanding is the example you requested. I took a look at the notebooks in the example folder and they each seem like quite detailed explanations of a particular type of statistical technique, whereas this is sort of more 'minor' (just another family) so I'm not sure if it's appropriate for me just to add a very small notebook here or if there's a better place the example should live?

@tomicapretto
Copy link
Collaborator

tomicapretto commented Jun 7, 2022

Thanks very much @tomicapretto ! I'll have a look at those changes now. I also spotted there's some interaction with the centered/non-centered functionality going on here for the Normal link case:

def build_distribution(self, dist, label, **kwargs):
"""Build and return a PyMC Distribution."""
dist = get_distribution(dist)
if "dims" in kwargs:
group_dim = [dim for dim in kwargs["dims"] if dim.endswith("__expr_dim")]
kwargs = {
k: self.expand_prior_args(k, v, label, dims=group_dim) for (k, v) in kwargs.items()
}
else:
kwargs = {k: self.expand_prior_args(k, v, label) for (k, v) in kwargs.items()}
if self.noncentered and has_hyperprior(kwargs):
sigma = kwargs["sigma"]
offset = pm.Normal(label + "_offset", mu=0, sigma=1, dims=kwargs["dims"])
return pm.Deterministic(label, offset * sigma, dims=kwargs["dims"])
return dist(label, **kwargs)

From my experiments before the Laplace link has similar trade-offs to the Normal in terms of when to use the centered/non-centered parameterisation so perhaps this would also need modifying to switch between Normal and Laplace to implement both the centered/non-centered versions of both?

Good catch!
This is not an adjustment that happens only for the Normal family (i.e. using Normal likelihood function). This happens for all group-specific effects (i.e. the ones you get with 1 | group and variable | group syntax) no matter the family. So when you add the laplace family (i.e. Laplace likelihood), we will have this centered/non-centered parametrization for its group-specific effects as well.

Ah yes of course - my mistake!

In that case I think the only change outstanding is the example you requested. I took a look at the notebooks in the example folder and they each seem like quite detailed explanations of a particular type of statistical technique, whereas this is sort of more 'minor' (just another family) so I'm not sure if it's appropriate for me just to add a very small notebook here or if there's a better place the example should live?

No need to add a notebook at this time.. Just something like the Bambi example in this PR #490

@aloctavodia
Copy link
Collaborator

I agree that there is no need to add a notebook as part of this PR, opening an issue as a remainder that the example is missing should be enough. Additionally, instead of writing a new notebook, we can maybe extend this one https://bambinos.github.io/bambi/main/notebooks/t_regression.html

@markgoodhead
Copy link
Contributor Author

OK I've got a similar example that (appears to me anyway) to work on this branch - is this what you were expecting?

import arviz as az
import bambi as bmb
import numpy as np
import pandas as pd

az.style.use("arviz-darkgrid")
rng = np.random.default_rng(0)

size = 1000
x = rng.normal(loc=10., scale=5., size=size)
data = pd.DataFrame(
    {
        "x": x,
        "y": rng.laplace(loc=x, size=size)
    }
)

bmb_model = bmb.Model("y ~ x", data, family="laplace")
idata = bmb_model.fit(random_seed=0)

bmb_model.predict(idata)
az.plot_trace(idata);

image

@tomicapretto tomicapretto marked this pull request as ready for review June 7, 2022 18:46
@tomicapretto
Copy link
Collaborator

I forgot to add that it would be good to have a test for this new family. Something like the examples attached would be good for now

def test_categorical_family(inhaler):
model = Model("rating ~ period + carry + treat", inhaler, family="categorical")
model.fit(draws=10, tune=10)

def test_t_regression():
data = pd.DataFrame({"y": np.random.normal(size=100), "x": np.random.normal(size=100)})
Model("y ~ x", data, family="t").fit(draws=10, tune=10)

@markgoodhead
Copy link
Contributor Author

I forgot to add that it would be good to have a test for this new family. Something like the examples attached would be good for now

def test_categorical_family(inhaler):
model = Model("rating ~ period + carry + treat", inhaler, family="categorical")
model.fit(draws=10, tune=10)

def test_t_regression():
data = pd.DataFrame({"y": np.random.normal(size=100), "x": np.random.normal(size=100)})
Model("y ~ x", data, family="t").fit(draws=10, tune=10)

Yes good idea, I've added one now based on a modified version of my example

@codecov-commenter
Copy link

codecov-commenter commented Jun 8, 2022

Codecov Report

Merging #524 (f2546a7) into main (762f30a) will decrease coverage by 0.08%.
The diff coverage is 69.23%.

@@            Coverage Diff             @@
##             main     #524      +/-   ##
==========================================
- Coverage   86.78%   86.69%   -0.09%     
==========================================
  Files          32       32              
  Lines        2573     2586      +13     
==========================================
+ Hits         2233     2242       +9     
- Misses        340      344       +4     
Impacted Files Coverage Δ
bambi/defaults/defaults.py 80.55% <ø> (ø)
bambi/families/likelihood.py 93.61% <ø> (ø)
bambi/families/univariate.py 86.59% <42.85%> (-3.41%) ⬇️
bambi/tests/test_built_models.py 98.95% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 762f30a...f2546a7. Read the comment docs.

Copy link
Collaborator

@tomicapretto tomicapretto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this contribution! I hope it was a nice experience to you too.

If I can ask one more thing it would be comments about how easy/hard it was to work through this PR and what we could do to help you. Thanks again!

@tomicapretto tomicapretto merged commit ecfdbf2 into bambinos:main Jun 8, 2022
@markgoodhead
Copy link
Contributor Author

Thanks a lot for this contribution! I hope it was a nice experience to you too.

If I can ask one more thing it would be comments about how easy/hard it was to work through this PR and what we could do to help you. Thanks again!

Sure - I can't think of anything big that'd change to make the process easier. There's always friction learning a new codebase but this wasn't too bad - the code is well laid out and generally quite readable.

Only two quite minor things I can think of - you can add the guidelines (which mention black and pylint) to be displayed on a PR as you open it (or a link to them) as I didn't spot those until quite late. Also, I got caught when my version of black was too old and did some bad formatting, perhaps the versions of pylint and black used in github actions could be specified in the PR guidelines?

@tomicapretto
Copy link
Collaborator

@markgoodhead thanks for the feedback. I'll open an issue so we can update our guidelines. The versions of black and pylint are in requirements-dev.txt. We need to be clearer about that. Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants