
[WIP] Add power-scaling sensitivity analysis diagnostic #2093

Merged: 48 commits merged into arviz-devs:main on Dec 14, 2023

Conversation

@n-kall (Contributor) commented Aug 11, 2022

This PR adds functionality for power-scaling sensitivity analysis.
Details of the approach are described in Kallioinen et al. (2022) https://arxiv.org/abs/2107.14054, and the corresponding R package (with additional functionality) is available at https://github.com/n-kall/priorsense

This PR adds the base functionality required to calculate the sensitivity diagnostic, which I have labelled 'psens' (feel free to change the naming). This diagnostic can be used to check whether there is sensitivity of the posterior to changes in the prior or likelihood without needing to refit the model.

In order to calculate the diagnostic, evaluations of the log_prior and log_likelihood at the posterior draws are required. Currently, the log_likelihood is available, but the log_prior is not saved by PyMC. So, for psens to function properly, there would need to be a way to save the log_prior evaluations when fitting a model in PyMC.


📚 Documentation preview 📚: https://arviz--2093.org.readthedocs.build/en/2093/

@n-kall n-kall changed the title Priorsense Add power-scaling sensitivity analysis diagnostic Aug 11, 2022
@n-kall n-kall changed the title Add power-scaling sensitivity analysis diagnostic [WIP] Add power-scaling sensitivity analysis diagnostic Aug 11, 2022
@ahartikainen (Contributor) commented:
@OriolAbril we probably should add prior_loglikelihood to InferenceData schema?

@OriolAbril (Member) left a comment:

Thanks @n-kall! I left some comments, a mix of more general and more specific ones, mostly to try to understand what exactly is going on.

Re storage of the inputs, @ahartikainen: I think what we need here are the log_prior and the log_likelihood (not pointwise) of the posterior samples, so it would probably be best to add log_prior and log_likelihood variables to sample_stats (like we already have lp). In fact, the sum of these two should equal lp, so it might make sense to store only the log_prior and get the log_likelihood by subtracting it.

Comment on lines 1074 to 1075
lower_cjs = _cjs_dist(x=draws, weights=lower_weights)
upper_cjs = _cjs_dist(x=draws, weights=upper_weights)
@OriolAbril (Member) commented Aug 17, 2022:

Should the x=draws part be x=ary? Or should the call signature be (draws, *...?

Also, should x and weights have the same shape? I think they are currently (nchain, ndraw) and nchain*ndraw respectively.

In general, it would be helpful to keep the same names throughout the functions, e.g. use draws here and in cjs_dist instead of draws/ary/x depending on the function.

@n-kall (Contributor, author) replied:

I've changed everything to draws and weights.

@OriolAbril (Member) replied:

Am I correct to assume that the internal functions expect everything to be 1D arrays of length nchain*ndraw?

@n-kall (Contributor, author) replied:

> Am I correct to assume that the internal functions expect everything to be 1D arrays of length nchain*ndraw?

Yes, that should be the case. There's no differentiation between chains, so all draws are included in the 1D array.
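As a minimal NumPy sketch of that convention (the variable names here are illustrative, not the PR's internals):

```python
import numpy as np

# Posterior draws with shape (nchain, ndraw) are flattened into a single
# 1D array of nchain * ndraw values, with a matching 1D weights array;
# chains are not differentiated.
nchain, ndraw = 4, 500
draws_2d = np.random.default_rng(0).normal(size=(nchain, ndraw))

draws = draws_2d.ravel()                         # shape (nchain * ndraw,)
weights = np.full(draws.size, 1.0 / draws.size)  # uniform weights summing to 1

assert draws.shape == weights.shape == (nchain * ndraw,)
```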

# ecdfs
cdf_p = np.full(shape=len(x), fill_value=1/len(x))
cdf_p = np.cumsum(cdf_p)[:-1]
cdf_q = np.cumsum(w/np.sum(w))[:-1]
@OriolAbril (Member) commented:

Would it make a difference to subset [1:] instead? That is, get an array that ends at 1 rather than one that starts at a possibly-zero value. I think the first element is currently potentially 0 and can give rise to NaNs, so doing this would allow us to use sum instead of nansum.
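As a small illustration of the issue under discussion (hypothetical weights, not the PR's actual code): when the first normalized cumulative weight is exactly 0, the term 0 * log2(0) evaluates to NaN, which nansum silently drops but sum propagates.

```python
import numpy as np

# The first normalized cumulative weight is 0, so cdf_q[0] * log2(cdf_q[0])
# is 0 * -inf == nan; np.nansum applies the 0 * log 0 = 0 convention,
# while np.sum propagates the nan.
w = np.array([0.0, 0.2, 0.3, 0.5])
cdf_q = np.cumsum(w / np.sum(w))[:-1]   # array([0. , 0.2, 0.5])
with np.errstate(divide="ignore", invalid="ignore"):
    term = cdf_q * np.log2(cdf_q)

print(np.nansum(term))  # finite value
print(np.sum(term))     # nan
```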

@n-kall (Contributor, author) replied:

> Re storage of the inputs, @ahartikainen: I think what we need here are the log_prior and the log_likelihood (not pointwise) of the posterior samples, so it would probably be best to add log_prior and log_likelihood variables to sample_stats (like we already have lp). In fact, the sum of these two should equal lp, so it might make sense to store only the log_prior and get the log_likelihood by subtracting it.

In simple cases of power-scaling, the log_prior + log_likelihood = lp, but this is not true in general. For example, in a hierarchical model, we suggest only power-scaling the top-level prior (not intermediate), so this subtraction would not work (see paper for more details). In these cases, the 'log_prior' is not the joint global prior, but is only selected priors (which should somehow be specified by the modeller).

@OriolAbril (Member) replied:

Let's keep the initial log_likelihood and log_prior proposals then.

Is it possible to update the slicing, though?

@n-kall (Contributor, author) replied:

> Is it possible to update the slicing, though?

I think this might change the calculation of the metric as defined in https://link.springer.com/content/pdf/10.1007/978-3-319-23525-7_11.pdf (Definition 1 and Section 2.3).

In the paper, 0 * log 0 = 0 is defined (nansum takes care of this). In a quick test, changing the slicing resulted in quite a different output value.
Is it critical that we use np.sum instead of np.nansum? I can spend some more time thinking about how to fix the calculation if needed (unless you already have an idea?).

@OriolAbril (Member) replied:

Using nansum is both less performant and more obscure (e.g. actual NaNs would not be propagated, resulting in potentially nonsensical results). I would try something like the following (the original snippet was truncated mid-expression; the closing np.log2(cdf_q...) terms are a reconstruction, not verified against the final code):

cdf_q = np.cumsum(weights / np.sum(weights))[:-1]
cdf_log_cdf_q = np.empty_like(cdf_q)
cdf_log_cdf_q[0] = 0 if np.isclose(0, cdf_q[0]) else cdf_p[0] * (np.log2(cdf_p[0]) - np.log2(cdf_q[0]))
cdf_log_cdf_q[1:] = cdf_p[1:] * (np.log2(cdf_p[1:]) - np.log2(cdf_q[1:]))

n-kall and others added 2 commits August 18, 2022 (co-authored by Oriol Abril-Pla)
@n-kall (Contributor, author) commented Aug 18, 2022:

@OriolAbril Thanks for all the comments and suggestions! I'll clean it up shortly

n-kall and others added 7 commits August 18, 2022 12:58
@n-kall (Contributor, author) commented Aug 18, 2022:

I realise I left out an example of the imagined usage:

import arviz as az
import numpy as np
import pymc as pm

draws = 2000
chains = 4

data = {"y": np.array([1, 2, 3, 4, 5])}

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    sigma = pm.HalfNormal("sigma", sigma=2.5)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data["y"])
    post = pm.sample(draws, chains=chains)

az.psens(post, component="likelihood")

@OriolAbril (Member) commented Aug 19, 2022:

I left two comments in the discussion above that are still somewhat conceptual, but it already looks good. I'll leave some more specific comments below to start getting the PR ready to merge.

@OriolAbril (Member) left a comment:

This doesn't necessarily need to happen right now, but eventually the PR will need tests and a changelog entry.

n-kall and others added 9 commits August 24, 2022 (fix docs for psens; co-authored by Oriol Abril-Pla)
@n-kall (Contributor, author) commented Jan 11, 2023:

I've made some updates regarding the extraction of the log_likelihood and log_prior. Below is a working example using CmdStanPy.

I think it might make sense to include log_prior as a group in inference data, just as log_likelihood is stored. The current code assumes there is a group called 'log_prior', which is manually added in the example.

Stan model (example.stan):

data {
  int N;
  vector[N] x;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  target += normal_lpdf(mu | 0, 1);
  target += normal_lpdf(sigma | 0, 2.5);
  target += normal_lpdf(x | mu, sigma);
}
generated quantities {
  vector[2] lprior;
  vector[N] log_lik;

  lprior[1] = normal_lpdf(mu | 0, 1);
  lprior[2] = normal_lpdf(sigma | 0, 2.5);

  for (n in 1:N) {
    log_lik[n] = normal_lpdf(x[n] | mu, sigma);
  }
}

Python script:
import cmdstanpy as stan
import arviz as az
import numpy as np

model = stan.CmdStanModel(stan_file = "example.stan")

dat = {"sigma" : 1, "N" : 5, "x" : [1, 0.5, 3, 5, 10]}
fit = model.sample(data=dat)
idata = az.from_cmdstanpy(fit)
lprior = az.extract(idata, var_names="lprior", group="posterior", combined=False)

idata.add_groups(
    log_prior={"lprior": lprior}
)

# prior sensitivity (for all priors)
az.stats.psens(idata, component="prior", var_names=["mu", "sigma"])

# likelihood sensitivity (for joint likelihood)
az.stats.psens(idata, component="likelihood", var_names=["mu", "sigma"])

# prior sensitivity for prior on mu
az.stats.psens(idata, component="prior", var_names=["mu", "sigma"], selection=[0])

# likelihood sensitivity for single observation
az.stats.psens(idata, component="likelihood", var_names=["mu", "sigma"], selection=[1])

@aloctavodia (Contributor) commented:

This PR is definitely a very good reason to include log_prior as a group in InferenceData. It seems you have some pylint issues to solve.

@OriolAbril (Member) commented:

Isn't it enough to have a single total log prior value per sample? If so, I would put it in sample_stats like lp.

@n-kall (Contributor, author) commented Jan 13, 2023:

It can be useful to have lprior as an array holding the different contributions to the joint log prior, so that it is possible to check the sensitivity to changing a subset of the priors. Currently this works with the 'selection' argument; lprior is then an array of shape N_priors x N_draws. I'm not so familiar with the InferenceData schema, so wherever this would make sense to be stored, just let me know and I can adjust the functions accordingly.
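A pure-NumPy sketch of that layout (the density helper, names, and shapes below are illustrative assumptions, not the PR's code):

```python
import numpy as np

def normal_logpdf(x, loc, scale):
    # Log density of Normal(loc, scale), written out to avoid dependencies.
    return -0.5 * np.log(2 * np.pi) - np.log(scale) - 0.5 * ((x - loc) / scale) ** 2

rng = np.random.default_rng(0)
n_draws = 1000
mu = rng.normal(size=n_draws)             # placeholder posterior draws of mu
sigma = np.abs(rng.normal(size=n_draws))  # placeholder posterior draws of sigma

# lprior as an (N_priors, N_draws) array: one row per prior contribution,
# evaluated at the posterior draws.
lprior = np.stack([
    normal_logpdf(mu, 0.0, 1.0),                 # prior on mu
    np.log(2) + normal_logpdf(sigma, 0.0, 2.5),  # half-normal prior on sigma
])

# selection=[0] would then power-scale only the prior on mu:
selected = lprior[[0]].sum(axis=0)

assert lprior.shape == (2, n_draws)
assert selected.shape == (n_draws,)
```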

@codecov bot commented Jan 13, 2023:

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (22c0dcb) 86.70% compared to head (432956a) 86.74%.

Files Patch % Lines
arviz/stats/stats_utils.py 80.00% 2 Missing ⚠️
arviz/stats/stats.py 98.18% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2093      +/-   ##
==========================================
+ Coverage   86.70%   86.74%   +0.04%     
==========================================
  Files         122      122              
  Lines       12640    12703      +63     
==========================================
+ Hits        10959    11019      +60     
- Misses       1681     1684       +3     


@OriolAbril (Member) commented:
I think it would be best to store it in an independent group with the same variables as in the posterior and prior groups, but where instead of samples we store pointwise log prior values. This is very similar to what we already do for the pointwise log likelihood, which is stored in its own group using the same variable names as the posterior_predictive and observed_data groups.

To get the per-sample total log prior, to_stacked_array can be used. For example:

After loading the rugby dataset for example, the posterior is this:

idata = az.load_arviz_data("rugby")
idata.posterior

an xarray Dataset (a dict-like structure) with multiple variables sharing some dimensions, but each with its own independent shape:

<xarray.Dataset>
Dimensions:    (chain: 4, draw: 500, team: 6)
Coordinates:
  * chain      (chain) int64 0 1 2 3
  * draw       (draw) int64 0 1 2 3 4 5 6 7 ... 492 493 494 495 496 497 498 499
  * team       (team) object 'Wales' 'France' 'Ireland' ... 'Italy' 'England'
Data variables:
    home       (chain, draw) float64 0.1642 0.1162 0.09299 ... 0.148 0.2265
    intercept  (chain, draw) float64 2.893 2.941 2.939 ... 2.951 2.903 2.892
    atts_star  (chain, draw, team) float64 0.1673 0.04184 ... -0.4652 0.02878
    defs_star  (chain, draw, team) float64 -0.03638 -0.04109 ... 0.7136 -0.0649
    sd_att     (chain, draw) float64 0.4854 0.1438 0.2139 ... 0.2883 0.4591
    sd_def     (chain, draw) float64 0.2747 1.033 0.6363 ... 0.5574 0.2849
    atts       (chain, draw, team) float64 0.1063 -0.01913 ... -0.2911 0.2029
    defs       (chain, draw, team) float64 -0.06765 -0.07235 ... 0.5799 -0.1986

To get them as a 2d array we can do:

az.extract(idata).to_stacked_array("latent_var", sample_dims=("sample",))
<xarray.DataArray 'home' (sample: 2000, latent_var: 28)>
array([[ 0.16416114,  2.89297934,  0.16729464, ...,  0.2082224 ,
         0.55365865, -0.22098263],
...
       [ 0.22646978,  2.89236378,  0.0439508 , ...,  0.13586028,
         0.57993105, -0.19855837]])
Coordinates:
  * sample      (sample) object MultiIndex
  * chain       (sample) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 3 3 3 3 3 3 3 3 3 3 3
  * draw        (sample) int64 0 1 2 3 4 5 6 7 ... 493 494 495 496 497 498 499
  * latent_var  (latent_var) object MultiIndex
  * variable    (latent_var) object 'home' 'intercept' ... 'defs' 'defs'
  * team        (latent_var) object nan nan 'Wales' ... 'Italy' 'England'

And adding .sum("latent_var") would get the total per sample log prior if these were pointwise log prior contributions.

Imo, choosing which variables to include (or subsets of them) should happen in the first object, which is what users really know: these are the variables they define in their Stan or PyMC models.

I am not sure about what API to use, though. For model comparison, we chose to force users to select a single var_name (unlike all other functions, which accept lists too) because ArviZ doesn't know a priori how to combine the values into a pointwise log likelihood when there are multiple variables: we could consider loo with multivariate observations, loo with univariate observations, even logo... Here, however, we only want the per-sample values, so summing over everything but the sampling dimensions sounds like a good default.
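A minimal NumPy sketch of that default, using placeholder pointwise log-prior arrays shaped like the rugby posterior above (the zero values are purely for shape):

```python
import numpy as np

# Sum pointwise log-prior contributions over every dimension except the
# sampling ones (chain, draw), yielding one total log prior per sample.
nchain, ndraw, nteam = 4, 500, 6
log_prior_home = np.zeros((nchain, ndraw))         # scalar-valued variable
log_prior_atts = np.zeros((nchain, ndraw, nteam))  # per-team variable

total_log_prior = log_prior_home + log_prior_atts.sum(axis=-1)
assert total_log_prior.shape == (nchain, ndraw)
```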

@OriolAbril (Member) commented:

(screenshot of a recursion error traceback)

There is some env-dependent effect that triggers a recursion error. I will continue looking into it tomorrow.

@OriolAbril (Member) commented:
It seems xarray objects can't be passed as kwargs to apply_ufunc; they should either be converted to NumPy arrays or used as positional arguments.

Also, here is the link to the docstring preview: https://arviz--2093.org.readthedocs.build/en/2093/api/generated/arviz.psens.html. The tests aren't super exhaustive, but I think they are good for now.

@n-kall (Contributor, author) commented Dec 1, 2023:

@OriolAbril Thanks for all the help with this PR! Greatly appreciated!

@OriolAbril OriolAbril merged commit 08fded0 into arviz-devs:main Dec 14, 2023
6 of 9 checks passed
@aloctavodia aloctavodia mentioned this pull request Feb 12, 2024