[WIP] Add power-scaling sensitivity analysis diagnostic #2093
Conversation
@OriolAbril we probably should add `prior_loglikelihood` to the InferenceData schema?
Thanks @n-kall! I left some comments, a mix of more general and more specific ones, mostly to try and understand what is going on exactly.
Re storage of the inputs @ahartikainen. I think here what we need are the log_prior and the log_likelihood (not pointwise) of the posterior samples, so it would probably be best to add `log_prior` and `log_likelihood` variables to `sample_stats` (like we already have `lp`). In fact, the sum of these two should be equal to `lp`, so it might make sense to store only the `log_prior`, for example, and get the log likelihood by subtracting it.
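A toy sketch of that layout (the `log_prior` variable is the proposal here, not an existing part of the schema):

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)
shape = (4, 500)  # (nchain, ndraw)

idata = az.from_dict(
    posterior={"mu": rng.normal(size=shape)},
    sample_stats={
        "lp": rng.normal(size=shape),         # already part of the schema
        "log_prior": rng.normal(size=shape),  # proposed new variable
    },
)

# in the simple case, the total log likelihood could then be recovered as
log_likelihood = idata.sample_stats["lp"] - idata.sample_stats["log_prior"]
```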
arviz/stats/diagnostics.py (outdated diff)
```python
lower_cjs = _cjs_dist(x=draws, weights=lower_weights)
upper_cjs = _cjs_dist(x=draws, weights=upper_weights)
```
Should the `x=draws` part be `x=ary`? Or should the call signature be `(draws, *...`? Also, should `x` and `weights` have the same shape? I think they are currently `(nchain, ndraw)` and `nchain*ndraw` respectively.
In general, it would be helpful to keep the same names throughout the functions, e.g. use `draws` here and in `cjs_dist` instead of `draws`/`ary`/`x` depending on the function.
I've changed everything to `draws` and `weights`.
Am I correct in assuming that the internal functions expect everything to be 1D arrays of length `nchain*ndraw`?
> Am I correct in assuming that the internal functions expect everything to be 1D arrays of length `nchain*ndraw`?

Yes, that should be the case. There's no differentiation between chains, so all draws are included in the 1D array.
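For example (a toy illustration of that convention, not code from the PR):

```python
import arviz as az

idata = az.load_arviz_data("centered_eight")
draws_2d = idata.posterior["mu"].values  # shape (nchain, ndraw)
draws_1d = draws_2d.reshape(-1)          # the flat (nchain * ndraw,) array the helpers expect
```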
arviz/stats/diagnostics.py (outdated diff)
```python
# ecdfs
cdf_p = np.full(shape=len(x), fill_value=1/len(x))
cdf_p = np.cumsum(cdf_p)[:-1]
cdf_q = np.cumsum(w/np.sum(w))[:-1]
```
Would it make a difference to subset with `[1:]` instead? That is, get an array starting from the first element and ending at 1. I think right now the first element is potentially 0 and can give rise to nans, so doing this would allow us to use `sum` instead of `nansum`.
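A small numeric illustration of the difference (toy weights, not from the PR):

```python
import numpy as np

weights = np.array([0.0, 0.2, 0.3, 0.5])  # smallest draw happens to get zero weight
cdf_q = np.cumsum(weights / np.sum(weights))

print(cdf_q[:-1])  # [0.  0.2 0.5] -> leading 0 feeds a 0 * log(0) = nan term
print(cdf_q[1:])   # [0.2 0.5 1. ] -> starts at the first element and ends at 1
```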
> Re storage of the inputs @ahartikainen. I think here what we need are the log_prior and the log_likelihood (not pointwise) of the posterior samples, so it would probably be best to add `log_prior` and `log_likelihood` variables to `sample_stats` (like we already have `lp`). In fact, the sum of these two should be equal to `lp`, so it might make sense to store only the `log_prior`, for example, and get the log likelihood by subtracting it.
In simple cases of power-scaling, `log_prior + log_likelihood = lp`, but this is not true in general. For example, in a hierarchical model we suggest power-scaling only the top-level prior (not the intermediate ones), so this subtraction would not work (see the paper for more details). In these cases, the `log_prior` is not the joint global prior but only the selected priors (which should somehow be specified by the modeller).
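As a toy illustration of "selected priors" (a hypothetical two-level model; all names are made up for the example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu = rng.normal(size=1000)      # top-level parameter
theta = rng.normal(mu, 1.0)     # intermediate parameter with prior p(theta | mu)

# only the top-level prior p(mu) enters log_prior; p(theta | mu) is left out,
# so log_prior + log_likelihood no longer adds up to lp
log_prior = stats.norm(0.0, 1.0).logpdf(mu)
```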
Let's keep the initial `log_likelihood` and `log_prior` proposals then.
Is it possible to update the slicing though?
> Is it possible to update the slicing though?
I think this might change the calculation of the metric as defined in https://link.springer.com/content/pdf/10.1007/978-3-319-23525-7_11.pdf (Definition 1, and section 2.3). In the paper, 0 * log 0 = 0 is defined (`nansum` takes care of this). In a quick test, changing the slicing resulted in quite a different output value.
Is it critical that we use `np.sum` instead of `np.nansum`? I can spend some more time thinking about how to fix the calculation if needed (unless you have an idea already?).
Using `nansum` is both less performant and more obscure (e.g. actual nans would not be propagated, resulting in potentially nonsensical results). I would try something like this (the last two expressions were cut off here; the missing denominator term is assumed from Definition 1 of the paper):

```python
cdf_q = np.cumsum(weights / np.sum(weights))[:-1]
cdf_log_cdf_q = np.empty_like(cdf_q)
# treat the potentially zero first element explicitly instead of relying on nansum
cdf_log_cdf_q[0] = 0 if np.isclose(0, cdf_q[0]) else cdf_p[0] * (np.log2(cdf_p[0]) - np.log2(0.5 * (cdf_p[0] + cdf_q[0])))
cdf_log_cdf_q[1:] = cdf_p[1:] * (np.log2(cdf_p[1:]) - np.log2(0.5 * (cdf_p[1:] + cdf_q[1:])))
```
@OriolAbril Thanks for all the comments and suggestions! I'll clean it up shortly.
I realise I left out an example of the imagined usage.
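Something along these lines (a hypothetical sketch; the `component` argument and the exact call are placeholders rather than a final API):

```python
import arviz as az

# hypothetical sketch: assume `idata` also stores log prior evaluations
# as discussed above (centered_eight only ships a log_likelihood group)
idata = az.load_arviz_data("centered_eight")

# imagined calls; a "component" argument would select what gets power-scaled
prior_sens = az.psens(idata, component="prior")
likelihood_sens = az.psens(idata, component="likelihood")
```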
I left two still somewhat conceptual comments in the discussion above, but it already looks good. I'll leave some more specific comments below to start getting the PR ready to merge.
This doesn't necessarily need to happen right now, but eventually it will need tests and an entry in the changelog.
I've made some updates regarding the extraction of the log_likelihood and log_prior. Below is a working example using CmdStanPy. I think it might make sense to include log_prior as a group in InferenceData, just as log_likelihood is stored. The current code assumes there is a group called `log_prior`, which is manually added in the example.
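A minimal sketch of that flow (the file names and the `lprior` generated quantity are illustrative assumptions):

```python
import arviz as az
from cmdstanpy import CmdStanModel

# assumed: the Stan model defines pointwise `log_lik` values and a generated
# quantity `lprior` evaluating the (selected) log prior at each draw
model = CmdStanModel(stan_file="model.stan")
fit = model.sample(data="data.json")

idata = az.from_cmdstanpy(posterior=fit, log_likelihood="log_lik")

# manually add the log prior evaluations as their own group,
# mirroring how the log_likelihood group is stored
idata.add_groups(log_prior=idata.posterior[["lprior"]])
```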
This PR is definitely a very good reason to include `log_prior` in the InferenceData schema.
Isn't it enough to have a single total log prior value per sample? If so, I would put it in `sample_stats` like `lp`.
It can be useful to have the `lprior` as an array with the different contributions to the joint log prior, so that it is possible to check the sensitivity to changing a subset of the priors. Currently this works with the `selection` argument; the `lprior` is then an array of shape `N_priors x N_draws`. I'm not so familiar with the InferenceData schema, so wherever this would make sense to be stored, just let me know and I can adjust the functions accordingly.
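Roughly like this (toy arrays; the `selection` behaviour is as described above, the names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
lp_mu, lp_tau = rng.normal(size=(2, 1000))  # per-prior log prior evaluations
lprior = np.vstack([lp_mu, lp_tau])         # shape (N_priors, N_draws)
prior_names = ["mu", "tau"]

# selection=["mu"] would power-scale only the prior on mu
selection = ["mu"]
idx = [prior_names.index(name) for name in selection]
selected_log_prior = lprior[idx].sum(axis=0)  # joint log prior of the selected priors
```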
Codecov Report

```
@@            Coverage Diff             @@
##             main    #2093      +/-   ##
==========================================
+ Coverage   86.70%   86.74%   +0.04%
==========================================
  Files         122      122
  Lines       12640    12703      +63
==========================================
+ Hits        10959    11019      +60
- Misses       1681     1684       +3
==========================================
```
I think it would be best to store it in an independent group with the same variables as in the posterior and prior groups, but where instead of samples we store pointwise log prior values. This is very similar to what we already do for the pointwise log likelihood, where we store the pointwise log likelihood values in their own group using the same variable names as in the posterior predictive and observed data groups.

To get the per sample total log prior, we can stack all the variables into a single array and sum over them. After loading the rugby dataset, for example, the posterior is this:

```python
idata = az.load_arviz_data("rugby")
idata.posterior
```

an xarray Dataset (a dict-like structure) with multiple variables, sharing some dimensions but each with its own independent shape. To get them as a 2d array we can do:

```python
az.extract(idata).to_stacked_array("latent_var", sample_dims=("sample",))
```

And adding a `.sum("latent_var")` at the end gives the per sample totals.

Imo, choosing the variables to include (or subsets of them) should happen in the first object, which is what the users really know: these are the variables they define in their Stan or PyMC models. I am not sure about what API to use though. For model comparison, we chose to force users to select a single variable (via the `var_name` argument).
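Putting those pieces together (runnable as-is; the posterior group stands in for the would-be log_prior group, which has the same structure):

```python
import arviz as az

idata = az.load_arviz_data("rugby")

# stack all variables into one (sample, latent_var) array, then sum per sample;
# applied to a log_prior group this would give the per sample total log prior
stacked = az.extract(idata).to_stacked_array("latent_var", sample_dims=("sample",))
per_sample_total = stacked.sum("latent_var")
```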
It seems xarray objects can't be passed as kwargs with `apply_ufunc`; they should either be converted to numpy arrays or used as positional arguments. Also, here is the link to the docstring preview: https://arviz--2093.org.readthedocs.build/en/2093/api/generated/arviz.psens.html. The tests aren't super exhaustive, but I think they are good for now.
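For illustration (a toy reduction standing in for `_cjs_dist`, not the PR's actual code): `xr.apply_ufunc` aligns and broadcasts positional arguments, while anything in `kwargs` is forwarded to the function untouched, so plain numpy arrays belong there.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
draws = xr.DataArray(rng.normal(size=(4, 500)), dims=("chain", "draw"))
weights = np.full(4 * 500, 1 / (4 * 500))  # plain numpy array, fine as a kwarg

def weighted_mean(values, weights):
    # toy stand-in for a helper that works on flat draws
    return np.sum(values.ravel() * weights)

result = xr.apply_ufunc(
    weighted_mean,
    draws,                                # xarray input: positional, gets aligned
    input_core_dims=[["chain", "draw"]],  # collapse both dims into the function call
    kwargs={"weights": weights},          # numpy input: passed through untouched
)
```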
@OriolAbril Thanks for all the help with this PR! Greatly appreciated!
This PR adds functionality for power-scaling sensitivity analysis.
Details of the approach are described in Kallioinen et al. (2022) https://arxiv.org/abs/2107.14054, and the corresponding R package (with additional functionality) is available at https://github.com/n-kall/priorsense
This PR adds the base functionality required to calculate the sensitivity diagnostic, which I have labelled `psens` (feel free to change the naming). This diagnostic can be used to check whether the posterior is sensitive to changes in the prior or likelihood without needing to refit the model.
In order to calculate the diagnostic, evaluations of the log_prior and log_likelihood at the posterior draws are required. Currently the log_likelihood is available, but the log_prior is not saved by PyMC. So, for `psens` to function properly, PyMC would need a way to save the log_prior evaluations when fitting a model.
📚 Documentation preview 📚: https://arviz--2093.org.readthedocs.build/en/2093/