Plotting conditional posteriors #41
Hi, thanks for getting in contact and offering help. This is a good reference for ICE (and other methods) https://christophm.github.io/interpretable-ml-book/ice.html
We can use ArviZ and the returned InferenceData in general (like for sampling diagnostics), but for the PDP and ICE plots we need to compute new predictions, and for that we need the fitted trees that are stored in the BART variable and not in the InferenceData.
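To make the distinction concrete, the idea behind an ICE plot can be sketched with plain NumPy: hold one feature on a grid of values while keeping each instance's remaining features fixed, and query the model for predictions at every grid point. The `predict` function below is a toy stand-in for BART's fitted trees (an assumption for illustration only, not the pymc-bart API):

```python
import numpy as np

def ice_curves(predict, X, feature, grid):
    """Compute ICE curves: one curve per instance, varying one feature over a grid."""
    curves = np.empty((X.shape[0], grid.size))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # fix the feature of interest for every instance
        curves[:, j] = predict(X_mod)
    return curves  # shape (n_instances, n_grid)

# Toy stand-in predictor (NOT BART): nonlinear in x0, additive in x1
predict = lambda X: np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(50, 2))
grid = np.linspace(0, 1, 25)
curves = ice_curves(predict, X, feature=0, grid=grid)
pdp = curves.mean(axis=0)  # the PDP is simply the average of the ICE curves
```

This is why the InferenceData alone is not enough: the loop needs a callable model (the fitted trees) to produce fresh predictions on modified inputs.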
Thanks for your response! I am not sure if the … I will have a go at using ArviZ and InferenceData and post here.
In arviz if you use … I think what you want is to fit …
I think variable selection would be useful to determine the predictive power of a variable as a whole; what I am after is examining how predictors influence outcomes. In this example it would be … I am realizing that the 'plot_posterior' example was wrong, it only filters variables; it's been a while since I used pymc. I think a better example would be this.
These are a few approaches to estimating how predictors influence outcomes using PyMC-BART
Another potential approach could be to do something like https://github.com/yannmclatchie/kulprit, but we don't have a theory for BART for that (variable importance is only loosely inspired by that).
Thank you for your response (and patience!).
Correct me if I am not getting it right: I think it's because (at least in my field) one often wants to understand the interactions, not select the variables. Following from the example in the tutorial, I plotted it using the ICE method. Focusing on humidity, one can see that there is some variability (tree paths); however, it's not clear whether this is caused by the influence of … Thank you for pointing out the kulprit package! I have actually been looking for exactly that in Python for a while :)
Focusing on humidity, we can see that the pattern is essentially the same for all instances; it's just shifted up or down from the mean. This shows that there are no interactions (or at least we cannot detect them). In other words, no matter at which values we fix the rest of the variables, the effect of humidity on the rental of bikes seems to be the same: flat (or a slightly negative slope) at the beginning, followed by a slightly steeper (and negative) slope for humidity higher than ~0.6. Understanding interactions is very relevant for us too. We have some ideas for making it more straightforward for users to do that, but unfortunately we are still in the early development stage and we also need to test those ideas. Let me know if you are interested in testing those ideas on your own datasets and I will contact you when we have something ready.
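The "shifted up or down from the mean" reading can be checked numerically with centered ICE curves (c-ICE): anchor every curve at the first grid point and look at the spread across instances. If all centered curves coincide, the feature's effect is identical for every instance, i.e. no detectable interaction. Below is a minimal sketch with two toy stand-in models (not BART, an assumption for illustration), one purely additive and one with an interaction:

```python
import numpy as np

def ice(predict, X, feature, grid):
    """ICE curves: one per instance, varying one feature over a grid."""
    curves = np.empty((X.shape[0], grid.size))
    for j, v in enumerate(grid):
        Xm = X.copy()
        Xm[:, feature] = v
        curves[:, j] = predict(Xm)
    return curves

def centered_ice(curves):
    """Center each ICE curve at the first grid point (c-ICE)."""
    return curves - curves[:, :1]

# Toy models: additive (no interaction) vs. multiplicative (interaction)
def additive(X):
    return np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]

def interacting(X):
    return np.sin(3 * X[:, 0]) * X[:, 1]

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(40, 2))
grid = np.linspace(0, 1, 20)

# Max std across instances of the centered curves: ~0 means no interaction
spread_add = centered_ice(ice(additive, X, 0, grid)).std(axis=0).max()
spread_int = centered_ice(ice(interacting, X, 0, grid)).std(axis=0).max()
```

For the additive model the centered curves are identical (zero spread), which is exactly the pattern described for humidity above; the interacting model shows a clearly positive spread.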
Thanks! Sure, I'd definitely be happy to try things on some of my datasets :)
Hi! Thanks for creating this great package :)
I think one important aspect of understanding models is the ability to explore conditional posteriors. In the tutorial you mention the `kind="ice"` option; however, it is unclear how this can be used to systematically understand the model posterior. In arviz I'd for example use `plot_posterior()` with the `filter_vars` argument to explore interactions. Is there a similar way in `pmc.plot_dependance()`? Or, can one easily use arviz with the estimated `InferenceData` object?

I think this would be a very handy addition to the documentation. Once I understand it I'd be happy to write an example.