# check sensitivity to priors specification #189
It would be interesting to include some methods to check the parameters' sensitivity to different prior specifications. I am not sure if there is any "standard default" method for doing so.
---

Based on my limited understanding, it seems that the "best" way (or the original default way) to use priors in Bayesian modelling is to provide informative priors, based for instance on the literature or on previous data. At the same time, people tend to stick with the default mildly informative priors centred around 0. I feel like one of the explanations (besides being "easier" and less costly) is that testing a model with the prior that there *is* an effect, although perfectly fine from a statistical and theoretical perspective, feels kind of awkward to the average psychologist: it feels like biasing the analysis toward the presence of an effect (often the Grail we are looking for). That being said, I find the idea of providing counter-intuitive priors interesting, to get a sense of how strongly the data speak in favour of an effect. For instance, if I believe that the correlation between x and y is positive, I would test the correlation with a prior of a negative correlation to see if, despite this prior, the posterior remains positive. While this is an interesting thing to do in one's own analysis, I then wondered about the possibility of automating and "defaulting" such a procedure: basically, take a model, update it with a prior opposite to the effect, and see how much the posterior has shifted. I recently made some advances on that, so I will probably start implementing a clean version of such an (exploratory) function soon. I am also setting up some simulations to see how such an index of sensitivity behaves. Will keep you updated :) Let me know if that makes sense to you, or am I reinventing the wheel here 😄
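For concreteness, here is a minimal sketch of what such an "antagonistic prior" check could look like when done by hand with rstanarm (the data and the prior location/scale are made-up illustration values, not a proposed default):

```r
library(rstanarm)

# Simulated data with a truly positive relationship
set.seed(123)
x <- rnorm(200)
y <- 0.3 * x + rnorm(200)
dat <- data.frame(x, y)

# Fit once with the default weakly informative prior (centred at 0)...
fit_default <- stan_glm(y ~ x, data = dat, refresh = 0)

# ...and once with an "antagonistic" prior that expects a *negative* effect
fit_against <- stan_glm(y ~ x, data = dat,
                        prior = normal(location = -0.5, scale = 0.2),
                        refresh = 0)

# If the posterior for x stays clearly positive despite the negative prior,
# the data speak strongly in favour of the effect
round(coef(fit_default)["x"], 2)
round(coef(fit_against)["x"], 2)
```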
---

This sounds interesting - please keep us updated. I think the reason priors are kept relatively wide and centred at zero is that when priors are very informed, almost nothing is gained from the data, and researchers want to learn from the data, not just test their hypotheses. So if we have a prior that is stronger than our data, the posterior will be mostly a reflection of the prior. This is actually quite beautiful in a phil-of-sci kind of way: we want to learn, and leave room for being wrong. I think an alternative to testing different priors directly is using `bayesfactor_restricted()`:

```r
library(rstanarm)
library(bayestestR)

X <- rnorm(100)
Y <- X + rnorm(100, sd = 4)
df <- data.frame(X, Y)

fit <- stan_glm(Y ~ X, df, family = gaussian())

# Compare two order restrictions: a positive vs. a negative slope
hyps <- c(
  "X > 0",
  "X < 0"
)

test <- bayesfactor_restricted(fit, hypothesis = hyps)
#> Computation of Bayes factors: sampling priors, please wait...

test
#> # Bayes Factor (Order-Restriction)
#>
#> Hypothesis   P(Prior)   P(Posterior)   Bayes Factor
#>      X > 0        0.5           0.81           1.63
#>      X < 0        0.5           0.19           0.38
#>
#> * Bayes factors for the restricted model vs. the un-restricted model.

# Relative evidence for a positive vs. a negative slope
test$BF[1] / test$BF[2]
#> [1] 4.273599
```

Created on 2019-07-01 by the reprex package (v0.3.0)

Also, similar results from `pd()`:

```r
test_pd <- pd(fit)

# pd is in percent, so pd / (100 - pd) gives the posterior odds of the
# effect being positive rather than negative
test_pd$pd[2] / (100 - test_pd$pd[2])
#> 4.12
```
---

brms' default is no prior definition, i.e. uninformative (flat) priors, afaik.
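As a quick illustration, brms lets you inspect which defaults would apply; a minimal sketch, reusing the `df` from the reprex above:

```r
library(brms)

# Show brms' default priors for this model: the population-level ("b")
# coefficients get flat (improper) priors unless specified otherwise
get_prior(Y ~ X, data = df, family = gaussian())
```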
---

Right, but this is given a particular initial prior... What if I want to see how this particular restricted-BF value (or pd) is affected by a different prior? Anyway, the preliminary tests that I've run so far investigated the shift in point-estimates (the median). The magnitude of this shift seems to capture (i.e., is affected by) both sample size and noise in the data, resulting in an interesting "composite" index of "robustness". Further investigation is needed though, so don't take my word for it 😬
---

I was merely suggesting a default from a usage, rather than a software, perspective :)

---

So you want to see how the posterior distribution is robust to priors? Or the BF? (Note that the BF depends on two sets of priors, one for each model... so I'm not sure how to even approach this.)

---

Ok, wasn't entirely sure what you were meaning ;-)

---

It's totally plausible. I wonder indeed how such an index of sensitivity relates to other indices, such as uncertainty and diagnostic measures. One way to verify is to check :)
---

I saw somewhere that the Stan folks recommend a check: if the width of your prior is less than 10 times the width of your posterior, your prior is too informed.
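A minimal sketch of that check, reusing the `fit` from the reprex above. Measuring "width" as the SD is my assumption, not an official Stan recipe:

```r
# Width of the prior on the slope (rstanarm reports the autoscaled value;
# this field may be NULL if autoscaling was turned off)
prior_sd <- prior_summary(fit)$prior$adjusted_scale

# Width of the posterior for the slope
posterior_sd <- sd(as.matrix(fit)[, "X"])

# Rule of thumb quoted above: if this ratio is below ~10,
# the prior may be (too) informative relative to the data
prior_sd / posterior_sd
```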
---

That's probably why they autoscale the priors by default...

---

Yes, indeed. See also my "case study" here: https://www.researchgate.net/publication/322420462_Bayesian_Regression_Models_in_R_Choosing_informative_priors_in_rstanarm But it also depends on which parameters get a prior; a prior on the intercept might have a stronger impact.

---

@strengejacke This is fantastic! Thanks!
---

Alrighty! After some internal testing and tweaking, I finally came up with an index: sensitivity-10 (**S10**). The 10 stands for the "magnitude", i.e., the number of prior SDs by which the prior is shifted to the opposite side of its median.

Example: let's say the posterior has a median of 2, and the initial prior was normal(0, 1.5). In this case, we refit the same model with a prior located at 1.5 (the SD) * 10 (the magnitude) * -1 * sign(median); i.e., the new prior will be normal(-15, 1.5) (same SD, only the location changes). We then extract the "new index", for instance the median, of this updated antagonistic model. Let's say that for this new model the median of the posterior is now 1. The sensitivity index corresponds to a normalized difference, i.e., the ratio by which the index has shifted.

The minimum S10 is 0 (no shift in the index) and it has no maximum. Here are some plots showing the distributions of 6000 observations related to the sample size and "noise" in a simple linear regression:

- **Meaning of "noise"**: just to give you an idea of what the noise values correspond to. *(figure)*
- **Distributions related to noise groups** *(figure)*
- **Distributions related to sample size** *(figure)*
- **Effect of sample size and noise** *(figure)*

**Conclusion**: Sensitivity to priors seems to be an interesting "composite" index of the robustness of indices (i.e., it captures information about, by being sensitive to, both noise and sample size). It can be easily applied to virtually all indices (point-estimates, uncertainty, significance, existence...), and can be interpreted quite easily (it is the ratio by which an index has shifted). Based on the distributions above, it seems that an S10 up to somewhere around 0.25/0.33 could be considered "okay" (?)... I will upload the data and all on circus, and the function to bayestestR.
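A minimal sketch of the procedure described above. The helper name `s10()` is hypothetical, and normalizing the shift by the original value is my reading of "the ratio by which the index has shifted"; the eventual bayestestR implementation may differ:

```r
library(rstanarm)

# Hypothetical helper: refit with an "antagonistic" prior shifted by
# `magnitude` prior-SDs to the opposite side of the observed effect
# (assumes a single-predictor rstanarm model, so the slope is coef()[2])
s10 <- function(model, prior_sd = 1.5, magnitude = 10) {
  original <- coef(model)[2]  # posterior median of the slope
  new_location <- -1 * sign(original) * magnitude * prior_sd
  refit <- update(model,
                  prior = normal(location = new_location, scale = prior_sd))
  shifted <- coef(refit)[2]
  abs(original - shifted) / abs(original)  # normalized shift
}
```

With the worked example above (a median of 2 shifting to 1 under the normal(-15, 1.5) prior), this reading gives |2 - 1| / |2| = 0.5.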
---

I am not sure how/if this can be useful yet ^^ but with availability comes usage :) (cf. `find_distribution`, for which we found a usage not originally planned)

---

Is this for true effects, or also null effects? Only for an initial prior centred at zero?
---

The initial effect (i.e., the median of the posterior of the original model) varied between 0 and 1 depending on the noise level; "null effects" would have a pd close to 50%. Didn't test other initial priors.
---

Don't think I understand: the real effect (from which the data were "sampled") was always non-zero? Wonder if this would change anything... 🤔
---

I generated a more or less noisy relationship with a million observations (corresponding to different coefs; see the figure above showing the different noise values). Then I randomly drew smaller samples from that parent population, resulting in an infinity of different coefs across the subsamples (although the less noise in the parent population, the less variability between subsamples).
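A rough sketch of that simulation design (the specific noise level and sample sizes here are placeholders, not the values used in the figures above):

```r
set.seed(42)

# Parent population: a more or less noisy linear relationship
n_parent <- 1e6
noise <- 2  # placeholder noise level
parent <- data.frame(x = rnorm(n_parent))
parent$y <- parent$x + rnorm(n_parent, sd = noise)

# Randomly draw smaller subsamples, each yielding a different coefficient
coefs <- sapply(c(30, 100, 300), function(n) {
  subsample <- parent[sample.int(n_parent, n), ]
  coef(lm(y ~ x, data = subsample))["x"]
})
coefs
```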