---
title: "Chapter 13. Goals, Power, and Sample Size"
author: "A Solomon Kurz"
date: "`r format(Sys.Date())`"
output:
github_document
---
```{r, echo = F}
knitr::opts_chunk$set(fig.retina = 2.5)
knitr::opts_chunk$set(fig.align = "center")
```
# Goals, Power, and Sample Size
> Researchers collect data in order to achieve a goal. Sometimes the goal is to show that a suspected underlying state of the world is credible; other times the goal is to achieve a minimal degree of precision on whatever trends are observed. Whatever the goal, it can only be probabilistically achieved, as opposed to definitely achieved, because data are replete with random noise that can obscure the underlying state of the world. Statistical *power* is the probability of achieving the goal of a planned empirical study, if a suspected underlying state of the world is true. [@kruschkeDoingBayesianData2015, p. 359, *emphasis* in the original]
## The will to power
"In this section, [Kruschke laid out a] framework for research and data analysis [that might lead] to a more precise definition of power and how to compute it" (p. 360).
### Goals and obstacles.
The three research goals Kruschke dealt with in this chapter were:
* to reject a null value for a parameter,
* to confirm the legitimacy of a particular parameter value, and
* to estimate a parameter with reasonable precision.
All these could, of course, be extended to contexts involving multiple parameters and all of these were dealt with using 95% HDIs.
### Power.
> Because of random noise, the goal of a study can be achieved only probabilistically. The probability of achieving the goal, given the hypothetical state of the world and the sampling plan, is called the *power* of the planned research. In traditional null hypothesis significance testing (NHST), power has only one goal (rejecting the null hypothesis), and there is one conventional sampling plan (stop at predetermined sample size) and the hypothesis is only a single specific value of the parameter. In traditional statistics, that is *the* definition of power. That definition is generalized in this book to include other goals, other sampling plans, and hypotheses that involve an entire distribution on parameters. (p. 361, *emphasis* in the original)
Three primary methods to increase power are:
* reducing measurement error,
* increasing the effect size, and
* increasing the sample size.
Kruschke then laid out a five-step procedure to compute power within a Bayesian workflow.
1. Use theory/prior information to specify hypothetical distributions for all parameter values in the model.
2. Use those distributions to generate synthetic data according to the planned sampling method.
3. Fit the proposed model--including the relevant priors--with the synthetic data.
4. Use the posterior to determine whether we attained the research goal.
5. Repeat the procedure many times (i.e., using different `set.seed()` values) to get a distribution of results.
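To make those steps concrete before we dig into Kruschke's examples, here's a minimal sketch for a single coin. The particulars (the mode of .65, the concentration of 200, the sample size of 50, and the 95% equal-tailed interval standing in for a proper HDI) are all hypothetical stand-ins of our own.
```{r}
simulate_once <- function(seed, n = 50, omega = .65, kappa = 200) {
  set.seed(seed)
  # step 1: draw theta from the hypothetical parameter distribution
  theta <- rbeta(1,
                 shape1 = omega * (kappa - 2) + 1,
                 shape2 = (1 - omega) * (kappa - 2) + 1)
  # step 2: generate synthetic data with the planned sample size
  z <- rbinom(1, size = n, prob = theta)
  # step 3: with a flat beta(1, 1) prior, the posterior is beta(z + 1, n - z + 1);
  # a 95% equal-tailed interval serves as a quick stand-in for the HDI
  interval <- qbeta(c(.025, .975), shape1 = z + 1, shape2 = n - z + 1)
  # step 4: did we achieve the goal of excluding theta = .5?
  interval[1] > .5
}
# step 5: repeat with many different seeds and summarize
mean(sapply(1:100, simulate_once))
```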
### Sample size.
> *The best that a large sample can do is exactly reflect the data-generating distribution.* If the data-generating distribution has considerable mass straddling the null value, then the best we can do is get estimates that include and straddle the null value. As a simple example, suppose that we think that a coin may be biased, and the data-generating hypothesis entertains four possible values of $\theta$, with $p (\theta = 0.5) = 25 \%$, $p (\theta = 0.6) = 25 \%$, $p (\theta = 0.7) = 25 \%$, and $p (\theta = 0.8) = 25 \%$. Because $25 \%$ of the simulated data come from a fair coin, the maximum probability of excluding $\theta = 0.5$, even with a huge sample, is $75 \%$.
>
> Therefore, when planning the sample size for an experiment, it is crucial to decide what a realistic goal is. If there are good reasons to posit a highly certain data-generating hypothesis, perhaps because of extensive previous results, then a viable goal may be to exclude a null value. On the other hand, if the data-generating hypothesis is somewhat vague, then a more reasonable goal is to attain a desired degree of precision in the posterior. (p. 364, *emphasis* in the original)
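We can get a feel for that $75\%$ ceiling with a quick simulation of our own. In this sketch, we draw $\theta$ from the four-point data-generating hypothesis, simulate a huge number of flips for each draw, and check how often a flat-prior 95% equal-tailed interval (a quick stand-in for the HDI) falls entirely above $\theta = 0.5$.
```{r}
set.seed(13)
n_sim <- 1000
n     <- 1e4  # a huge number of flips per simulation
# draw theta from the four-point data-generating hypothesis
theta <- sample(c(.5, .6, .7, .8), size = n_sim, replace = TRUE)
# simulate the flips
z <- rbinom(n_sim, size = n, prob = theta)
# lower bound of a flat-prior 95% equal-tailed interval
lower <- qbeta(.025, shape1 = z + 1, shape2 = n - z + 1)
# the proportion of intervals excluding theta = .5 hovers near the 75% ceiling
mean(lower > .5)
```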
### Other expressions of goals.
I'm going to skip over these.
> In the remainder of the chapter, it will be assumed that the goal of the research is estimation of the parameter values, starting with a viable prior. The resulting posterior distribution is then used to assess whether the goal was achieved. (p. 366)
## Computing power and sample size
> As our first worked-out example, consider the simplest case: Data from a single coin. Perhaps we are polling a population and we want to precisely estimate the preferences for candidates A or B. Perhaps we want to know if a drug has more than a 50% cure rate. (p. 366)
### When the goal is to exclude a null value.
> Usually it is more intuitively accessible to get prior data, or to think of idealized prior data, than to directly specify a distribution over parameter values. For example, based on knowledge about the application domain, we might have 2000 actual or idealized flips of the coin for which the result showed 65% heads. Therefore we'll describe the data-generating hypothesis as a beta distribution with a mode of 0.65 and concentration based on 2000 flips after a uniform "proto-prior": $\operatorname{beta} (\theta | 0.65 \cdot (2000 - 2) + 1, (1 - 0.65) \cdot (2000 - 2) + 1)$.
We might express that in code like this.
```{r, fig.width = 4, fig.height = 2, message = F, warning = F}
library(tidyverse)
kappa <- 2000
omega <- .65
tibble(theta = seq(from = 0, to = 1, by = .001)) %>%
mutate(prior = dbeta(theta,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)) %>%
ggplot(aes(x = theta, ymin = 0, ymax = prior)) +
geom_ribbon(size = 0, fill = "grey67") +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = "Behold our prior. It's rather peaked.",
x = expression(theta)) +
theme(panel.grid = element_blank())
```
If we wanted to take some random draws from that prior, say 5, we'd do something like this.
```{r}
n <- 5
set.seed(13)
rbeta(n,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)
```
Now let's just take one draw and call it `bias`.
```{r}
n <- 1
set.seed(13)
bias <-
rbeta(n,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)
print(bias)
```
Do note that whereas Kruschke based his discussion on a bias of 0.638, we're moving forward with our randomly drawn `r round(bias, 3)`. Anyway, now we
> simulate flipping a coin with that bias $N$ times. The simulated data have $z$ heads and $N − z$ tails. The proportion of heads, $z/N$, will tend to be around [`r round(bias, 3)`], but will be higher or lower because of randomness in the flips. (p. 367)
```{r}
# pick some large number
n <- 1e3
set.seed(13)
tibble(flips = rbernoulli(n = n, p = bias)) %>%
summarise(n = n(),
z = sum(flips)) %>%
mutate(`proportion of heads` = z / n)
```
And indeed our samples did tend around $\theta =.643$. Had we increased our number of draws by an order of magnitude or two, our proportion of heads would have been even closer to the true data-generating value.
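If you'd like to check that claim, just re-run the simulation with more draws.
```{r}
set.seed(13)
# with 1e5 draws, the proportion of heads lands even closer to `bias`
mean(rbernoulli(n = 1e5, p = bias))
```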
Though he presented Table 13.1 in this section, Kruschke walked through how he came to those values in the following sections. We'll get to them in just a bit.
### Formal solution and implementation in R.
I've been playing around with this a bit. If you look closely at the code block on page 369, you'll see that Kruschke's `minNforHDIpower()` function requires the `HDIofICDF()` function from his `DBDA2E-utilities.R` file, which we usually recast as `hdi_of_icdf()`.
```{r}
hdi_of_icdf <- function(name, width = .95, tol = 1e-8, ... ) {
incredible_mass <- 1.0 - width
interval_width <- function(low_tail_prob, name, width, ...) {
name(width + low_tail_prob, ...) - name(low_tail_prob, ...)
}
opt_info <- optimize(interval_width, c(0, incredible_mass),
name = name, width = width,
tol = tol, ...)
hdi_lower_tail_prob <- opt_info$minimum
return(c(name(hdi_lower_tail_prob, ...),
name(width + hdi_lower_tail_prob, ...)))
}
```
Just to warm up, consider a beta distribution for which $\omega = .5$ and $\kappa = 2000$. Here are the 95% HDIs.
```{r}
omega <- .5
hdi_of_icdf(name = qbeta,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)
```
Those look a whole lot like the ROPE values Kruschke specified in his example at the bottom of page 370. But we're getting ahead of ourselves. Now that we have our `hdi_of_icdf()` function, we're ready to define our version of `minNforHDIpower()`, which I'm calling `min_n_for_hdi_power()`.
```{r}
min_n_for_hdi_power <-
function(gen_prior_mode, gen_prior_n,
hdi_max_width = NULL, null_value = NULL,
rope = c(max(0, null_value - 0.02), min(1, null_value + 0.02)),
desired_power = 0.8, aud_prior_mode = 0.5, aud_prior_n = 2,
hdi_mass = 0.95, init_samp_size = 20, verbose = TRUE) {
# Check for argument consistency:
if (!xor(is.null(hdi_max_width), is.null(null_value))) {
stop("One and only one of `hdi_max_width` and `null_value` must be specified.")
}
# Convert prior mode and N to a, b parameters of beta distribution:
gen_prior_a <- gen_prior_mode * (gen_prior_n - 2) + 1
gen_prior_b <- (1.0 - gen_prior_mode) * (gen_prior_n - 2) + 1
aud_prior_a <- aud_prior_mode * (aud_prior_n - 2) + 1
aud_prior_b <- (1.0 - aud_prior_mode) * (aud_prior_n - 2) + 1
# Initialize loop for incrementing `sample_size`:
sample_size <- init_samp_size
not_powerful_enough = TRUE
# Increment `sample_size` until desired power is achieved:
while(not_powerful_enough) {
z_vec <- 0:sample_size # vector of all possible z values for N flips.
# Compute probability of each z value for data-generating prior:
p_z_vec <- exp(lchoose(sample_size, z_vec)
+ lbeta(z_vec + gen_prior_a, sample_size - z_vec + gen_prior_b)
- lbeta(gen_prior_a, gen_prior_b))
# For each z value, compute posterior HDI:
# `hdi_matrix` will hold HDI limits for each z:
hdi_matrix <- matrix(0, nrow = length(z_vec), ncol = 2)
for (z_id_x in 1:length(z_vec)) {
z <- z_vec[z_id_x]
hdi_matrix[z_id_x, ] <- hdi_of_icdf(qbeta,
shape1 = z + aud_prior_a,
shape2 = sample_size - z + aud_prior_b,
width = hdi_mass)
}
# Compute HDI widths:
hdi_width <- hdi_matrix[, 2] - hdi_matrix[, 1]
# Sum the probabilities of outcomes with satisfactory HDI widths:
if (!is.null(hdi_max_width)) {
power_hdi <- sum(p_z_vec[hdi_width < hdi_max_width])
}
# Sum the probabilities of outcomes with HDI excluding `rope`:
if (!is.null(null_value)) {
power_hdi <- sum(p_z_vec[hdi_matrix[, 1] > rope[2] | hdi_matrix[, 2] < rope[1]])
}
if (verbose) {
cat(" For sample size = ", sample_size, ", power = ", power_hdi,
"\n", sep = ""); flush.console()
}
if (power_hdi > desired_power) {   # if the desired power is attained,
  not_powerful_enough = FALSE      # set the flag to stop;
} else {                           # otherwise,
  sample_size <- sample_size + 1   # increment the sample size
}
} # End while( not_powerful_enough ).
# Return the sample size that achieved the desired power:
return(sample_size)
}
```
Other than altering Kruschke's formatting a little bit, the only meaningful change I made to the code was removing the line that checked for the `HDIofICDF()` function and then `source()`ed it, if necessary. Following along with Kruschke on page 370, here's an example for which $\omega_\text{data generating} = .75$, $\kappa = 2000$, the ROPE is $[.48, .52]$, and the desired power is the conventional .8.
```{r}
min_n_for_hdi_power(gen_prior_mode = .75,
gen_prior_n = 2000,
hdi_max_width = NULL,
null_value = .5,
rope = c(.48, .52),
desired_power = .8,
aud_prior_mode = .5,
aud_prior_n = 2,
hdi_mass = .95,
init_samp_size = 20,
verbose = TRUE)
```
Just like in the text, the necessary $N = 30$.
Unlike in the text, I increased the value of `init_samp_size` from 5 to 20 to keep the output a reasonable length. To clarify what we just did,
> in that function call, the data-generating distribution has a mode of 0.75 and concentration of 2000, which means that the hypothesized world is pretty certain that coins have a bias of 0.75. The goal is to exclude a null value of 0.5 with a ROPE from 0.48 to 0.52. The desired power [is] 80%. The audience prior is uniform. When the function is executed, it displays the power for increasing values of sample size, until stopping at $N = 30$. (p. 370)
If it's unclear why the "audience prior is uniform", consider this.
```{r, fig.width = 4, fig.height = 2}
kappa <- 2
omega <- .5
tibble(theta = seq(from = 0, to = 1, by = .01)) %>%
mutate(prior = dbeta(theta,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)) %>%
ggplot(aes(x = theta, ymin = 0, ymax = prior)) +
geom_ribbon(size = 0, fill = "grey67") +
scale_y_continuous(NULL, breaks = NULL) +
coord_cartesian(ylim = c(0, 1.25)) +
labs(title = "Behold the uniform audience prior.",
x = expression(theta)) +
theme(panel.grid = element_blank())
```
If you work out the algebra with `omega` and `kappa`, you'll see this is a $\operatorname{beta} (1, 1)$. Thus, `aud_prior_n` is $\kappa$ and `aud_prior_mode` is $\omega$.
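Here's that algebra in code.
```{r}
# with omega = .5 and kappa = 2, both shape parameters reduce to 1
c(shape1 = omega * (kappa - 2) + 1,
  shape2 = (1 - omega) * (kappa - 2) + 1)
```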
Here we'll wrap our `min_n_for_hdi_power()` function into a simple `sim_power()` function for use with `purrr::map2()`.
```{r}
sim_power <- function(mode, power) {
min_n_for_hdi_power(gen_prior_mode = mode,
gen_prior_n = 2000,
hdi_max_width = NULL,
null_value = .5,
rope = c(.48, .52),
desired_power = power,
aud_prior_mode = .5,
aud_prior_n = 2,
hdi_mass = .95,
init_samp_size = 1,
verbose = TRUE)
}
```
Here we use the two functions to compute the values in Table 13.1 on page 367.
```{r sim, results = "hide"}
sim <-
crossing(mode = seq(from = .6, to = .85, by = .05),
power = c(.7, .8, .9)) %>%
mutate(results = map2_dbl(mode, power, sim_power))
```
The results look like this.
```{r}
print(sim)
```
It takes just a tiny bit of wrangling to reproduce Table 13.1.
```{r}
sim %>%
pivot_wider(names_from = mode,
values_from = results) %>%
knitr::kable()
```
### When the goal is precision.
Recall that if we have a $\operatorname{beta} (a, b)$ prior for $\theta$ of the Bernoulli likelihood function, then the analytic solution for the posterior is $\operatorname{beta} (\theta | z + a, N - z + b)$. In our first example, $z = 6$ out of $N = 10$ randomly selected voters preferred candidate A and we started with a flat $\operatorname{beta} (\theta | 1, 1)$ prior. We can check that our posterior is indeed $\operatorname{beta} (7, 5)$ by working through the algebra.
```{r}
z <- 6
n <- 10
# posterior alpha
z + 1
# posterior beta
n - z + 1
```
Here's how we compute the 95% HDIs.
```{r}
(
h <-
hdi_of_icdf(name = qbeta,
shape1 = 7,
shape2 = 5)
)
```
The $\operatorname{beta} (7, 5)$ distribution looks like this.
```{r, fig.width = 4, fig.height = 2}
tibble(theta = seq(from = 0, to = 1, by = .01)) %>%
mutate(density = dbeta(theta,
shape1 = 7,
shape2 = 5)) %>%
ggplot(aes(x = theta, ymin = 0, ymax = density)) +
geom_ribbon(size = 0, fill = "grey67") +
geom_segment(x = h[1], xend = h[2],
y = 0, yend = 0,
size = 1) +
annotate(geom = "text", x = .6, y = 1/3, label = "95% HDI") +
scale_x_continuous(NULL, breaks = c(0, h[1], z / n, h[2], 1) %>% round(2)) +
scale_y_continuous(NULL, breaks = NULL) +
ggtitle(expression("beta"*(7*", "*5))) +
theme(panel.grid = element_blank())
```
"It turns out, in this case, that we can never have a sample size large enough to achieve the goal of 80% of the HDIs falling above $\theta = 0.5$. To see why," keep reading in the text (p. 371). Happily,
> there is a more useful goal, however. Instead of trying to reject a particular value of $\theta$, we set as our goal a desired degree of precision in the posterior estimate. For example, our goal might be that the 95% HDI has width less than 0.2, at least 80% of the time. (p. 371)
If you look back up at our `min_n_for_hdi_power()` code above, you'll see that "One and only one of `hdi_max_width` and `null_value` must be specified." So if we want to determine the necessary $N$ for a 95% HDI width of less than .2, we need to set `hdi_max_width = .2` and `null_value = NULL`.
```{r}
min_n_for_hdi_power(gen_prior_mode = .75,
gen_prior_n = 10,
hdi_max_width = .2, # look here
null_value = NULL,
rope = NULL,
desired_power = .8,
aud_prior_mode = .5,
aud_prior_n = 2,
hdi_mass = .95,
init_samp_size = 75,
verbose = TRUE)
```
Just like in the last section, here I set `init_samp_size` to a higher value than in the text in order to keep the output reasonably short. To reproduce the results in Table 13.2, we’ll need to adjust the `min_n_for_hdi_power()` parameters within our `sim_power()` function.
```{r, results = "hide"}
sim_power <- function(mode, power) {
min_n_for_hdi_power(gen_prior_mode = mode,
gen_prior_n = 10,
hdi_max_width = .2,
null_value = NULL,
rope = NULL,
desired_power = power,
aud_prior_mode = .5,
aud_prior_n = 2,
hdi_mass = .95,
init_samp_size = 50,
verbose = TRUE)
}
sim <-
crossing(mode = seq(from = .6, to = .85, by = .05),
power = c(.7, .8, .9)) %>%
mutate(results = map2_dbl(mode, power, sim_power))
```
Let's make that table.
```{r}
sim %>%
pivot_wider(names_from = mode,
values_from = results) %>%
knitr::kable()
```
What did that audience prior look like?
```{r, fig.width = 4, fig.height = 2}
kappa <- 2
omega <- .5
tibble(theta = seq(from = 0, to = 1, by = .1)) %>%
mutate(density = dbeta(theta,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)) %>%
ggplot(aes(x = theta, ymin = 0, ymax = density)) +
geom_ribbon(size = 0, fill = "grey67") +
scale_y_continuous(NULL, breaks = NULL, limits = c(0, 1.25)) +
labs(title = "Behold the uniform audience prior.",
x = expression(theta)) +
theme(panel.grid = element_blank())
```
Here are what the beta distributions based on the `sim` look like.
```{r, fig.width = 8, fig.height = 2.5}
sim %>%
rename(n = results) %>%
expand(nesting(mode, power, n),
theta = seq(from = 0, to = 1, by = .01)) %>%
mutate(density = dbeta(theta,
shape1 = mode * (n - 2) + 1,
shape2 = (1 - mode) * (n - 2) + 1),
mode = str_c("omega == ", mode)) %>%
ggplot(aes(x = theta, ymin = 0, ymax = density)) +
geom_vline(xintercept = .5, color = "white") +
geom_ribbon(size = 0, fill = "grey67") +
scale_x_continuous(expression(theta), labels = c("0", "", ".5", "", "1")) +
scale_y_continuous(NULL, breaks = NULL) +
ggtitle("The power and mode values are in the rows and columns, respectively.") +
theme(panel.grid = element_blank()) +
facet_grid(power ~ mode, labeller = label_parsed)
```
Toward the end of the section, Kruschke mentioned the required sample size shoots up if our desired HDI width is 0.1. Here's the simulation.
```{r sim_3, results = "hide"}
sim_power <- function(mode, power) {
min_n_for_hdi_power(gen_prior_mode = mode,
gen_prior_n = 10,
hdi_max_width = .1,
null_value = NULL,
rope = NULL,
desired_power = power,
aud_prior_mode = .5,
aud_prior_n = 2,
hdi_mass = .95,
init_samp_size = 300, # save some time and up this parameter
verbose = TRUE)
}
sim <-
crossing(mode = seq(from = .6, to = .85, by = .05),
power = c(.7, .8, .9)) %>%
mutate(results = map2_dbl(mode, power, sim_power))
```
Display the results in a table like before.
```{r}
sim %>%
pivot_wider(names_from = mode,
values_from = results) %>%
knitr::kable()
```
### Monte Carlo approximation of power.
> The previous sections illustrated the ideas of power and sample size for a simple case in which the power could be computed by mathematical derivation. [If your field is like mine, this will not be the norm for your research projects.] In this section, we approximate the power by Monte Carlo simulation. The R script for this simple case serves as a template for more realistic applications. The R script is named `Jags-Ydich-Xnom1subj-MbernBeta-Power.R`, which is the name for the JAGS program for dichotomous data from a single "subject" suffixed with the word "Power." As you read through the script, presented below, remember that you can find information about any general R command by using the help function in R, as explained in Section 3.3.1 (p. 39). (p. 372)
The code in Kruschke's `Jags-Ydich-Xnom1subj-MbernBeta-Power.R` file also makes use of the content in his `Jags-Ydich-Xnom1subj-MbernBeta.R` file. As is often the case, the code in both is JAGS and base-**R** centric. We'll be taking a different approach. I'll walk you through. First, let's fire up **brms**.
```{r, warning = F, message = F}
library(brms)
```
This won't be of much concern for some of the complex models we'll be fitting in later chapters. But for simple models like this, a lot of the time you spend waiting for `brms::brm()` to return your posterior and its summary has to do with compilation time. The issue of compilation goes into technical details I just don't have the will to go through right now. But if we can avoid or minimize compilation time, it'll be a boon for our power simulations. As it turns out, we can. The first time we fit our desired model, we have to compile. But once we have that initial fit object in hand, we can reuse it with the `update()` function, which will allow us to avoid further compilation. So that's what we're going to do here. We're going to fit the model once and save it.
```{r fit13.1}
# how many rows of 0s and 1s should we have in the data?
n <- 74
# should the values in the data be of single draws (i.e., 1), or summaries?
size <- 1
# what is the population mode for theta we'd like to base this all on?
omega <- .7
# fit that joint
fit13.1 <-
brm(data = tibble(y = rbinom(n, size, omega)),
family = bernoulli(link = identity),
y ~ 1,
prior(beta(1, 1), class = Intercept),
warmup = 1000, iter = 3000, chains = 4, cores = 1,
seed = 13,
control = list(adapt_delta = .9),
file = "fits/fit13.01")
```
You may (or may not) recall that at the end of Chapter 3, we covered how to time an operation in **R**. When you're setting up a Monte Carlo power study, it can be important to use those time-tracking skills to get a sense of how long it takes to fit your models. While I was setting this model up, I experimented with keeping the default `cores = 1` or setting my typical `cores = 4`. As it turns out, with a very simple model like this, `cores = 1` was a little faster. If you're fitting one model, that's no big deal. But in a situation where you're fitting 100 or 1000, you'll want to make sure you're fitting them as efficiently as possible.
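If you'd like to run that kind of comparison yourself, here's a minimal sketch. The exact timings will depend on your machine.
```{r, eval = F}
# refit the same model with different `cores` settings and compare the
# elapsed times
t1 <- system.time(update(fit13.1, cores = 1, seed = 13))
t4 <- system.time(update(fit13.1, cores = 4, seed = 13))
rbind(cores_1 = t1, cores_4 = t4)
```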
But anyway, our practice will be to keep all the specifications in `fit13.1` constant across the simulations. So choose them wisely. If you look deep into the bowels of the `Jags-Ydich-Xnom1subj-MbernBeta.R` file, you'll see Kruschke used the flat $\operatorname{beta}(1, 1)$ prior, which is where our `prior(beta(1, 1), class = Intercept)` code came from. This is the audience prior. We aren't particularly concerned about the data we simulated with the `data = tibble(y = rbinom(n, size, omega))` line. The main thing is that they follow the same basic structure our subsequent data will.
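If you ever want to double-check which priors a **brms** model actually used, the `prior_summary()` function will list them.
```{r}
prior_summary(fit13.1)
```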
To make sure we're not setting ourselves up to fail, we might make sure the chains look okay.
```{r, fig.width = 8, fig.height = 1.5}
plot(fit13.1)
```
Looks like a dream. Let's move forward and run the simulation proper. In his script file, Kruschke suggested we use a large number of simulated data sets, like 1,000 or so. Since this is just an example, I'm gonna cut that to 100.
```{r, eval = F}
# how many simulations would you like?
n_sim <- 100
# specify omega and kappa of the hypothetical parameter distribution
omega <- .7
kappa <- 2000
# make it reproducible
set.seed(13)
sim1 <-
# define some of the parameters
tibble(n = n,
size = size,
theta = rbeta(n_sim,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)) %>%
# simulate the data
mutate(data = pmap(list(n, size, theta), rbinom)) %>%
# fit the models on the simulated data
mutate(fit = map(data, ~update(fit13.1, newdata = list(y = .))))
```
```{r, echo = F}
# Even after reducing `sim1` through `sim5` by an order of magnitude, they take a long time to
# complete. So instead of completing them on the fly for each new version of this project, I've
# saved the results of `sim1` through `sim5` in an external file.
# save(list = c("sim1", "sim2", "sim3", "sim4", "sim5"), file = "13_sims/power_sims.rda")
load("13_sims/power_sims.rda")
# for later on, we also need
n_sim <- 100
```
*What have we done?* you might ask.
```{r}
head(sim1)
```
The `theta` column contains the draws from the hypothesized parameter distribution, which we've indicated is hovering tightly around .7. The `data` column is nested in that within each row, we've saved an entire vector of $N = 74$ Bernoulli draws. Most importantly, the `fit` column contains the `brms::brm()` objects for each of our 100 simulations. See that last `mutate()` line, above? That's where those came from. Within the `purrr::map()` function, we fed our simulated data sets, one row at a time, into the `update()` function via the `newdata` argument. Because we used `update()` based on our initial `fit13.1`, we avoided subsequent compilation times and just sampled like a boss.
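If you'd like to peek at one of those nested data sets, you can index into the list column.
```{r}
# the first simulated data set: a vector of 74 zeros and ones
str(sim1$data[[1]])
```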
Before we move on, I should give some credit. The foundations of this workflow come from Wickham's talk, [*Managing many models with R*](https://www.youtube.com/watch?time_continue=426&v=rz3_FDVt9eg). I got some additional [help on twitter](https://twitter.com/PStrafo/status/1107447953709383681) from [Phil Straforelli](https://twitter.com/PStrafo).
We still have some work to do. Next, we'll want to make a custom function that will make it easy to compute the intercept HDIs for each of our fits.
```{r, message = F, warning = F}
library(tidybayes)
get_hdi <- function(fit) {
fit %>%
posterior_samples() %>%
mode_hdi(b_Intercept) %>%
select(.lower:.upper)
}
# how does it work?
get_hdi(fit13.1)
```
Now we'll apply that function to our `sim1` tibble to pull those simulated HDIs. Then we'll program in the markers for the ROPE and HDI width criteria, perform logical tests to see whether they were passed within each of the 100 simulations, and summarize the tests.
```{r, warning = F, message = F}
sim1 %>%
# get those HDIs and `unnest()`
mutate(hdi = map(fit, get_hdi)) %>%
unnest(hdi) %>%
# define our test criteria
mutate(rope_ll = .48,
rope_ul = .52,
hdi_width = .2) %>%
mutate(pass_rope = .lower > rope_ul | .upper < rope_ll,
pass_width = (.upper - .lower) < hdi_width) %>%
# summarize those joints
summarise(power_rope = mean(pass_rope),
power_width = mean(pass_width))
```
Those are our power estimates. To compute their HDIs, just multiply them by 100 and plug them into the formulas within the shape arguments in `hdi_of_icdf()`.
```{r}
# HDIs for the ROPE power estimate
hdi_of_icdf(name = qbeta,
shape1 = 1 + 92,
shape2 = 1 + n_sim - 92) %>%
round(digits = 2)
# HDIs for the width power estimate
hdi_of_icdf(name = qbeta,
shape1 = 1 + 35,
shape2 = 1 + n_sim - 35) %>%
round(digits = 2)
```
Following the middle of page 375, we'll want to do the whole thing again with $\kappa = 10$ and $N = 91$.
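To get a sense of how much vaguer the $\kappa = 10$ hypothesis is than our original $\kappa = 2000$ hypothesis, we might compare the two data-generating distributions side by side. This comparison is our own addition; both curves keep $\omega = .7$.
```{r, fig.width = 6, fig.height = 2}
crossing(kappa = c(10, 2000),
         theta = seq(from = 0, to = 1, by = .001)) %>%
  mutate(density = dbeta(theta,
                         shape1 = .7 * (kappa - 2) + 1,
                         shape2 = (1 - .7) * (kappa - 2) + 1),
         kappa   = str_c("kappa == ", kappa)) %>%
  ggplot(aes(x = theta, ymin = 0, ymax = density)) +
  geom_ribbon(size = 0, fill = "grey67") +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab(expression(theta)) +
  theme(panel.grid = element_blank()) +
  facet_wrap(~ kappa, labeller = label_parsed, scales = "free_y")
```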
Before we run the next simulation, notice how our first approach had us saving the model fits within our `sim1` object. When the models are simple and based on small data and when you’re only simulating 100 times, this isn't a huge deal. But saving 1000+ `brms::brm()` fit objects of hierarchical models will bog you down. So for our next simulation, we’ll only save the HDIs from our `get_hdi()` function.
```{r, eval = F}
# how many rows of 0s and 1s should we have in the data?
n <- 91
# how many simulations would you like?
n_sim <- 100
# specify omega and kappa of the hypothetical parameter distribution
omega <- .7
kappa <- 10
# make it reproducible
set.seed(13)
# simulate
sim2 <-
tibble(n = n,
size = size,
theta = rbeta(n_sim,
shape1 = omega * (kappa - 2) + 1,
shape2 = (1 - omega) * (kappa - 2) + 1)) %>%
mutate(data = pmap(list(n, size, theta), rbinom)) %>%
mutate(hdi = map(data, ~update(fit13.1, newdata = list(y = .)) %>% get_hdi()))
```
Since we saved the HDI estimates in the `hdi` column, we can just `unnest()` them and summarize our power results.
```{r}
sim2 %>%
unnest(hdi) %>%
mutate(rope_ll = .48,
rope_ul = .52,
hdi_width = .2) %>%
mutate(pass_rope = .lower > rope_ul | .upper < rope_ll,
pass_width = (.upper - .lower) < hdi_width) %>%
summarise(power_rope = mean(pass_rope),
power_width = mean(pass_width))
```
Compute the corresponding HDIs.
```{r}
# HDIs for the ROPE power estimate
hdi_of_icdf(name = qbeta,
shape1 = 1 + 71,
shape2 = 1 + n_sim - 71) %>%
round(digits = 2)
# HDIs for the width power estimate
hdi_of_icdf(name = qbeta,
shape1 = 1 + 91,
shape2 = 1 + n_sim - 91) %>%
round(digits = 2)
```
### Power from idealized or actual data.
> In practice, it is often more intuitive to specify actual or idealized *data* that express the hypothesis, than it is to specify top-level parameter properties. The idea is that we start with the actual or idealized data and then use Bayes' rule to generate the corresponding distribution on parameter values. (p. 376, *emphasis* in the original)
Here are the idealized parameters Kruschke outlined on pages 376--377.
```{r}
# specify idealized hypothesis:
ideal_group_mean <- 0.65
ideal_group_sd <- 0.07
ideal_n_subj <- 100 # more subjects => higher confidence in hypothesis
ideal_n_trl_per_subj <- 100 # more trials => higher confidence in hypothesis
```
These parameters are for binomial data. To parameterize $\theta$ in terms of a mean and standard deviation, we need to define the `beta_ab_from_mean_sd()` function.
```{r}
beta_ab_from_mean_sd <- function(mean, sd) {
if (mean <= 0 | mean >= 1) stop("must have 0 < mean < 1")
if (sd <= 0) stop("sd must be > 0")
kappa <- mean * (1 - mean) / sd^2 - 1
if (kappa <= 0) stop("invalid combination of mean and sd")
a <- mean * kappa
b <- (1.0 - mean) * kappa
return(list(a = a, b = b))
}
```
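Here's what the function returns for our idealized mean and standard deviation.
```{r}
beta_ab_from_mean_sd(mean = ideal_group_mean, sd = ideal_group_sd)
```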
Now generate data consistent with these values using a **tidyverse**-style workflow.
```{r}
b <- beta_ab_from_mean_sd(ideal_group_mean, ideal_group_sd)
# make the results reproducible
set.seed(13)
d <-
# make a subject index and generate random theta values for idealized subjects
tibble(s = 1:ideal_n_subj,
theta = rbeta(ideal_n_subj, b$a, b$b)) %>%
# transform the theta values to exactly match idealized mean and SD
mutate(theta_transformed = ((theta - mean(theta)) / sd(theta)) * ideal_group_sd + ideal_group_mean) %>%
# `theta_transformed` must be between 0 and 1
mutate(theta_transformed = ifelse(theta_transformed >= 0.999, 0.999,
ifelse(theta_transformed <= 0.001, 0.001,
theta_transformed))) %>%
# generate idealized data very close to thetas
mutate(z = round(theta_transformed * ideal_n_trl_per_subj)) %>%
# create vector of 0's and 1's matching the z values generated above
mutate(y = map(z, ~c(rep(1, .), rep(0, ideal_n_trl_per_subj - .)))) %>%
unnest(y)
```
Our main variables are `s` and `y`. You can think of the rest as showing our work. Here's a peek.
```{r}
str(d)
```
We are going to follow the same procedure we did when we originally fit the model to the therapeutic touch data in Chapter 9. Instead of reproducing the model Kruschke presented in his scripts, we are going to fit a hierarchical logistic regression model.
```{r fit13.2}
fit13.2 <-
brm(data = d,
family = bernoulli(link = logit),
y ~ 1 + (1 | s),
prior = c(prior(normal(0, 1.5), class = Intercept),
prior(normal(0, 1), class = sd)),
iter = 2000, warmup = 1000, chains = 4, cores = 4,
seed = 13,
file = "fits/fit13.02")
```
Unlike in the text, we had no need for thinning our chains. The effective number of samples was fine.
```{r}
print(fit13.2)
```
Here's a look at our two main parameters, our version of the top panels of Figure 13.3.
```{r, fig.width = 6, fig.height = 2.5}
posterior_samples(fit13.2) %>%
pivot_longer(b_Intercept:sd_s__Intercept) %>%
ggplot(aes(x = value, y = 0)) +
stat_histintervalh(point_interval = mode_hdi, .width = .95,
fill = "grey67", slab_color = "grey92",
breaks = 40, slab_size = .25, outline_bars = T,
normalize = "panels") +
scale_y_continuous(NULL, breaks = NULL) +
labs(subtitle = "Remember, these are in the log-odds metric.",
x = NULL) +
theme(panel.grid = element_blank()) +
facet_wrap(~name, scales = "free")
```
> Now we have a distribution of parameter values consistent with our idealized hypothesis, but we did not have to figure out the top-level constants in the model. We merely specified the idealized tendencies in the data and expressed our confidence by its amount... So we now have a large set of representative parameter values for conducting a power analysis. (pp. 378--379)
With **brms**, you can sample from those model-implied parameter values with the `fitted()` function. By default, it will return values in the probability metric for our logistic model. Here we'll specify a group-level (i.e., `s`) value that was not in the data. We'll feed that new value into the `newdata` argument and set `allow_new_levels = T`. We'll also set `summary = F`, which will return actual probability values rather than a summary.
```{r}
set.seed(13)
f <-
fitted(fit13.2,
newdata = tibble(s = 0),
allow_new_levels = T,
summary = F) %>%
data.frame() %>%
set_names("theta")
str(f)
```
Here's what that looks like.
```{r, fig.width = 4, fig.height = 3}
f %>%
ggplot(aes(x = theta, y = 0)) +
stat_histintervalh(point_interval = mode_hdi, .width = .95,
fill = "grey67", slab_color = "grey92",
breaks = 20, slab_size = .2, outline_bars = T) +
scale_y_continuous(NULL, breaks = NULL) +
labs(subtitle = "Behold our \"distribution of parameter values consistent\nwith our idealized hypothesis.\"",
x = expression(theta)) +
xlim(0, 1) +
theme(panel.grid = element_blank())
```
We can make a custom function to sample from $\theta$. We might call it `sample_theta()`.
```{r}
sample_theta <- function(seed, n_subj) {
set.seed(seed)
bind_cols(s = 1:n_subj,
sample_n(f, size = n_subj, replace = T))
}
# take it for a spin
sample_theta(seed = 13, n_subj = 5)
```
Now let's say I wanted to use our little `sample_theta()` function to sample $\theta$ values for three people `s` and then use those $\theta$ values to sample three draws from the corresponding Bernoulli distribution. We might do that like this.
```{r}
sample_theta(seed = 13, n_subj = 3) %>%
mutate(y = map(theta, rbinom, n = 3, size = 1)) %>%
unnest(y)
```
Notice how after we sampled from $\theta$, we still needed to take two more steps to simulate the desired data. So perhaps a better approach would be to wrap all those steps into one function and call it something like `sample_data()`.
```{r}
sample_data <- function(seed, n_subj, n_trial) {
set.seed(seed)
bind_cols(s = 1:n_subj,
sample_n(f, size = n_subj, replace = T)) %>%
mutate(y = map(theta, rbinom, n = n_trial, size = 1)) %>%
unnest(y)
}
# test it out
sample_data(seed = 13, n_subj = 3, n_trial = 3)
```
Here's how we'd use our `sample_data()` function to make several data sets within the context of a nested tibble.
```{r}
tibble(seed = 1:4) %>%
mutate(data = map(seed, sample_data, n_subj = 14, n_trial = 47))
```
With this data type, Kruschke indicated he ran
> the power analysis twice, using different selections of subjects and trials. In both cases there [was] a total of 658 trials, but in the first case there [were] 14 subjects with 47 trials per subject, and in the second case there [were] seven subjects with 94 trials per subject. (p. 381)
Before running the simulations in full, we fit the model once and save that fit to iteratively reuse with `update()`.
```{r fit13.3}
# how many subjects should we have?
n_subj <- 14
# how many trials should we have?
n <- 47
# fit that joint
fit13.3 <-
brm(data = sample_theta(seed = 13, n_subj = 14) %>%
mutate(y = map(theta, rbinom, n = n, size = 1)) %>%
unnest(y),
family = bernoulli(link = logit),
y ~ 1 + (1 | s),
prior = c(prior(normal(0, 1.5), class = Intercept),
prior(normal(0, 1), class = sd)),
iter = 2000, warmup = 1000, chains = 4, cores = 4,
seed = 13,
control = list(adapt_delta = .9),
file = "fits/fit13.03")
```
Check real quick to make sure the fit turned out okay.
```{r}
print(fit13.3)
```
Looks fine. Our new model simulation carries with it some new goals.
> In this example, [Kruschke] considered goals for achieving precision and exceeding a ROPE around the null value, at both the group level and individual level. For the group level, the goals are for the 95% HDI on the group mode, $\omega$, to fall above the ROPE around the null value, and for the width of the HDI to be less than 0.2. For the individual level, the goals are for at least one of the $\theta_s$s 95% HDIs to exceed the ROPE with none that fall below the ROPE, and for all the $\theta_s$s 95% HDIs to have widths less than 0.2. (pp. 379--380)
Now since we used a hierarchical logistic regression model, we don't have a population-level $\omega$ parameter. Rather, we have a population $\theta$. So like before, our first goal is for the population $\theta$ to fall above the range $[.48, .52]$. The second corresponding width goal is also like before; we want the 95% HDI for $\theta$ to have a width of less than 0.2. But since our logistic regression model parameterized $\theta$ in the log-odds metric, we'll have to update our `get_hdi()` function, which we'll strategically rename `get_theta_hdi()`.
```{r, warning = F, message = F}
get_theta_hdi <- function(fit) {
fit %>%
posterior_samples() %>%
transmute(theta = inv_logit_scaled(b_Intercept)) %>%
mode_hdi() %>%
select(.lower:.upper)
}
# how does it work?
get_theta_hdi(fit13.3)
```
As for the individual-level goals, the two Kruschke outlined in the text apply to our model in a straightforward way. But we will need one more custom function designed to pull the HDIs for the individual $\theta_s$s. Let's call this one `get_theta_s_hdi()`.
```{r get_theta_s_hdi, warning = F, message = F}
get_theta_s_hdi <- function(fit) {
n_col <-
coef(fit, summary = F)$s[, , "Intercept"] %>%
ncol()
coef(fit, summary = F)$s[, , "Intercept"] %>%
data.frame() %>%
set_names(1:n_col) %>%
mutate_all(inv_logit_scaled) %>%
gather(s, value) %>%
mutate(s = as.numeric(s)) %>%
group_by(s) %>%
mode_hdi(value) %>%
select(s, .lower:.upper) %>%
rename(.lower_s = .lower,
.upper_s = .upper)
}
# how does it work?
get_theta_s_hdi(fit13.3)
```
With `sim2`, we avoided saving our `brms::brm()` fit objects by using `map(data, ~update(fit13.1, newdata = list(y = .)) %>% get_hdi())`. That is, within the `purrr::map()` function, we first used `update()` to update the fit to the new data and then pumped that directly into `get_hdi()`, which simply returned our intervals. Though slick, this approach won't work here because we want to pump our updated model fit into two functions, both `get_theta_hdi()` and `get_theta_s_hdi()`. Our work-around will be to make a custom function that updates the fit, saves it as an object, inserts that fit object into both `get_theta_hdi()` and `get_theta_s_hdi()`, binds their results together, and then returns only the intervals. We'll call this function `fit_then_hdis()`.
```{r}
fit_then_hdis <- function(data, seed) {
fit <- update(fit13.3,
newdata = data,
seed = seed)
cbind(get_theta_hdi(fit),
get_theta_s_hdi(fit))
}
```
Now we're ready to simulate.
```{r, eval = F}
# how many subjects should we have?
n_subj <- 14
# how many trials should we have?
n_trial <- 47
# how many simulations would you like?
n_sim <- 100
sim3 <-
tibble(seed = 1:n_sim) %>%
mutate(data = map(seed, sample_data, n_subj = n_subj, n_trial = n_trial)) %>%
mutate(hdi = map2(data, seed, fit_then_hdis))
```
If we hold these to the criteria that at least one $\theta_s$ 95% HDI exceeds the ROPE, none fall below it, and all have widths less than 0.2, it looks like our initial data-generating `fit13.3` is in the ballpark. Here are the results for the full power analysis, `sim3`.
```{r, eval = F}
sim3 <-
sim3 %>%
unnest(hdi) %>%
# here we determine whether we passed at the group level
mutate(pass_rope_theta = .lower > .52 | .upper < .48,