Skip to content

Commit 44dbd81

Browse files
Add olre rpubs tutorial
1 parent 034b500 commit 44dbd81

File tree

3 files changed

+242
-0
lines changed

3 files changed

+242
-0
lines changed

content/tutorials/r_olre/index.Rmd

Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,98 @@
1+
---
2+
title: "Observation level random effects"
3+
description: ""
4+
author: "Thierry Onkelinx"
5+
date: "`r Sys.Date()`"
6+
categories: ["r", "statistics"]
7+
tags: ["r", "analysis", "mixed model"]
8+
output:
9+
md_document:
10+
preserve_yaml: true
11+
variant: gfm
12+
---
13+
14+
```{r setup, include=FALSE}
15+
knitr::opts_chunk$set(echo = TRUE)
16+
```
17+
18+
## Definition
19+
20+
An observation level random effect (OLRE) is a random effect with a different level for each observation.
21+
Combined with a Poisson distribution it can capture moderate overdispersion.
22+
23+
## Example
24+
25+
The model below is a simple mixed model.
26+
$\delta_{ijk}$ is the OLRE.
27+
28+
$$Y_{ijk} \sim Poisson(\mu_{ijk})$$
29+
30+
$$\log(\mu_{ijk}) = \eta_{ijk}$$
31+
32+
$$\eta_{ijk} = \beta_0 + \beta_1 X_i + b_j + \delta_{ijk}$$
33+
34+
$$b_j \sim N(0, \sigma_b)$$
35+
36+
$$\delta_{ijk} \sim N(0, \sigma_{olre})$$
37+
38+
Let's generate some overdispersed data using a negative binomial distribution.
39+
40+
```{r echo = TRUE}
41+
set.seed(324)
42+
n.i <- 10
43+
n.j <- 10
44+
n.k <- 10
45+
beta.0 <- 1
46+
beta.1 <- 0.3
47+
sigma.b <- 0.5
48+
theta <- 5
49+
dataset <- expand.grid(
50+
X = seq_len(n.i),
51+
b = seq_len(n.j),
52+
Replicate = seq_len(n.k)
53+
)
54+
rf.b <- rnorm(n.j, mean = 0, sd = sigma.b)
55+
dataset$eta <- beta.0 + beta.1 * dataset$X + rf.b[dataset$b]
56+
dataset$mu <- exp(dataset$eta)
57+
dataset$Y <- rnbinom(nrow(dataset), mu = dataset$mu, size = theta)
58+
dataset$OLRE <- seq_len(nrow(dataset))
59+
```
60+
61+
```{r}
62+
library(ggplot2)
63+
ggplot(dataset, aes(x = X, y = Y, group = Replicate)) +
64+
geom_line() +
65+
geom_point() +
66+
facet_wrap(~b)
67+
```
68+
69+
Next we fit the model with an OLRE.
70+
71+
```{r}
72+
library(lme4)
73+
m1 <- glmer(Y ~ X + (1 | b) + (1 | OLRE),
74+
data = dataset,
75+
family = poisson
76+
)
77+
summary(m1)
78+
```
79+
80+
The estimates for $\beta_0$, $\beta_1$ and $\sigma_b$ are reasonably close to the ones used to generate the data.
81+
82+
## Check the sanity of the model
83+
84+
Models with an OLRE should be used carefully.
85+
Because OLRE can have a very strong influence on the model.
86+
One should always check the standard deviation of the ORLE.
87+
High standard deviations are a good indication for problems.
88+
Let's show why.
89+
90+
The main model (without OLRE) is $\gamma_{ijk} = \beta_0 + \beta_1 X_i + b_j$.
91+
The OLRE correct this for overdispersion $\eta_{ijk} = \gamma_{ijk} + \delta_{ijk}$.
92+
Since $\delta_{ijk} \sim N(0, \sigma_{olre})$, the 2.5% and 97.5% quantiles of this distribution define a range in which 95% of the plausible OLRE values are.
93+
The 2.5% quantile ($\delta_{lcl}$) indicates a strong but plausible downward correction, the 97.5% quantile ($\delta_{ucl}$) an upward correction.
94+
So due to the corrections we have $\eta_{ijk} = (\gamma_{ijk} + \delta_{lcl}; \gamma_{ijk} + \delta_{ucl}) = \gamma_{ijk} + (\delta_{lcl};\delta_{ucl})$.
95+
In a Poisson distribution with log-link we have $\log(\mu_{ijk}) = \eta_{ijk}$ or $\mu_{ijk} = e ^{\eta_{ijk}} = e ^{\gamma_{ijk} + (\delta_{lcl};\delta_{ucl})}$.
96+
The ratio $r_{ijk}$ between the upper and lower boundary of $\mu_{ijk}$ is:
97+
98+
$$r_{ijk} = \frac{e ^{\gamma_{ijk} + \delta_{ucl}}}{e ^{\gamma_{ijk} + \delta_{lcl})}}$$

content/tutorials/r_olre/index.md

Lines changed: 144 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,144 @@
1+
---
2+
title: "Observation level random effects"
3+
description: ""
4+
author: "Thierry Onkelinx"
5+
date: "2020-07-01"
6+
categories: ["r", "statistics"]
7+
tags: ["r", "analysis", "mixed model"]
8+
output:
9+
md_document:
10+
preserve_yaml: true
11+
variant: gfm
12+
---
13+
14+
## Definition
15+
16+
An observation level random effect (OLRE) is a random effect with a
17+
different level for each observation. Combined with a Poisson
18+
distribution it can capture moderate overdispersion.
19+
20+
## Example
21+
22+
The model below is a simple mixed model. \(\delta_{ijk}\) is the OLRE.
23+
24+
\[Y_{ijk} \sim Poisson(\mu_{ijk})\]
25+
26+
\[\log(\mu_{ijk}) = \eta_{ijk}\]
27+
28+
\[\eta_{ijk} = \beta_0 + \beta_1 X_i + b_j + \delta_{ijk}\]
29+
30+
\[b_j \sim N(0, \sigma_b)\]
31+
32+
\[\delta_{ijk} \sim N(0, \sigma_{olre})\]
33+
34+
Let’s generate some overdispersed data using a negative binomial
35+
distribution.
36+
37+
``` r
38+
set.seed(324)
39+
n.i <- 10
40+
n.j <- 10
41+
n.k <- 10
42+
beta.0 <- 1
43+
beta.1 <- 0.3
44+
sigma.b <- 0.5
45+
theta <- 5
46+
dataset <- expand.grid(
47+
X = seq_len(n.i),
48+
b = seq_len(n.j),
49+
Replicate = seq_len(n.k)
50+
)
51+
rf.b <- rnorm(n.j, mean = 0, sd = sigma.b)
52+
dataset$eta <- beta.0 + beta.1 * dataset$X + rf.b[dataset$b]
53+
dataset$mu <- exp(dataset$eta)
54+
dataset$Y <- rnbinom(nrow(dataset), mu = dataset$mu, size = theta)
55+
dataset$OLRE <- seq_len(nrow(dataset))
56+
```
57+
58+
``` r
59+
library(ggplot2)
60+
```
61+
62+
## Warning: package 'ggplot2' was built under R version 4.0.2
63+
64+
``` r
65+
ggplot(dataset, aes(x = X, y = Y, group = Replicate)) +
66+
geom_line() +
67+
geom_point() +
68+
facet_wrap(~b)
69+
```
70+
71+
![](index_files/figure-gfm/unnamed-chunk-2-1.png)<!-- -->
72+
73+
Next we fit the model with an OLRE.
74+
75+
``` r
76+
library(lme4)
77+
```
78+
79+
## Loading required package: Matrix
80+
81+
``` r
82+
m1 <- glmer(Y ~ X + (1 | b) + (1 | OLRE),
83+
data = dataset,
84+
family = poisson
85+
)
86+
summary(m1)
87+
```
88+
89+
## Generalized linear mixed model fit by maximum likelihood (Laplace
90+
## Approximation) [glmerMod]
91+
## Family: poisson ( log )
92+
## Formula: Y ~ X + (1 | b) + (1 | OLRE)
93+
## Data: dataset
94+
##
95+
## AIC BIC logLik deviance df.resid
96+
## 6871.8 6891.4 -3431.9 6863.8 996
97+
##
98+
## Scaled residuals:
99+
## Min 1Q Median 3Q Max
100+
## -2.06918 -0.41636 -0.00649 0.27116 1.55094
101+
##
102+
## Random effects:
103+
## Groups Name Variance Std.Dev.
104+
## OLRE (Intercept) 0.1732 0.4161
105+
## b (Intercept) 0.1475 0.3840
106+
## Number of obs: 1000, groups: OLRE, 1000; b, 10
107+
##
108+
## Fixed effects:
109+
## Estimate Std. Error z value Pr(>|z|)
110+
## (Intercept) 1.002971 0.127698 7.854 4.02e-15 ***
111+
## X 0.298558 0.005803 51.447 < 2e-16 ***
112+
## ---
113+
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
114+
##
115+
## Correlation of Fixed Effects:
116+
## (Intr)
117+
## X -0.282
118+
119+
The estimates for \(\beta_0\), \(\beta_1\) and \(\sigma_b\) are
120+
reasonably close to the ones used to generate the data.
121+
122+
## Check the sanity of the model
123+
124+
Models with an OLRE should be used carefully. Because OLRE can have a
125+
very strong influence on the model. One should always check the standard
126+
deviation of the ORLE. High standard deviations are a good indication
127+
for problems. Let’s show why.
128+
129+
The main model (without OLRE) is
130+
\(\gamma_{ijk} = \beta_0 + \beta_1 X_i + b_j\). The OLRE correct this
131+
for overdispersion \(\eta_{ijk} = \gamma_{ijk} + \delta_{ijk}\). Since
132+
\(\delta_{ijk} \sim N(0, \sigma_{olre})\), the 2.5% and 97.5% quantiles
133+
of this distribution define a range in which 95% of the plausible OLRE
134+
values are. The 2.5% quantile (\(\delta_{lcl}\)) indicates a strong but
135+
plausible downward correction, the 97.5% quantile (\(\delta_{ucl}\)) an
136+
upward correction. So due to the corrections we have
137+
\(\eta_{ijk} = (\gamma_{ijk} + \delta_{lcl}; \gamma_{ijk} + \delta_{ucl}) = \gamma_{ijk} + (\delta_{lcl};\delta_{ucl})\).
138+
In a Poisson distribution with log-link we have
139+
\(\log(\mu_{ijk}) = \eta_{ijk}\) or
140+
\(\mu_{ijk} = e ^{\eta_{ijk}} = e ^{\gamma_{ijk} + (\delta_{lcl};\delta_{ucl})}\).
141+
The ratio \(r_{ijk}\) between the upper and lower boundary of
142+
\(\mu_{ijk}\) is:
143+
144+
\[r_{ijk} = \frac{e ^{\gamma_{ijk} + \delta_{ucl}}}{e ^{\gamma_{ijk} + \delta_{lcl})}}\]
10.5 KB
Loading

0 commit comments

Comments
 (0)