I think brm is dropping coefficients (with a cryptic warning), then misnaming the remaining coefficients in some multivariate models. This behavior shows up in models with categorical predictors (or interactions) when a category doesn't have any observations in the subset used for a given response variable. This is a lot like the model in #360.
I expected to see all of the coefficients in the model summary (the same as those returned by get_prior). When there are no data for a particular category, I expected samples from the prior to be returned. Instead, I think the coefficients for categories that have no observations in the subset are being dropped. The warning traces to rename_pars:
In x$fit@sim$fnames_oi[change$pos] <- change$fnames : number of items to replace is not a multiple of replacement length
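For context, here is a minimal base-R sketch of how that warning can arise. It does not use brms internals: `fnames`, `pos`, and `new_names` are hypothetical stand-ins for the stored parameter names, the positions being overwritten, and the replacement labels. The assumption is only that the two sides of the assignment differ in length.

```r
# Hypothetical stand-ins, not brms objects
fnames    <- c("b[1]", "b[2]", "b[3]", "b[4]", "sigma")
pos       <- 1:4
new_names <- c("b_a", "b_b", "b_c", "b_d", "b_e")  # one more name than positions

# Record the warning message but let the assignment finish
msg <- NULL
withCallingHandlers(
  fnames[pos] <- new_names,
  warning = function(w) {
    msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(msg)     # the warning text raised by the assignment
print(fnames)  # names were filled by position; "b_e" was never used
```

If something like this happens during renaming, names are assigned by position rather than matched to the parameters they describe, which would fit both the warning and the mislabeled coefficients.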
More concerning, I think the remaining coefficients are being misnamed. This is easiest to see with an example.
Reproducible example
Two Gaussian response variables (resp1 and resp2), one continuous predictor (pred), and one categorical predictor with five levels (trt). The model fits an intercept and slope for each level of trt. Each resp* uses its own (overlapping) subset of observations. Some fraction of the response values are missing. This is the simplest model I can get to reproduce the problem.
library(brms)
set.seed(7429)
dat <- data.frame(s1 = rep(c(TRUE, FALSE), each = 20),
                  s2 = rep(c(TRUE, FALSE), times = 20),
                  trt = sample(letters[1:5], size = 40, replace = TRUE),
                  resp1 = rnorm(40),
                  resp2 = rnorm(40),
                  pred = rnorm(40))
dat$resp1[sample(1:nrow(dat), size = 10)] <- NA_real_
dat$resp2[sample(1:nrow(dat), size = 10)] <- NA_real_
pr <- prior(normal(0, 1), class = "Intercept", resp = "resp1")+
      prior(normal(0, 1), class = "b", resp = "resp1")+
      prior(student_t(3, 0, 1), class = "sigma", resp = "resp1")+
      prior(normal(0, 1), class = "Intercept", resp = "resp2")+
      prior(normal(0, 1), class = "b", resp = "resp2")+
      prior(student_t(3, 0, 1), class = "sigma", resp = "resp2")
# fit multivariate regression with treatment intercepts & slopes
mod <- brm(bf(resp1 | subset(s1) ~ pred * trt)+
           bf(resp2 | subset(s2) ~ pred * trt)+
           set_rescor(FALSE),
           prior = pr,
           data = dat)
fixef(mod)
Notice that b_resp2_pred:trtd and b_resp2_pred:trte are missing from the coefficients. But it's treatments trtc and trte that have missing observations. I think the parameter renaming step is filling in the treatment coefficient names alphabetically, but running out before it reaches the last ones.
> xtabs(~is.na(resp2) + trt, data = dat)
            trt
is.na(resp2) a b c d e
       FALSE 4 8 5 6 7
       TRUE  3 2 0 5 0
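The table above counts missing values over all 40 rows. To see which treatments are empty in the rows that actually feed resp2, the counts can be restricted to the subset. This is a sketch that re-simulates the data from the example; it assumes the rows used for resp2 are those with s2 == TRUE and a non-missing resp2, which I have not checked against the brms internals. `used2`, `counts2`, and `empty2` are names introduced here.

```r
# Re-create the example data (same seed and call order as above)
set.seed(7429)
dat <- data.frame(s1 = rep(c(TRUE, FALSE), each = 20),
                  s2 = rep(c(TRUE, FALSE), times = 20),
                  trt = sample(letters[1:5], size = 40, replace = TRUE),
                  resp1 = rnorm(40),
                  resp2 = rnorm(40),
                  pred = rnorm(40))
dat$resp1[sample(1:nrow(dat), size = 10)] <- NA_real_
dat$resp2[sample(1:nrow(dat), size = 10)] <- NA_real_

# Fix the factor levels so that empty treatments show up as zero counts
dat$trt <- factor(dat$trt, levels = letters[1:5])

# Assumed effective data for resp2: in the subset and not missing
used2   <- dat[dat$s2 & !is.na(dat$resp2), ]
counts2 <- table(used2$trt)
print(counts2)

# Treatments with no usable observations for resp2
empty2 <- names(counts2)[counts2 == 0]
print(empty2)
```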
Additional clues
The expected behavior (= all coefficients in the summary) shows up when:
Reducing to a univariate model with the same data, formula, etc.
mod2 <- brm(resp2 | subset(s2) ~ pred * trt,
            prior = prior(normal(0, 1), class = "Intercept")+
                    prior(normal(0, 1), class = "b")+
                    prior(student_t(3, 0, 1), class = "sigma"),
            data = dat)
All categories have >=1 observation in the subset used for the affected response. To see this, re-simulate the data after set.seed(7430) and re-fit the multivariate model.
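One possible reason the second clue matters, shown with base R only: when a design matrix is built from a subset in which a category never occurs, the columns for that category disappear unless the factor levels are fixed beforehand. The sketch below uses a small made-up data frame (`toy`, with a hypothetical `keep` column) and `model.matrix`. Whether brms builds its per-response design matrices this way is an assumption, not something I have confirmed.

```r
# Toy data: treatment "c" never appears in the kept rows
toy <- data.frame(
  pred = c(0.1, -0.4, 1.2, 0.7, -1.0, 0.3),
  trt  = c("a", "b", "c", "a", "b", "c"),
  keep = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
)
sub <- toy[toy$keep, ]

# Character predictor: levels are inferred from the subset only
cols_chr <- colnames(model.matrix(~ pred * trt, data = sub))

# Factor with all levels declared: columns for "c" are kept (all zeros)
sub_f <- sub
sub_f$trt <- factor(sub_f$trt, levels = c("a", "b", "c"))
cols_fct <- colnames(model.matrix(~ pred * trt, data = sub_f))

# Columns that exist only when the full set of levels is declared
dropped <- setdiff(cols_fct, cols_chr)
print(dropped)
```

Here `dropped` contains the main effect and the interaction for the empty treatment, which mirrors the pattern of coefficients missing from the summary.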
Your work on this package has made so much more possible in my research. Thank you!