-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
predict_response removes unobserved dv levels from clmm? #519
Comments
Not sure if this might help figure out whether / what the bug is here, but since as I understand it predict_response() is calling emmeans, and I actually get a similar issue from emmeans, I went back to look at the FAQ for emmeans. I may be misunderstanding his explanation, but maybe the problem is that emmeans turns unobserved levels of an ordinal response into NAs? I can't quite tell if that's intended behavior for mode="mean.class", but if it is intentional, I'd suppose this is not a bug after all - though it might be good to have a warning. Here's the additional info: When I apply emmeans to the same ordinal model I named above, and set mode="mean.class" as below, , the mean.class estimates are almost uniformly 2s, as though they've also ignored the "no_definitely" level and dropped "yes_maybe" from level 3 to level2.
The emmeans FAQ I have in mind is below (I've bolded the parts that I take to be most relevant). It sounds like it's saying that using
|
Do you have a reproducible example? |
My apologies, I didn't realize what I gave up wasn't fully reproducible. I found the reprex package and made one using it, but let me know if this still isn't everything needed. I post this with the caveat that I may have just realized that if the problem is more likely to be with ordinal::clmm, but it may also just be my misunderstanding of how clmm's handle missing levels. Since I've asked you though, I suppose it makes more sense to have you confirm whether or not there's a bug with ggeffects than potentially come back to ask for your attention a second time later. library(tidyverse)
library(reprex)
library(ggeffects)
#> Warning: package 'ggeffects' was built under R version 4.3.3
library(emmeans)
library(ordinal)
#>
#> Attaching package: 'ordinal'
#> The following object is masked from 'package:dplyr':
#>
#> slice
reproExample<-tibble::tribble(
~subID, ~rType, ~response_rating,
6, "type4", "yes_maybe",
6, "type5", "yes_maybe",
6, "type3", "yes_maybe",
6, "type2", "yes_maybe",
6, "type1", "yes_maybe",
11, "type4", "yes_maybe",
11, "type5", "yes_maybe",
11, "type3", "yes_maybe",
11, "type2", "yes_maybe",
11, "type1", "yes_maybe",
16, "type4", "yes_definitely",
16, "type5", "yes_definitely",
16, "type3", "yes_definitely",
16, "type2", "yes_definitely",
16, "type1", "yes_definitely",
28, "type4", "yes_maybe",
28, "type5", "yes_maybe",
28, "type3", "yes_maybe",
28, "type2", "yes_maybe",
28, "type1", "yes_maybe",
33, "type4", "yes_maybe",
33, "type5", "yes_maybe",
33, "type3", "yes_maybe",
33, "type2", "yes_maybe",
33, "type1", "yes_maybe",
34, "type4", "yes_maybe",
34, "type5", "yes_maybe",
34, "type3", "yes_maybe",
34, "type2", "yes_maybe",
34, "type1", "yes_maybe",
36, "type4", "yes_maybe",
36, "type5", "yes_maybe",
36, "type3", "yes_maybe",
36, "type2", "yes_maybe",
36, "type1", "yes_maybe",
37, "type4", "yes_maybe",
37, "type5", "no_maybe",
37, "type3", "no_maybe",
37, "type2", "no_maybe",
37, "type1", "yes_maybe",
39, "type4", "yes_maybe",
39, "type5", "yes_maybe",
39, "type3", "yes_maybe",
39, "type2", "yes_maybe",
39, "type1", "yes_maybe",
40, "type4", "yes_maybe",
40, "type5", "yes_maybe",
40, "type3", "yes_maybe",
40, "type2", "yes_maybe",
40, "type1", "yes_maybe",
41, "type4", "yes_maybe",
41, "type5", "yes_maybe",
41, "type3", "yes_maybe",
41, "type2", "yes_maybe",
41, "type1", "yes_maybe",
46, "type4", "yes_maybe",
46, "type5", "yes_maybe",
46, "type3", "yes_maybe",
46, "type2", "yes_maybe",
46, "type1", "yes_maybe",
51, "type4", "yes_maybe",
51, "type5", "yes_maybe",
51, "type3", "yes_maybe",
51, "type2", "yes_maybe",
51, "type1", "yes_maybe",
52, "type4", "yes_definitely",
52, "type5", "yes_definitely",
52, "type3", "yes_definitely",
52, "type2", "yes_definitely",
52, "type1", "yes_definitely",
57, "type4", "yes_maybe",
57, "type5", "yes_maybe",
57, "type3", "yes_maybe",
57, "type2", "yes_maybe",
57, "type1", "yes_maybe",
60, "type4", "yes_maybe",
60, "type5", "yes_maybe",
60, "type3", "yes_maybe",
60, "type2", "yes_maybe",
60, "type1", "yes_maybe",
61, "type4", "yes_maybe",
61, "type5", "yes_maybe",
61, "type3", "yes_maybe",
61, "type2", "yes_maybe",
61, "type1", "yes_maybe",
63, "type4", "yes_maybe",
63, "type5", "yes_maybe",
63, "type3", "yes_maybe",
63, "type2", "yes_maybe",
63, "type1", "yes_maybe",
65, "type4", "yes_maybe",
65, "type5", "yes_maybe",
65, "type3", "yes_maybe",
65, "type2", "yes_maybe",
65, "type1", "yes_definitely",
68, "type4", "yes_maybe",
68, "type5", "yes_maybe",
68, "type3", "yes_maybe",
68, "type2", "yes_maybe",
68, "type1", "yes_maybe",
70, "type4", "yes_maybe",
70, "type5", "yes_maybe",
70, "type3", "yes_maybe",
70, "type2", "yes_maybe",
70, "type1", "yes_maybe",
71, "type4", "yes_maybe",
71, "type5", "yes_maybe",
71, "type3", "yes_maybe",
71, "type2", "yes_maybe",
71, "type1", "yes_maybe",
72, "type4", "yes_maybe",
72, "type5", "yes_maybe",
72, "type3", "yes_maybe",
72, "type2", "yes_maybe",
72, "type1", "yes_maybe",
76, "type4", "yes_maybe",
76, "type5", "yes_maybe",
76, "type3", "yes_maybe",
76, "type2", "yes_maybe",
76, "type1", "no_maybe",
77, "type4", "yes_maybe",
77, "type5", "yes_maybe",
77, "type3", "yes_maybe",
77, "type2", "yes_maybe",
77, "type1", "yes_maybe",
78, "type4", "no_maybe",
78, "type5", "no_maybe",
78, "type3", "no_maybe",
78, "type2", "no_maybe",
78, "type1", "no_maybe",
81, "type4", "yes_maybe",
81, "type5", "yes_maybe",
81, "type3", "yes_maybe",
81, "type2", "yes_maybe",
81, "type1", "yes_maybe",
82, "type4", "yes_maybe",
82, "type5", "yes_maybe",
82, "type3", "yes_maybe",
82, "type2", "yes_maybe",
82, "type1", "yes_maybe",
83, "type4", "yes_maybe",
83, "type5", "yes_maybe",
83, "type3", "yes_maybe",
83, "type2", "yes_maybe",
83, "type1", "yes_maybe",
85, "type4", "yes_maybe",
85, "type5", "yes_maybe",
85, "type3", "yes_maybe",
85, "type2", "yes_maybe",
85, "type1", "yes_maybe",
88, "type4", "yes_maybe",
88, "type5", "yes_maybe",
88, "type3", "yes_maybe",
88, "type2", "yes_maybe",
88, "type1", "yes_maybe",
90, "type4", "yes_maybe",
90, "type5", "yes_maybe",
90, "type3", "yes_maybe",
90, "type2", "yes_maybe",
90, "type1", "yes_maybe",
92, "type4", "yes_maybe",
92, "type5", "yes_maybe",
92, "type3", "yes_maybe",
92, "type2", "yes_maybe",
92, "type1", "yes_maybe",
93, "type4", "yes_maybe",
93, "type5", "yes_maybe",
93, "type3", "yes_maybe",
93, "type2", "yes_maybe",
93, "type1", "yes_maybe",
95, "type4", "yes_maybe",
95, "type5", "yes_maybe",
95, "type3", "yes_maybe",
95, "type2", "yes_maybe",
95, "type1", "yes_maybe",
96, "type4", "yes_maybe",
96, "type5", "yes_maybe",
96, "type3", "yes_maybe",
96, "type2", "yes_maybe",
96, "type1", "yes_maybe",
97, "type4", "yes_maybe",
97, "type5", "yes_maybe",
97, "type3", "yes_maybe",
97, "type2", "yes_maybe",
97, "type1", "yes_maybe",
98, "type4", "yes_maybe",
98, "type5", "yes_maybe",
98, "type3", "yes_maybe",
98, "type2", "yes_maybe",
98, "type1", "yes_maybe",
99, "type4", "yes_definitely",
99, "type5", "yes_definitely",
99, "type3", "yes_definitely",
99, "type2", "yes_definitely",
99, "type1", "yes_maybe",
102, "type4", "yes_maybe",
102, "type5", "yes_maybe",
102, "type3", "yes_maybe",
102, "type2", "yes_maybe",
102, "type1", "yes_maybe",
103, "type4", "yes_maybe",
103, "type5", "yes_maybe",
103, "type3", "yes_maybe",
103, "type2", "yes_maybe",
103, "type1", "yes_maybe",
108, "type4", "yes_definitely",
108, "type5", "yes_definitely",
108, "type3", "yes_definitely",
108, "type2", "yes_definitely",
108, "type1", "yes_definitely",
111, "type4", "yes_maybe",
111, "type5", "yes_maybe",
111, "type3", "yes_maybe",
111, "type2", "yes_maybe",
111, "type1", "yes_maybe",
112, "type4", "yes_maybe",
112, "type5", "yes_maybe",
112, "type3", "yes_maybe",
112, "type2", "yes_maybe",
112, "type1", "yes_maybe",
113, "type4", "yes_maybe",
113, "type5", "yes_maybe",
113, "type3", "yes_maybe",
113, "type2", "yes_maybe",
113, "type1", "yes_maybe",
115, "type4", "yes_maybe",
115, "type5", "yes_maybe",
115, "type3", "yes_maybe",
115, "type2", "yes_maybe",
115, "type1", "yes_maybe",
116, "type4", "yes_maybe",
116, "type5", "yes_maybe",
116, "type3", "yes_maybe",
116, "type2", "yes_maybe",
116, "type1", "yes_maybe",
117, "type4", "yes_maybe",
117, "type5", "yes_maybe",
117, "type3", "yes_maybe",
117, "type2", "yes_maybe",
117, "type1", "yes_maybe",
119, "type4", "yes_definitely",
119, "type5", "yes_definitely",
119, "type3", "yes_definitely",
119, "type2", "yes_definitely",
119, "type1", "yes_definitely",
127, "type4", "yes_maybe",
127, "type5", "yes_definitely",
127, "type3", "yes_definitely",
127, "type2", "yes_definitely",
127, "type1", "yes_maybe",
130, "type4", "yes_definitely",
130, "type5", "yes_definitely",
130, "type3", "yes_definitely",
130, "type2", "yes_definitely",
130, "type1", "yes_definitely",
134, "type4", "yes_maybe",
134, "type5", "yes_maybe",
134, "type3", "yes_maybe",
134, "type2", "yes_maybe",
134, "type1", "yes_maybe",
135, "type4", "yes_maybe",
135, "type5", "yes_maybe",
135, "type3", "yes_maybe",
135, "type2", "yes_maybe",
135, "type1", "yes_maybe",
148, "type4", "yes_maybe",
148, "type5", "yes_maybe",
148, "type3", "yes_maybe",
148, "type2", "yes_maybe",
148, "type1", "yes_maybe",
150, "type4", "yes_maybe",
150, "type5", "yes_maybe",
150, "type3", "yes_maybe",
150, "type2", "yes_maybe",
150, "type1", "yes_maybe",
152, "type4", "yes_definitely",
152, "type5", "yes_definitely",
152, "type3", "yes_definitely",
152, "type2", "yes_definitely",
152, "type1", "yes_definitely",
156, "type4", "yes_maybe",
156, "type5", "yes_maybe",
156, "type3", "yes_maybe",
156, "type2", "yes_maybe",
156, "type1", "yes_maybe",
161, "type4", "yes_maybe",
161, "type5", "yes_maybe",
161, "type3", "yes_maybe",
161, "type2", "yes_maybe",
161, "type1", "no_maybe",
168, "type4", "no_maybe",
168, "type5", "yes_maybe",
168, "type3", "yes_maybe",
168, "type2", "no_maybe",
168, "type1", "yes_maybe",
169, "type4", "yes_definitely",
169, "type5", "yes_definitely",
169, "type3", "yes_definitely",
169, "type2", "yes_definitely",
169, "type1", "yes_definitely",
173, "type4", "yes_maybe",
173, "type5", "yes_maybe",
173, "type3", "yes_maybe",
173, "type2", "yes_maybe",
173, "type1", "yes_definitely",
174, "type4", "yes_maybe",
174, "type5", "yes_maybe",
174, "type3", "yes_maybe",
174, "type2", "yes_maybe",
174, "type1", "yes_maybe",
180, "type4", "yes_definitely",
180, "type5", "yes_definitely",
180, "type3", "yes_definitely",
180, "type2", "yes_definitely",
180, "type1", "yes_definitely",
182, "type4", "yes_maybe",
182, "type5", "yes_maybe",
182, "type3", "yes_maybe",
182, "type2", "yes_maybe",
182, "type1", "no_maybe",
186, "type4", "yes_maybe",
186, "type5", "yes_maybe",
186, "type3", "yes_maybe",
186, "type2", "yes_maybe",
186, "type1", "yes_maybe",
188, "type4", "no_maybe",
188, "type5", "no_maybe",
188, "type3", "no_maybe",
188, "type2", "no_maybe",
188, "type1", "no_maybe",
189, "type4", "yes_maybe",
189, "type5", "yes_maybe",
189, "type3", "yes_maybe",
189, "type2", "yes_maybe",
189, "type1", "yes_maybe",
191, "type4", "yes_maybe",
191, "type5", "yes_maybe",
191, "type3", "yes_maybe",
191, "type2", "yes_maybe",
191, "type1", "yes_maybe",
192, "type4", "yes_maybe",
192, "type5", "yes_definitely",
192, "type3", "yes_maybe",
192, "type2", "yes_maybe",
192, "type1", "yes_maybe",
193, "type4", "yes_maybe",
193, "type5", "yes_maybe",
193, "type3", "yes_maybe",
193, "type2", "yes_maybe",
193, "type1", "yes_maybe",
198, "type4", "yes_maybe",
198, "type5", "yes_maybe",
198, "type3", "yes_maybe",
198, "type2", "yes_maybe",
198, "type1", "yes_maybe",
203, "type4", "yes_maybe",
203, "type5", "yes_maybe",
203, "type3", "yes_maybe",
203, "type2", "yes_maybe",
203, "type1", "yes_maybe",
206, "type4", "yes_maybe",
206, "type5", "yes_maybe",
206, "type3", "yes_maybe",
206, "type2", "yes_maybe",
206, "type1", "yes_maybe"
)%>%
mutate(rType = factor(rType, levels = c("type1", "type2", "type3", "type4", "type5")),
response_rating = factor(response_rating, levels = c("no_definitely", "no_maybe", "yes_maybe", "yes_definitely")))
# here's what the data looks like
glimpse(reproExample)
#> Rows: 365
#> Columns: 3
#> $ subID <dbl> 6, 6, 6, 6, 6, 11, 11, 11, 11, 11, 16, 16, 16, 16, 16,…
#> $ rType <fct> type4, type5, type3, type2, type1, type4, type5, type3…
#> $ response_rating <fct> yes_maybe, yes_maybe, yes_maybe, yes_maybe, yes_maybe,…
# The factor DOES encode all 4 levels
levels(reproExample$response_rating)
#> [1] "no_definitely" "no_maybe" "yes_maybe" "yes_definitely"
# but there are no observations of "no_definitely": note that since yes_maybe = 3 on the 4 point scale,
# these talleis imply the mean of the data should be close to 3
reproExample%>%
group_by(response_rating)%>%
tally()
#> # A tibble: 3 × 2
#> response_rating n
#> <fct> <int>
#> 1 no_maybe 18
#> 2 yes_maybe 297
#> 3 yes_definitely 50
# I only disovered that the level was being dropped despite showing up in levels(reproExample$response_rating)
# when ggeffects::predict_response didn't output any predictions for no_definitely
problemDiscoveryPart1_noLevel1<-predict_response(clmm(response_rating ~ rType + (1|subID), data=reproExample),
terms=c("rType"), margin = "marginalmeans")
problemDiscoveryPart1_noLevel1
#> # Predicted probabilities of response_rating
#>
#> response_rating: 1
#>
#> rType | Predicted | 95% CI
#> -------------------------------
#> type1 | 0.00 | 0.00, 0.00
#> type2 | 0.00 | 0.00, 0.00
#> type3 | 0.00 | 0.00, 0.00
#> type4 | 0.00 | 0.00, 0.00
#> type5 | 0.00 | 0.00, 0.00
#>
#> response_rating: 2
#>
#> rType | Predicted | 95% CI
#> -------------------------------
#> type1 | 1.00 | 1.00, 1.00
#> type2 | 1.00 | 1.00, 1.00
#> type3 | 1.00 | 1.00, 1.00
#> type4 | 1.00 | 1.00, 1.00
#> type5 | 1.00 | 1.00, 1.00
#>
#> response_rating: 3
#>
#> rType | Predicted | 95% CI
#> -------------------------------
#> type1 | 0.00 | 0.00, 0.00
#> type2 | 0.00 | 0.00, 0.00
#> type3 | 0.00 | 0.00, 0.00
#> type4 | 0.00 | 0.00, 0.00
#> type5 | 0.00 | 0.00, 0.00
# but it turns out emmeans also doesn't seem to use no_definitely, as the mean is close to 2 instead of 3
problemDiscoveryPart2_alsoNoLevel1<-emmeans(clmm(response_rating ~ rType + (1|subID), data=reproExample), ~ rType, mode="mean.class")
problemDiscoveryPart2_alsoNoLevel1
#> rType mean.class SE df asymp.LCL asymp.UCL
#> type1 1.99999 4.06977e-05 Inf 1.99991 2.00007
#> type2 2.00001 3.61404e-05 Inf 1.99994 2.00008
#> type3 2.00003 5.12961e-05 Inf 1.99993 2.00013
#> type4 2.00001 3.61405e-05 Inf 1.99994 2.00008
#> type5 2.00005 8.38010e-05 Inf 1.99989 2.00022
#>
#> Confidence level used: 0.95
# I may have discovered that the problem is actually clmm? I'm now not sure if this is a bug or my misunderstanding of the function
problemDiscoveryPart3_alsoNoLevel1<-clmm(response_rating ~ rType + (1|subID), data=reproExample)
summary(problemDiscoveryPart3_alsoNoLevel1)
#> Cumulative Link Mixed Model fitted with the Laplace approximation
#>
#> formula: response_rating ~ rType + (1 | subID)
#> data: reproExample
#>
#> link threshold nobs logLik AIC niter max.grad cond.H
#> logit flexible 365 -75.62 165.24 330(3819) 4.96e-05 1.4e+02
#>
#> Random effects:
#> Groups Name Variance Std.Dev.
#> subID (Intercept) 261.2 16.16
#> Number of groups: subID 73
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> rTypetype2 0.5559 1.0593 0.525 0.600
#> rTypetype3 1.1149 1.0752 1.037 0.300
#> rTypetype4 0.5559 1.0593 0.525 0.600
#> rTypetype5 1.6810 1.1040 1.523 0.128
#>
#> Threshold coefficients:
#> Estimate Std. Error z value
#> no_maybe|yes_maybe -10.80 1.87 -5.772
#> yes_maybe|yes_definitely 11.44 1.65 6.933 Created on 2024-05-18 with reprex v2.0.2 |
Hi!
I'm running predict_response on a clmm. The dv, response_rating, has 4 levels:
no_definitely = 1
no_maybe = 2
yes_maybe = 3
yes_definitely = 4.
But, "no_definitely" never actually appears in the data (almost all responses are "yes_maybe"). The "no_definitely" level is still listed as one of the factor levels, so I'd expect predict_response to just assign it a 0 probability or something (which is what it does for the other low-observation-frequency levels).
But instead, predict response drops it entirely, which means that not only does it not get a prediction, but the response.levels are reassigned so that "no_maybe" becomes the 1. I think I would have expected it to just list the predicted probabilities for no_definitely as 0, and I certainly wouldn't have expected it to reassign response-level numbers in an ordinal model since my understanding is that it would need those numbers to apply them correctly to the cut thresholds?
I can't find anything in the documentation or stackoverflow, but an issue was raised here for brms that made me think it could just be an issue with how predict_response handles unobserved levels?
Here's what the data looks like (I think this is reproducible, but let me know if something's missing - I don't think I've ever actually posted a question like this before).
Thanks much!
The text was updated successfully, but these errors were encountered: