-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replicating ggpredict()
with predict()
#360
Comments
Predictions for zero inflated models are always returned on the response scale. The related confidence intervals, however, are based on simulations. See here and Brooks et al But currently, this approach is under revision, at least for glmmtmb. I'm currently traveling, so only short answer. |
library(pscl)
library(marginaleffects)
library(ggeffects)
dat = structure(list(Subject = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 60, 61,
62, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78,
79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 90, 91, 92, 93, 94, 95,
96, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109,
110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122,
123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,
136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148,
149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161,
162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174,
175, 176, 177, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188,
189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199), x = c(23,
26, 33, 29, 26, 25, 25, 28, 32, 28, 29, 31, 21, 35, 24, 22, 29,
33, 31, 25, 34, 31, 22, 18, 32, 31, 22, 35, 28, 35, 30, 25, 25,
27, 31, 27, 31, 39, 24, 30, 34, 30, 33, 27, 18, 33, 26, 31, 28,
30, 24, 26, 19, 24, 39, 28, 33, 29, 29, 31, 12, 40, 33, 30, 30,
32, 30, 25, 28, 34, 23, 27, 27, 38, 30, 33, 31, 25, 26, 25, 31,
31, 32, 30, 29, 37, 24, 26, 37, 25, 20, 31, 26, 33, 33, 27, 27,
22, 20, 29, 24, 29, 27, 24, 30, 32, 24, 30, 29, 30, 28, 22, 28,
27, 26, 24, 26, 28, 29, 32, 25, 28, 28, 33, 30, 21, 27, 33, 29,
21, 26, 33, 33, 24, 33, 33, 31, 33, 28, 37, 29, 25, 28, 29, 41,
28, 30, 25, 31, 35, 23, 35, 27, 31, 23, 30, 28, 31, 22, 28, 25,
23, 29, 30, 21, 30, 41, 28, 36, 32, 37, 27, 23, 33, 24, 36, 31,
27, 33, 34, 35, 30, 30, 27, 27, 34, 36, 19, 33, 32, 25, 32, 27
), z = c(70.46, 83.89, 67.42, 104.57, 119.19,
52.93, 88.74, 53.75, 80.72, 70.14, 50.94, 87.73, 99.31, 87.69,
77.45, 90.41, 68.7, 94.02, 45.59, 70.28, 87.11, 94.11, 69.49,
41.99, 94.18, 57.94, 57.18, 95.37, 47.78, 93.5, 86.28, 97.97,
90.16, 89.01, 77.77, 53.15, 102.49, 104.58, 91.7, 91.64, 92.96,
93.36, 98.4, 109.39, 44.37, 50.79, 76.25, 81.32, 94.34, 76.66,
57.61, 83.24, 93.43, 59.24, 65.69, 94.37, 92.21, 92.72, 48.3,
78.65, 94.93, 92.33, 80.66, 65.85, 91.99, 85.82, 95.77, 83.65,
78.06, 93.11, 85.11, 61.33, 70.99, 97.38, 83.67, 81.34, 82.92,
103.1, 45.08, 77.64, 96.58, 57.1, 52.77, 40.59, 82.16, 40.62,
52.43, 44.77, 93.89, 95.18, 86.33, 55.67, 92.75, 71.19, 54.34,
96.2, 77.59, 69.71, 107.29, 83.38, 101.98, 64.54, 87.14, 63.51,
109.1, 89.68, 93.06, 74.91, 65.96, 93.49, 60.39, 35.44, 86.24,
92.8, 62.21, 91.13, 89.97, 57.97, 90.5, 84.36, 84.46, 75.68,
67.19, 65.98, 91.15, 51.78, 38.87, 77.26, 56.89, 76.78, 45.34,
80.92, 92.76, 101.21, 82.55, 53.12, 89.12, 52.03, 54.68, 73.25,
73.09, 93.29, 89.01, 57.14, 83.94, 93.31, 81.17, 89.39, 82.38,
65.19, 109.35, 89.68, 83.94, 102.96, 92.75, 81.65, 65.74, 62.3,
87.6, 101.99, 103.41, 101.25, 75.94, 104.26, 83.23, 49.33, 98.06,
81.44, 96.57, 94.94, 102.42, 102.83, 95.46, 91.62, 41.27, 74.82,
94.93, 84.2, 88.06, 77.49, 65.99, 46.02, 90.97, 78.3, 94.52,
93.04, 69.65, 53.39, 71.94, 75.18, 67.98, 48.61, 80.52), y = c(7,
8, 3, 0, 0, 4, 10, 0, 4, 0, 9, 2, 5, 0, 4, 5, 4, 3, 0, 4, 0,
6, 2, 5, 0, 2, 2, 0, 2, 2, 4, 5, 2, 4, 0, 0, 8, 3, 3, 4, 2, 2,
0, 2, 15, 0, 3, 6, 3, 4, 3, 11, 4, 3, 2, 5, 0, 0, 2, 2, 6, 0,
3, 0, 3, 0, 6, 0, 0, 0, 6, 7, 3, 0, 0, 3, 7, 5, 4, 3, 0, 0, 4,
0, 6, 7, 15, 14, 0, 3, 4, 0, 5, 0, 2, 2, 6, 8, 4, 3, 0, 5, 5,
3, 0, 0, 6, 0, 0, 6, 14, 17, 2, 5, 5, 5, 0, 2, 2, 3, 4, 2, 7,
6, 3, 5, 2, 0, 5, 9, 5, 5, 0, 14, 2, 4, 0, 4, 2, 3, 7, 2, 3,
0, 0, 9, 5, 7, 4, 0, 7, 4, 4, 0, 9, 4, 0, 2, 4, 2, 3, 2, 1, 2,
7, 8, 2, 4, 4, 5, 12, 5, 5, 2, 14, 1, 4, 5, 1, 8, 2, 1, 5, 5,
1, 0, 1, 10, 1, 8, 2, 1, 9), Family_ID = c("52259_82122", "56037_85858",
"51488_81352", "51730_81594", "52813_82634", "51283_52850_81149",
"51969_81833", "51330_81195", "52385_82248", "52198_82061", "51279_81145",
"51977_81841", "52018_81882", "52901_82723", "51679_81543", "56077_85897",
"52838_82659", "51469_81334", "51418_81283", "55895_85715", "51351_81216",
"56105_85925", "52198_82061", "51537_52707_81401", "51343_81208",
"51678_81542", "55810_85631", "54643_84465", "51283_52850_81149",
"51451_81316", "51820_81684", "51527_81391", "55695_85517", "52925_82747",
"52134_81998", "52321_82184", "51527_81391", "51468_81333", "56183_86002",
"51676_81540", "52941_82763", "55705_85527", "52493_82328_82671",
"54628_84450", "56022_85843_99973", "52921_82743", "56200_86019",
"51566_81430", "52925_82747", "51553_81417", "52586_99991_99992_99993",
"52054_81918", "51750_81614", "52270_82133", "54572_84394", "51303_81168",
"56094_85914", "55741_85562", "51381_81246", "51476_81340", "52076_81940",
"52662_82481", "53251_83073", "51340_81205", "51451_81316", "51798_81662",
"51424_81289", "55895_85715", "52332_82195", "51750_81614", "52110_81974",
"51752_81616", "51433_81298", "55703_85525", "51989_81853", "52872_82694",
"53303_83125", "99996_99997", "51354_81219_82525_82526", "52158_82021",
"52823_82644", "51305_81170", "53251_83073", "56016_85837_99974",
"55892_85712", "52586_99991_99992_99993", "52757_82578", "52769_82590_99986",
"55793_85614", "52345_82208", "51455_81320", "51529_81393_82672",
"56120_85940", "51676_81540", "51421_81286", "52813_82634", "52514_52849_82348_99995",
"51419_81284", "56187_86006", "51295_81161", "56034_85855", "52063_81927",
"55701_85523", "51486_81350", "53171_82993", "51476_81340", "51324_81189",
"51618_81482", "52468_82308", "55706_85528", "55961_85781", "52410_82265_99979",
"51945_81809", "52907_82729", "51673_81537", "51421_81286", "51673_81537",
"51802_81666", "51592_81456", "51782_81646", "51392_81257", "51318_81183",
"52496_82331", "51916_81780", "56073_85894", "51106_80975", "56022_85843_99973",
"51644_81508_99994", "51485_81349", "52761_82582_99976", "51529_81393_82672",
"51798_81662", "51351_81216", "51837_81701", "56094_85914", "51304_81169",
"51477_81341", "55646_85468", "52338_82201", "51294_81160", "55892_85712",
"52228_82091", "52526_82359", "51529_81393_82672", "51300_81166",
"51831_81695", "56150_85970", "51593_81457", "52875_82697", "51347_81212",
"51707_81571", "51833_81697", "51678_81542", "52194_82057", "52110_81974",
"51299_81165", "52069_81933", "55705_85527", "51478_81342", "55766_85587",
"51559_81423", "52809_82630", "52941_82763", "55694_85516", "51784_81648",
"56016_85837_99974", "51527_81391", "52251_82114", "55686_85508",
"51480_81344", "52364_82227", "55825_85646", "53000_82822", "51613_81477",
"52564_82389", "51639_81503", "52343_82206", "51833_81697", "52926_82748",
"51394_81259", "51716_82528", "51380_81245", "51555_81419", "51336_81201",
"51336_81201", "52844_82665", "52228_82091", "56083_85903", "51460_81325",
"52006_81870", "51956_81820", "51672_81536", "55788_85609")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -193L))
dat_s = transform(dat,
x = as.numeric(scale(x)),
z = as.numeric(scale(z)))
m <- zeroinfl(y ~ x * z, data = dat_s, dist = "negbin")
gge <- ggpredict(m, c("x", "z")) |>
as.data.frame() |>
datawizard::data_arrange(c("group", "x"))
nd <- data_grid(m, c("x", "z"))
pr <- predict(m, newdata = nd, type = "count")
cbind(gge[c(1:2)], pr)
#> x predicted pr
#> 1 -4.0 15.173110 15.173110
#> 2 -3.5 13.024143 13.024143
#> 3 -3.0 11.179534 11.179534
#> 4 -2.5 9.596178 9.596178
#> 5 -2.0 8.237072 8.237072
#> 6 -1.5 7.070456 7.070456
#> 7 -1.0 6.069067 6.069067
#> 8 -0.5 5.209506 5.209506
#> 9 0.0 4.471684 4.471684
#> 10 0.5 3.838359 3.838359
#> 11 1.0 3.294733 3.294733
#> 12 1.5 2.828100 2.828100
#> 13 2.0 2.427556 2.427556
#> 14 2.5 2.083742 2.083742
#> 15 -4.0 9.860040 9.860040
#> 16 -3.5 8.880999 8.880999
#> 17 -3.0 7.999171 7.999171
#> 18 -2.5 7.204903 7.204903
#> 19 -2.0 6.489501 6.489501
#> 20 -1.5 5.845133 5.845133
#> 21 -1.0 5.264748 5.264748
#> 22 -0.5 4.741991 4.741991
#> 23 0.0 4.271140 4.271140
#> 24 0.5 3.847043 3.847043
#> 25 1.0 3.465055 3.465055
#> 26 1.5 3.120997 3.120997
#> 27 2.0 2.811101 2.811101
#> 28 2.5 2.531976 2.531976
#> 29 -4.0 6.407414 6.407414
#> 30 -3.5 6.055842 6.055842
#> 31 -3.0 5.723560 5.723560
#> 32 -2.5 5.409511 5.409511
#> 33 -2.0 5.112693 5.112693
#> 34 -1.5 4.832161 4.832161
#> 35 -1.0 4.567023 4.567023
#> 36 -0.5 4.316432 4.316432
#> 37 0.0 4.079591 4.079591
#> 38 0.5 3.855746 3.855746
#> 39 1.0 3.644182 3.644182
#> 40 1.5 3.444228 3.444228
#> 41 2.0 3.255244 3.255244
#> 42 2.5 3.076630 3.076630
gge <- ggpredict(m, c("x", "z"), type = "zero_inflated") |>
as.data.frame() |>
datawizard::data_arrange(c("group", "x"))
#> Error: Confidence intervals could not be computed.
#> Possibly a polynomial term is held constant (and does not appear in the `terms`-argument). Or try reducing number of simulation, using argument `nsim` (e.g. `nsim = 100`).
pr <- predict(m, newdata = nd, type = "response")
cbind(gge[c(1:2)], pr)
#> x predicted pr
#> 1 -4.0 15.0455972 15.0455972
#> 2 -3.5 12.8644765 12.8644765
#> 3 -3.0 10.9799616 10.9799616
#> 4 -2.5 9.3473671 9.3473671
#> 5 -2.0 7.9280255 7.9280255
#> 6 -1.5 6.6886225 6.6886225
#> 7 -1.0 5.6008281 5.6008281
#> 8 -0.5 4.6412670 4.6412670
#> 9 0.0 3.7918215 3.7918215
#> 10 0.5 3.0401065 3.0401065
#> 11 1.0 2.3796757 2.3796757
#> 12 1.5 1.8092538 1.8092538
#> 13 2.0 1.3303978 1.3303978
#> 14 2.5 0.9438429 0.9438429
#> 15 -4.0 9.7799500 9.7799500
#> 16 -3.5 8.7737046 8.7737046
#> 17 -3.0 7.8557104 7.8557104
#> 18 -2.5 7.0136347 7.0136347
#> 19 -2.0 6.2355627 6.2355627
#> 20 -1.5 5.5100449 5.5100449
#> 21 -1.0 4.8264359 4.8264359
#> 22 -0.5 4.1757015 4.1757015
#> 23 0.0 3.5518339 3.5518339
#> 24 0.5 2.9537599 2.9537599
#> 25 1.0 2.3870383 2.3870383
#> 26 1.5 1.8639475 1.8639475
#> 27 2.0 1.4005811 1.4005811
#> 28 2.5 1.0112068 1.0112068
#> 29 -4.0 6.3571099 6.3571099
#> 30 -3.5 5.9837407 5.9837407
#> 31 -3.0 5.6204351 5.6204351
#> 32 -2.5 5.2624792 5.2624792
#> 33 -2.0 4.9040515 4.9040515
#> 34 -1.5 4.5381588 4.5381588
#> 35 -1.0 4.1569401 4.1569401
#> 36 -0.5 3.7527058 3.7527058
#> 37 0.0 3.3201310 3.3201310
#> 38 0.5 2.8596709 2.8596709
#> 39 1.0 2.3811990 2.3811990
#> 40 1.5 1.9053526 1.9053526
#> 41 2.0 1.4598454 1.4598454
#> 42 2.5 1.0709699 1.0709699 Created on 2023-08-07 with reprex v2.0.2 |
This function fails to compute standard errors: ggeffects/R/predict_zero_inflation.R Line 229 in bdbf697
Basically, predicted values are correct, it's just that SE cannot be calculated. The idea why I use simulations to calculate the SE is based on the discussion in Brooks et al. and from the glmmTMB-GitHub repo. The package authors argue that predicted values on the response scales are easy to calculate, however, standard errors from the logit and the count model cannot be simply computed by multiplying them. |
Ok, this was not true, by default, the count component is returned. |
I'm, not sure this is a ggeffects issue, looks like the issue is that there's no library(pscl)
#> Classes and Methods for R developed in the
#> Political Science Computational Laboratory
#> Department of Political Science
#> Stanford University
#> Simon Jackman
#> hurdle and zeroinfl functions by Achim Zeileis
library(glmmTMB)
data(Salamanders)
m1 <- zeroinfl(count ~ mined | mined, dist = "poisson", data = Salamanders)
clubSandwich::vcovCR(m1, type = "CR1", cluster = Salamanders$site)
#> Registered S3 method overwritten by 'clubSandwich':
#> method from
#> bread.mlm sandwich
#> Error in nobs.default(obj): no 'nobs' method is available Created on 2023-08-08 with reprex v2.0.2 |
Ah great, thanks a lot for taking the time to clarify. I can confirm that the predictions are exactly identical to both Thanks also for the point about simulations. Yes, the standard errors are not easy to compute with simple matrix multiplications, but in principle I don't think this should affect the numerical derivatives approach I use in Agree this is not a Thanks! |
Hi @strengejacke,
I’m trying to replicate results with
ggpredict()
to answer this question on SO: https://stackoverflow.com/questions/76780457/using-cluster-robust-ses-within-a-zero-inflated-negative-binomial-model-through/76824215?noredirect=1#comment135445721_76824215Do you know why:
ggpredict()
does not appear to return the same predictions aspredict()
ggpredict
than withmarginaleffects::predictions()
and delta method?The text was updated successfully, but these errors were encountered: