replacing `{broom}` and `{broom.mixed}` tidiers with `{parameters}` package to reduce no. of dependencies #152

IndrajeetPatil · 2021-02-18T14:58:21Z

Before opening a PR for this, I wanted to ask whether you would be open to the change. If you agree, I will open one.

rationale

parameters (https://easystats.github.io/parameters/) has far fewer dependencies and can handle pretty much every model that broom and broom.mixed support combined. It also offers a number of features not in broom (e.g., robust SEs and standardization).

dependency calculations

tools::package_dependencies(c("broom", "broom.mixed", "parameters"), recursive = TRUE)
#> $broom
#>  [1] "backports"    "dplyr"        "ellipsis"     "generics"     "glue"        
#>  [6] "methods"      "purrr"        "rlang"        "stringr"      "tibble"      
#> [11] "tidyr"        "ggplot2"      "lifecycle"    "magrittr"     "R6"          
#> [16] "tidyselect"   "utils"        "vctrs"        "pillar"       "digest"      
#> [21] "grDevices"    "grid"         "gtable"       "isoband"      "MASS"        
#> [26] "mgcv"         "scales"       "stats"        "withr"        "stringi"     
#> [31] "fansi"        "pkgconfig"    "cpp11"        "graphics"     "nlme"        
#> [36] "Matrix"       "splines"      "cli"          "crayon"       "utf8"        
#> [41] "farver"       "labeling"     "munsell"      "RColorBrewer" "viridisLite" 
#> [46] "tools"        "lattice"      "colorspace"  
#> 
#> $broom.mixed
#>  [1] "broom"        "coda"         "dplyr"        "methods"      "nlme"        
#>  [6] "purrr"        "stringr"      "tibble"       "tidyr"        "backports"   
#> [11] "ellipsis"     "generics"     "glue"         "rlang"        "ggplot2"     
#> [16] "lattice"      "lifecycle"    "magrittr"     "R6"           "tidyselect"  
#> [21] "utils"        "vctrs"        "pillar"       "graphics"     "stats"       
#> [26] "stringi"      "fansi"        "pkgconfig"    "cpp11"        "grDevices"   
#> [31] "digest"       "grid"         "gtable"       "isoband"      "MASS"        
#> [36] "mgcv"         "scales"       "withr"        "cli"          "crayon"      
#> [41] "utf8"         "tools"        "Matrix"       "splines"      "farver"      
#> [46] "labeling"     "munsell"      "RColorBrewer" "viridisLite"  "colorspace"  
#> 
#> $parameters
#> [1] "bayestestR" "datawizard" "insight"    "graphics"   "methods"   
#> [6] "stats"      "utils"

^{Created on 2021-11-03 by the reprex package (v2.0.1)}
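
The lists above are easier to compare as counts. A minimal sketch, assuming the named list that `tools::package_dependencies()` returns (the live call needs access to a CRAN mirror); `count_deps` is a hypothetical helper name, and the sample input copies only the `parameters` entry from the output above:

```r
# Hypothetical helper: number of recursive dependencies per package,
# given the named list returned by tools::package_dependencies().
count_deps <- function(deps) {
  sort(lengths(deps), decreasing = TRUE)
}

# Sample input: the `parameters` entry copied from the output above.
deps_sample <- list(
  parameters = c("bayestestR", "datawizard", "insight", "graphics",
                 "methods", "stats", "utils")
)
dep_counts <- count_deps(deps_sample)
dep_counts

# With access to a CRAN mirror, the full comparison would be:
# count_deps(tools::package_dependencies(
#   c("broom", "broom.mixed", "parameters"), recursive = TRUE
# ))
```

Counting the entries printed above gives 48 for broom, 50 for broom.mixed, and 7 for parameters (base packages such as stats and utils included).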

example with `merMod`

library(lme4)
#> Loading required package: Matrix
library(magrittr)
library(parameters)

lmer_mod <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

broom.mixed::tidy(lmer_mod, effects = "fixed")
#> # A tibble: 2 x 5
#>   effect term        estimate std.error statistic
#>   <chr>  <chr>          <dbl>     <dbl>     <dbl>
#> 1 fixed  (Intercept)    251.       6.82     36.8 
#> 2 fixed  Days            10.5      1.55      6.77

parameters::standardize_names(parameters::model_parameters(lmer_mod), style = "broom") %>%
  tibble::as_tibble()
#> # A tibble: 2 x 9
#>   term  estimate std.error conf.level conf.low conf.high statistic df.error
#>   <chr>    <dbl>     <dbl>      <dbl>    <dbl>     <dbl>     <dbl>    <int>
#> 1 (Int…    251.       6.82       0.95   238.       265.      36.8       174
#> 2 Days      10.5      1.55       0.95     7.44      13.5      6.77      174
#> # … with 1 more variable: p.value <dbl>

example with `lm`

lm_mod <- lm(Reaction ~ Days, sleepstudy)

broom::tidy(lm_mod)
#> # A tibble: 2 x 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    251.       6.61     38.0  2.16e-87
#> 2 Days            10.5      1.24      8.45 9.89e-15

parameters::standardize_names(parameters::model_parameters(lm_mod), style = "broom") %>%
  tibble::as_tibble()
#> # A tibble: 2 x 9
#>   term  estimate std.error conf.level conf.low conf.high statistic df.error
#>   <chr>    <dbl>     <dbl>      <dbl>    <dbl>     <dbl>     <dbl>    <int>
#> 1 (Int…    251.       6.61       0.95   238.       264.      38.0       178
#> 2 Days      10.5      1.24       0.95     8.02      12.9      8.45      178
#> # … with 1 more variable: p.value <dbl>

^{Created on 2021-02-18 by the reprex package (v1.0.0)}

datalorax · 2021-02-18T17:36:19Z

I like the general idea but this would be a massive change and I'm not sure it's worth it. A lot of the current codebase depends on the output from broom looking exactly as it does now, so it would require considerable refactoring. For example, the lme4::lmer() code depends on having the effect column to delineate between fixed and random effects.
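
To make the concern concrete, here is a rough sketch of the kind of downstream code that relies on the `effect` column. It is not the package's actual code: `tidy_out` is a hand-built stand-in for `broom.mixed::tidy()` output, with rounded values copied from the sleepstudy reprex later in this thread.

```r
# Hand-built stand-in for broom.mixed::tidy() output on an lmer model
# (an assumption for illustration; not produced by fitting a model here).
tidy_out <- data.frame(
  effect = c("fixed", "fixed", "ran_pars", "ran_pars", "ran_pars", "ran_pars"),
  group = c(NA, NA, "Subject", "Subject", "Subject", "Residual"),
  term = c("(Intercept)", "Days", "sd__(Intercept)",
           "cor__(Intercept).Days", "sd__Days", "sd__Observation"),
  estimate = c(251, 10.5, 24.7, 0.0656, 5.92, 25.6),
  stringsAsFactors = FALSE
)

# Downstream code can only separate the two kinds of rows if `effect`
# exists and uses exactly these labels.
fixed_terms <- tidy_out[tidy_out$effect == "fixed", ]
random_terms <- tidy_out[tidy_out$effect == "ran_pars", ]

fixed_terms$term
random_terms$term
```

Any replacement tidier would need to supply an equivalent column, or every such filter would have to be rewritten.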

The other thing that worries me a little bit is just that broom is a really established package with considerable support around maintaining it. I've never really looked into parameters. It looks like it's pretty well maintained too. But it would still worry me a bit.

So I guess I'm leaning toward no thanks, but I'm happy to engage in the conversation a bit more.

IndrajeetPatil · 2021-02-18T17:51:04Z

> For example, the lme4::lmer() code depends on having the effect column to delineate between fixed and random effects.

Hmm, that's a fair point. This is indeed a context where the parameters output won't exactly line up with the broom.mixed output, and that is a good enough reason not to make the switch for now.

> The other thing that worries me a little bit is just that broom is a really established package with considerable support around maintaining it.

As someone who has contributed to both of these packages, I can vouch for the rigor and speed with which parameters is maintained (it is < 2 years old and already supports more models than broom and broom.mixed combined) and, in a few years, it will be as well-established as broom was at its age. 😉

> So I guess I'm leaning toward no thanks, but I'm happy to engage in the conversation a bit more.

We can revisit this once parameters behaves the same way as broom.mixed when it comes to random effects, since the switch would then require minimal refactoring.

datalorax · 2021-02-18T17:52:48Z

Sounds good to me. Thanks.

datalorax · 2021-03-05T16:54:44Z

Okay, I appreciate it. I'm hoping to come back to work on some bugs and things here in the next couple weeks. I suppose we could use the GitHub version as a dependency for now and then wait until they push to CRAN before our next release.

IndrajeetPatil · 2021-11-03T19:24:55Z

Just wanted to post another reprex, this time with CRAN versions of both packages.

As far as I can see, there are just two (IMO) minor differences, though I'm not sure how much difference they make to your code:

- random effects are called ran_pars in {broom.mixed}, but random in {parameters}
- group column strings are surrounded in ""

library(lme4)
#> Loading required package: Matrix
library(broom.mixed)
library(tibble)
library(parameters)

options(tibble.width = Inf)

mod <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# `broom.mixed` output --------------------------------

tidy(mod)
#> # A tibble: 6 x 6
#>   effect   group    term                  estimate std.error statistic
#>   <chr>    <chr>    <chr>                    <dbl>     <dbl>     <dbl>
#> 1 fixed    <NA>     (Intercept)           251.          6.82     36.8 
#> 2 fixed    <NA>     Days                   10.5         1.55      6.77
#> 3 ran_pars Subject  sd__(Intercept)        24.7        NA        NA   
#> 4 ran_pars Subject  cor__(Intercept).Days   0.0656     NA        NA   
#> 5 ran_pars Subject  sd__Days                5.92       NA        NA   
#> 6 ran_pars Residual sd__Observation        25.6        NA        NA

# `parameters` output ---------------------------------
# (with further modifications to match `broom` conventions)

model_parameters(mod, effects = "all") %>%
  standardize_names(style = "broom") %>%
  as_tibble()
#> # A tibble: 6 x 11
#>   term                          estimate std.error conf.level conf.low conf.high
#>   <chr>                            <dbl>     <dbl>      <dbl>    <dbl>     <dbl>
#> 1 (Intercept)                   251.          6.82       0.95   238.       265. 
#> 2 Days                           10.5         1.55       0.95     7.42      13.5
#> 3 SD (Intercept)                 24.7        NA          0.95    NA         NA  
#> 4 SD (Days)                       5.92       NA          0.95    NA         NA  
#> 5 Cor (Intercept~Days: Subject)   0.0656     NA          0.95    NA         NA  
#> 6 SD (Observations)              25.6        NA          0.95    NA         NA  
#>   statistic df.error   p.value effect group     
#>       <dbl>    <int>     <dbl> <chr>  <chr>     
#> 1     36.8       174  4.37e-84 fixed  ""        
#> 2      6.77      174  1.88e-10 fixed  ""        
#> 3     NA          NA NA        random "Subject" 
#> 4     NA          NA NA        random "Subject" 
#> 5     NA          NA NA        random "Subject" 
#> 6     NA          NA NA        random "Residual"

^{Created on 2021-11-03 by the reprex package (v2.0.1)}
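
If those two differences matter downstream, they could be smoothed over with a small recoding step. A minimal sketch in base R, assuming a data frame with character `effect` and `group` columns like the one above; `to_broom_mixed_style` is a hypothetical helper, not a function from either package, and `params_out` is a hand-built excerpt of the output above:

```r
# Hypothetical helper: recode a parameters-style data frame so that its
# `effect` and `group` columns follow the broom.mixed conventions.
to_broom_mixed_style <- function(df) {
  # broom.mixed labels random-effect rows "ran_pars"; parameters uses "random".
  df$effect[df$effect %in% "random"] <- "ran_pars"
  # broom.mixed uses NA as the group of fixed effects; parameters uses "".
  df$group[df$group %in% ""] <- NA_character_
  df
}

# Hand-built excerpt of the parameters output shown above.
params_out <- data.frame(
  term = c("(Intercept)", "Days", "SD (Intercept)", "SD (Observations)"),
  effect = c("fixed", "fixed", "random", "random"),
  group = c("", "", "Subject", "Residual"),
  stringsAsFactors = FALSE
)

harmonized <- to_broom_mixed_style(params_out)
harmonized
```

The term labels (e.g. `sd__(Intercept)` vs `SD (Intercept)`) and the row order of the random-effect terms also differ between the two outputs above; this sketch leaves both untouched.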

datalorax · 2021-11-03T19:54:26Z

Thanks. Just to be clear, the parameters package handles the models that broom and broom.mixed handle, correct?

IndrajeetPatil · 2021-11-03T19:57:23Z

Yes, you can see the list of supported models using this function:

insight::supported_models()
#>   [1] "aareg"             "afex_aov"          "AKP"              
#>   [4] "Anova.mlm"         "aov"               "aovlist"          
#>   [7] "Arima"             "averaging"         "bamlss"           
#>  [10] "bamlss.frame"      "bayesQR"           "bayesx"           
#>  [13] "BBmm"              "BBreg"             "bcplm"            
#>  [16] "betamfx"           "betaor"            "betareg"          
#>  [19] "BFBayesFactor"     "bfsl"              "BGGM"             
#>  [22] "bife"              "bifeAPEs"          "bigglm"           
#>  [25] "biglm"             "blavaan"           "blrm"             
#>  [28] "bracl"             "brglm"             "brmsfit"          
#>  [31] "brmultinom"        "btergm"            "censReg"          
#>  [34] "cgam"              "cgamm"             "cglm"             
#>  [37] "clm"               "clm2"              "clmm"             
#>  [40] "clmm2"             "clogit"            "coeftest"         
#>  [43] "complmrob"         "confusionMatrix"   "coxme"            
#>  [46] "coxph"             "coxph.penal"       "coxr"             
#>  [49] "cpglm"             "cpglmm"            "crch"             
#>  [52] "crq"               "crqs"              "crr"              
#>  [55] "dep.effect"        "DirichletRegModel" "drc"              
#>  [58] "eglm"              "elm"               "epi.2by2"         
#>  [61] "ergm"              "feglm"             "feis"             
#>  [64] "felm"              "fitdistr"          "fixest"           
#>  [67] "flexsurvreg"       "gam"               "Gam"              
#>  [70] "gamlss"            "gamm"              "gamm4"            
#>  [73] "garch"             "gbm"               "gee"              
#>  [76] "geeglm"            "glht"              "glimML"           
#>  [79] "glm"               "Glm"               "glmm"             
#>  [82] "glmmadmb"          "glmmPQL"           "glmmTMB"          
#>  [85] "glmrob"            "glmRob"            "glmx"             
#>  [88] "gls"               "gmnl"              "HLfit"            
#>  [91] "htest"             "hurdle"            "iv_robust"        
#>  [94] "ivFixed"           "ivprobit"          "ivreg"            
#>  [97] "lavaan"            "lm"                "lm_robust"        
#> [100] "lme"               "lmerMod"           "lmerModLmerTest"  
#> [103] "lmodel2"           "lmrob"             "lmRob"            
#> [106] "logistf"           "logitmfx"          "logitor"          
#> [109] "LORgee"            "lqm"               "lqmm"             
#> [112] "lrm"               "manova"            "MANOVA"           
#> [115] "margins"           "maxLik"            "mclogit"          
#> [118] "mcmc"              "mcmc.list"         "MCMCglmm"         
#> [121] "mcp1"              "mcp12"             "mcp2"             
#> [124] "med1way"           "mediate"           "merMod"           
#> [127] "merModList"        "meta_bma"          "meta_fixed"       
#> [130] "meta_random"       "metaplus"          "mhurdle"          
#> [133] "mipo"              "mira"              "mixed"            
#> [136] "MixMod"            "mixor"             "mjoint"           
#> [139] "mle"               "mle2"              "mlm"              
#> [142] "mlogit"            "mmlogit"           "model_fit"        
#> [145] "multinom"          "mvord"             "negbinirr"        
#> [148] "negbinmfx"         "ols"               "onesampb"         
#> [151] "orm"               "pgmm"              "plm"              
#> [154] "PMCMR"             "poissonirr"        "poissonmfx"       
#> [157] "polr"              "probitmfx"         "psm"              
#> [160] "Rchoice"           "ridgelm"           "riskRegression"   
#> [163] "rjags"             "rlm"               "rlmerMod"         
#> [166] "RM"                "rma"               "rma.uni"          
#> [169] "robmixglm"         "robtab"            "rq"               
#> [172] "rqs"               "rqss"              "Sarlm"            
#> [175] "scam"              "selection"         "sem"              
#> [178] "SemiParBIV"        "semLm"             "semLme"           
#> [181] "slm"               "speedglm"          "speedlm"          
#> [184] "stanfit"           "stanmvreg"         "stanreg"          
#> [187] "summary.lm"        "survfit"           "survreg"          
#> [190] "svy_vglm"          "svyglm"            "svyolr"           
#> [193] "t1way"             "tobit"             "trimcibt"         
#> [196] "truncreg"          "vgam"              "vglm"             
#> [199] "wbgee"             "wblm"              "wbm"              
#> [202] "wmcpAKP"           "yuen"              "yuend"            
#> [205] "zcpglm"            "zeroinfl"          "zerotrunc"

^{Created on 2021-11-03 by the reprex package (v2.0.1)}
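
Rather than scanning the list by eye, the class of a fitted model can be checked against it. A minimal sketch, assuming the character vector returned by `insight::supported_models()`; `is_class_supported` is a hypothetical helper, and the sample vector holds only a few names copied from the list above so that the example runs without insight installed:

```r
# Hypothetical helper: does a fitted model inherit from any class in a
# character vector of supported class names?
is_class_supported <- function(model, supported) {
  inherits(model, supported)
}

# A few class names copied from the list above.
supported_sample <- c("glm", "lm", "lmerMod", "merMod")

fit <- lm(mpg ~ wt, data = mtcars)
lm_ok <- is_class_supported(fit, supported_sample)
lm_ok

# With insight installed, the full list could be passed instead:
# is_class_supported(fit, insight::supported_models())
```

Being listed only means the class is recognized; whether the output matches what broom or broom.mixed would return for that class still has to be checked model by model.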

datalorax · 2021-11-03T20:09:43Z

Thanks, I'll play around with this in a bit.

IndrajeetPatil · 2021-11-03T20:18:08Z

Cool!

The documentation can be found here: https://easystats.github.io/parameters/

strengejacke · 2021-11-03T20:35:34Z

> group column strings are surrounded in ""

Only in the printed output. That's because parameters uses an empty string in "group" for fixed effects, while broom.mixed uses NA. For character columns that contain empty strings, tibble prints the values surrounded by double quotes.

McCartneyAC · 2023-01-17T22:16:29Z

I would find this helpful--easystats is quickly becoming a huge part of my workflow and it would open up a huge number of classes to switch to {parameters} instead.

IndrajeetPatil closed this as completed Feb 18, 2021

datalorax reopened this Nov 3, 2021

IndrajeetPatil changed the title ~~replacing broom tidiers with parameters to reduce no. of dependencies~~ replacing {broom} and {broom.mixed} tidiers with {parameters} package to reduce no. of dependencies Nov 3, 2021

datalorax mentioned this issue Aug 23, 2024: Would it make sense to work directly with Tidymodel objects of class model_fit #236 (Open)
