Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Help needed with table with one model per row #207

Open
ale275 opened this issue Dec 16, 2021 · 5 comments
Open

Help needed with table with one model per row #207

ale275 opened this issue Dec 16, 2021 · 5 comments

Comments

@ale275
Copy link

ale275 commented Dec 16, 2021

Hi

I successuflly used this package to make report for a single model run but now I am facing anissue with my current implementation that comprise a table with a different model on each row.

I have to apply different models to different defect in one shot, my approach was to create a table that we will call model.summary, I left join it to data.meas.long that is the table that contains the value for each defect creating a dataset with both formulas and data for the model.

def.models <- data.meas.long %>% 
    left_join(model.summary, by = c('defect_id')) %>% 
    group_by( defect_id) %>%
    summarise(
      model = list(lm(as.formula(unique(model_formula)), data = across()))
    ) %>% 
    ungroup() %>% 
    mutate(has_model = 1)

With this approach I can correctly compute the model for each defect_id, each model is nested inside model column of class list.
What I am trying to do is to use extract_eq in combination with mutate to try to create a table to be rendered to HTML via RMarkdown + kable() to generate a mail report for the model run.

I tried both the approaches below but non seems to work correctly.

def.models %>% 
  group_by(defect_id) %>% 
  mutate(
    lat = equatiomatic::extract_eq(across()$model[[1]])
  ) %>% 
  ungroup() %>% 
  kable()
def.models %>% 
  rowwise() %>%  
  mutate(lat = equatiomatic::extract_eq(model, use_coefs = F)) %>% 
  kable()

The rendered table contains for each row all the formula for all the other rows

image

I don't think there is nothing wrong with the library but more in the way I am using it. Does anyone have any idea?

Session info

sessioninfo::session_info()
─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                                      
 version  R version 4.1.1 (2021-08-10)               
 os       Red Hat Enterprise Linux Server 7.9 (Maipo)
 system   x86_64, linux-gnu                          
 ui       RStudio                                    
 language (EN)                                       
 collate  en_US.UTF-8                                
 ctype    en_US.UTF-8                                
 tz       Europe/Rome                                
 date     2021-12-16                                 

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package      * version    date       lib source                                 
 backports      1.2.1      2020-12-09 [1] CRAN (R 4.1.1)                         
 blastula     * 0.3.2      2020-05-19 [1] CRAN (R 4.1.1)                         
 broom          0.7.9      2021-07-27 [1] CRAN (R 4.1.1)                         
 bslib          0.3.0      2021-09-02 [1] CRAN (R 4.1.1)                         
 cli            3.0.1      2021-07-17 [1] CRAN (R 4.1.1)                         
 colorspace     2.0-2      2021-06-24 [1] CRAN (R 4.1.1)                         
 crayon         1.4.1      2021-02-08 [1] CRAN (R 4.1.1)                         
 curl           4.3.2      2021-06-23 [1] CRAN (R 4.1.1)                         
 digest         0.6.27     2020-10-24 [1] CRAN (R 4.1.1)                         
 dplyr        * 1.0.7      2021-06-18 [1] CRAN (R 4.1.1)                         
 ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.1.1)                         
 equatiomatic   0.3.0.9000 2021-12-16 [1] Github (datalorax/equatiomatic@e98e75c)
 evaluate       0.14       2019-05-28 [1] CRAN (R 4.1.1)                         
 fansi          0.5.0      2021-05-25 [1] CRAN (R 4.1.1)                         
 fastmap        1.1.0      2021-01-25 [1] CRAN (R 4.1.1)                         
 generics       0.1.0      2020-10-31 [1] CRAN (R 4.1.1)                         
 getopt         1.20.3     2019-03-22 [1] CRAN (R 4.1.1)                         
 glue           1.4.2      2020-08-27 [1] CRAN (R 4.1.1)                         
 highr          0.9        2021-04-16 [1] CRAN (R 4.1.1)                         
 htmltools      0.5.2      2021-08-25 [1] CRAN (R 4.1.1)                         
 httpuv         1.6.2      2021-08-18 [1] CRAN (R 4.1.1)                         
 httr         * 1.4.2      2020-07-20 [1] CRAN (R 4.1.1)                         
 jquerylib      0.1.4      2021-04-26 [1] CRAN (R 4.1.1)                         
 jsonlite     * 1.7.2      2020-12-09 [1] CRAN (R 4.1.1)                         
 kableExtra   * 1.3.4      2021-02-20 [1] CRAN (R 4.1.1)                         
 knitr          1.34       2021-09-09 [1] CRAN (R 4.1.1)                         
 later          1.3.0      2021-08-18 [1] CRAN (R 4.1.1)                         
 lifecycle      1.0.0      2021-02-15 [1] CRAN (R 4.1.1)                         
 log4r        * 0.3.2      2020-01-18 [1] CRAN (R 4.1.1)                         
 lubridate    * 1.7.10     2021-02-26 [1] CRAN (R 4.1.1)                         
 magrittr       2.0.1      2020-11-17 [1] CRAN (R 4.1.1)                         
 mailR        * 0.4.1      2015-01-14 [1] CRAN (R 4.1.1)                         
 mime           0.11       2021-06-23 [1] CRAN (R 4.1.1)                         
 modelr       * 0.1.8      2020-05-19 [1] CRAN (R 4.1.1)                         
 munsell        0.5.0      2018-06-12 [1] CRAN (R 4.1.1)                         
 optparse     * 1.6.6      2020-04-16 [1] CRAN (R 4.1.1)                         
 pillar         1.6.2      2021-07-29 [1] CRAN (R 4.1.1)                         
 pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.1.1)                         
 promises       1.2.0.1    2021-02-11 [1] CRAN (R 4.1.1)                         
 purrr          0.3.4      2020-04-17 [1] CRAN (R 4.1.1)                         
 R.methodsS3    1.8.1      2020-08-26 [1] CRAN (R 4.1.1)                         
 R.oo           1.24.0     2020-08-26 [1] CRAN (R 4.1.1)                         
 R.utils        2.10.1     2020-08-26 [1] CRAN (R 4.1.1)                         
 R6             2.5.1      2021-08-19 [1] CRAN (R 4.1.1)                         
 Rcpp           1.0.7      2021-07-07 [1] CRAN (R 4.1.1)                         
 rJava          1.0-4      2021-04-29 [1] CRAN (R 4.1.1)                         
 rlang          0.4.11     2021-04-30 [1] CRAN (R 4.1.1)                         
 rmarkdown      2.10       2021-08-06 [1] CRAN (R 4.1.1)                         
 rstudioapi     0.13       2020-11-12 [1] CRAN (R 4.1.1)                         
 rvest          1.0.1      2021-07-26 [1] CRAN (R 4.1.1)                         
 sass           0.4.0      2021-05-12 [1] CRAN (R 4.1.1)                         
 scales         1.1.1      2020-05-11 [1] CRAN (R 4.1.1)                         
 sessioninfo    1.1.1      2018-11-05 [1] CRAN (R 4.1.1)                         
 shiny          1.6.0      2021-01-25 [1] CRAN (R 4.1.1)                         
 stringi      * 1.7.4      2021-08-25 [1] CRAN (R 4.1.1)                         
 stringr      * 1.4.0      2019-02-10 [1] CRAN (R 4.1.1)                         
 svglite        2.0.0      2021-02-20 [1] CRAN (R 4.1.1)                         
 systemfonts    1.0.2      2021-05-11 [1] CRAN (R 4.1.1)                         
 tibble         3.1.4      2021-08-25 [1] CRAN (R 4.1.1)                         
 tictoc       * 1.0.1      2021-04-19 [1] CRAN (R 4.1.1)                         
 tidyr        * 1.1.3      2021-03-03 [1] CRAN (R 4.1.1)                         
 tidyselect     1.1.1      2021-04-30 [1] CRAN (R 4.1.1)                         
 utf8           1.2.2      2021-07-24 [1] CRAN (R 4.1.1)                         
 vctrs          0.3.8      2021-04-29 [1] CRAN (R 4.1.1)                         
 viridisLite    0.4.0      2021-04-13 [1] CRAN (R 4.1.1)                         
 webshot        0.5.2      2019-11-22 [1] CRAN (R 4.1.1)                         
 withr          2.4.2      2021-04-18 [1] CRAN (R 4.1.1)                         
 xfun           0.25       2021-08-06 [1] CRAN (R 4.1.1)                         
 xml2           1.3.2      2020-04-23 [1] CRAN (R 4.1.1)                         
 xtable         1.8-4      2019-04-21 [1] CRAN (R 4.1.1)                         
 yaml           2.2.1      2020-02-01 [1] CRAN (R 4.1.1)                         

thank you in advance
Alessandro

@datalorax
Copy link
Owner

Hi Alessandro, can you please post a reproducible example?

@ale275
Copy link
Author

ale275 commented Dec 17, 2021

Sure thing

here it is. Strangely now the behavior is different: all the formulas are in the first row only and the other are set to NA. It is coherent between test script and real production one.

library(dplyr)
library(rmarkdown)
library(knitr)
library(modelr)


model.summary <- data.frame(
  lot           = c('Lot1', 'Lot1', 'Lot2'),
  defect_id     = c('Def1', 'Def2', 'Def3'),
  model_formula = c('value ~ I(interp_along)', 'value ~ I(interp_along)', 'value ~ I(interp_along + 10)')
)

data.meas.long <- data.frame(
  lot          = rep(c('Lot1', 'Lot1', 'Lot2'), each = 2),
  defect_id    = rep(c('Def1', 'Def2', 'Def3'), each = 2),
  interp_along = c(1, 10, 20, 30, 15, 17),
  value        = c(4,  7,  5,  2, 7 ,8)
)


def.models <- data.meas.long %>% 
  left_join(model.summary, by = c('lot', 'defect_id')) %>% 
  filter(is.na(value) == F) %>% 
  group_by(lot, defect_id) %>%
  summarise(
    model = list(lm(as.formula(as.character(model_formula)), data = across()))
  ) %>% 
  ungroup() %>% 
  mutate(has_model = 1)

rmd <- paste(
  "---",
  "title: \"&nbsp;\"",
  "mainfont: Arial",
  "output: ",
  "  html_document: ",
  "    self_contained: false",
  "    anchor_sections: false",
  "---",
  "  ",
  "```{r setup, include=FALSE}",
  "knitr::opts_chunk$set(echo = FALSE)",
  "options(knitr.kable.NA = '')",
  "library(dplyr)",
  "library(knitr)",
  "library(equatiomatic)",
  "library(kableExtra)",
  "```",
  "",
  "Some text",
  "",
  "```{r test}",
  "",
  "def.models %>% ",
  "  group_by(defect_id) %>% ",
  "  mutate(",
  "    lat = extract_eq(across()$model[[1]])",
  "  ) %>% ",
  "  ungroup() %>% ",
  "  select(-model) %>%  ",
  "  kable() %>% ",
  "  kable_styling(bootstrap_options = 'striped', full_width = T)",
  "",
  "```",
  "",
  "Some other text",
  "",
  "```{r test2}",
  "",
  "def.models %>% ",
  "  rowwise() %>%  ",
  "  mutate(lat = equatiomatic::extract_eq(model, use_coefs = F)) %>% ",
  "  select(-model) %>%  ",
  "  kable() %>% ",
  "  kable_styling(bootstrap_options = 'striped', full_width = T)",
  "",
  "```",
  "",
  sep = "\n"
)

# Generate report
writeLines(rmd, con = './test.Rmd')
out.rep.file <- './out_test.html'
rmarkdown::render('./test.Rmd', output_file = out.rep.file)
Session info

─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                                      
 version  R version 4.1.1 (2021-08-10)               
 os       Red Hat Enterprise Linux Server 7.9 (Maipo)
 system   x86_64, linux-gnu                          
 ui       RStudio                                    
 language (EN)                                       
 collate  en_US.UTF-8                                
 ctype    en_US.UTF-8                                
 tz       Europe/Rome                                
 date     2021-12-17                                 

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package      * version    date       lib source                                 
 backports      1.2.1      2020-12-09 [1] CRAN (R 4.1.1)                         
 blastula     * 0.3.2      2020-05-19 [1] CRAN (R 4.1.1)                         
 broom          0.7.9      2021-07-27 [1] CRAN (R 4.1.1)                         
 bslib          0.3.0      2021-09-02 [1] CRAN (R 4.1.1)                         
 cli            3.0.1      2021-07-17 [1] CRAN (R 4.1.1)                         
 colorspace     2.0-2      2021-06-24 [1] CRAN (R 4.1.1)                         
 crayon         1.4.1      2021-02-08 [1] CRAN (R 4.1.1)                         
 curl           4.3.2      2021-06-23 [1] CRAN (R 4.1.1)                         
 digest         0.6.27     2020-10-24 [1] CRAN (R 4.1.1)                         
 dplyr        * 1.0.7      2021-06-18 [1] CRAN (R 4.1.1)                         
 ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.1.1)                         
 equatiomatic * 0.3.0.9000 2021-12-16 [1] Github (datalorax/equatiomatic@e98e75c)
 evaluate       0.14       2019-05-28 [1] CRAN (R 4.1.1)                         
 fansi          0.5.0      2021-05-25 [1] CRAN (R 4.1.1)                         
 fastmap        1.1.0      2021-01-25 [1] CRAN (R 4.1.1)                         
 generics       0.1.0      2020-10-31 [1] CRAN (R 4.1.1)                         
 getopt         1.20.3     2019-03-22 [1] CRAN (R 4.1.1)                         
 glue           1.4.2      2020-08-27 [1] CRAN (R 4.1.1)                         
 highr          0.9        2021-04-16 [1] CRAN (R 4.1.1)                         
 htmltools      0.5.2      2021-08-25 [1] CRAN (R 4.1.1)                         
 httpuv         1.6.2      2021-08-18 [1] CRAN (R 4.1.1)                         
 httr         * 1.4.2      2020-07-20 [1] CRAN (R 4.1.1)                         
 jquerylib      0.1.4      2021-04-26 [1] CRAN (R 4.1.1)                         
 jsonlite     * 1.7.2      2020-12-09 [1] CRAN (R 4.1.1)                         
 kableExtra   * 1.3.4      2021-02-20 [1] CRAN (R 4.1.1)                         
 knitr        * 1.34       2021-09-09 [1] CRAN (R 4.1.1)                         
 later          1.3.0      2021-08-18 [1] CRAN (R 4.1.1)                         
 lifecycle      1.0.0      2021-02-15 [1] CRAN (R 4.1.1)                         
 log4r        * 0.3.2      2020-01-18 [1] CRAN (R 4.1.1)                         
 lubridate    * 1.7.10     2021-02-26 [1] CRAN (R 4.1.1)                         
 magrittr       2.0.1      2020-11-17 [1] CRAN (R 4.1.1)                         
 mailR        * 0.4.1      2015-01-14 [1] CRAN (R 4.1.1)                         
 mime           0.11       2021-06-23 [1] CRAN (R 4.1.1)                         
 modelr       * 0.1.8      2020-05-19 [1] CRAN (R 4.1.1)                         
 munsell        0.5.0      2018-06-12 [1] CRAN (R 4.1.1)                         
 optparse     * 1.6.6      2020-04-16 [1] CRAN (R 4.1.1)                         
 pillar         1.6.2      2021-07-29 [1] CRAN (R 4.1.1)                         
 pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.1.1)                         
 promises       1.2.0.1    2021-02-11 [1] CRAN (R 4.1.1)                         
 purrr          0.3.4      2020-04-17 [1] CRAN (R 4.1.1)                         
 R.methodsS3    1.8.1      2020-08-26 [1] CRAN (R 4.1.1)                         
 R.oo           1.24.0     2020-08-26 [1] CRAN (R 4.1.1)                         
 R.utils        2.10.1     2020-08-26 [1] CRAN (R 4.1.1)                         
 R6             2.5.1      2021-08-19 [1] CRAN (R 4.1.1)                         
 Rcpp           1.0.7      2021-07-07 [1] CRAN (R 4.1.1)                         
 rJava          1.0-4      2021-04-29 [1] CRAN (R 4.1.1)                         
 rlang          0.4.11     2021-04-30 [1] CRAN (R 4.1.1)                         
 rmarkdown    * 2.10       2021-08-06 [1] CRAN (R 4.1.1)                         
 rstudioapi     0.13       2020-11-12 [1] CRAN (R 4.1.1)                         
 rvest          1.0.1      2021-07-26 [1] CRAN (R 4.1.1)                         
 sass           0.4.0      2021-05-12 [1] CRAN (R 4.1.1)                         
 scales         1.1.1      2020-05-11 [1] CRAN (R 4.1.1)                         
 sessioninfo    1.1.1      2018-11-05 [1] CRAN (R 4.1.1)                         
 shiny          1.6.0      2021-01-25 [1] CRAN (R 4.1.1)                         
 stringi      * 1.7.4      2021-08-25 [1] CRAN (R 4.1.1)                         
 stringr      * 1.4.0      2019-02-10 [1] CRAN (R 4.1.1)                         
 svglite        2.0.0      2021-02-20 [1] CRAN (R 4.1.1)                         
 systemfonts    1.0.2      2021-05-11 [1] CRAN (R 4.1.1)                         
 tibble         3.1.4      2021-08-25 [1] CRAN (R 4.1.1)                         
 tictoc       * 1.0.1      2021-04-19 [1] CRAN (R 4.1.1)                         
 tidyr        * 1.1.3      2021-03-03 [1] CRAN (R 4.1.1)                         
 tidyselect     1.1.1      2021-04-30 [1] CRAN (R 4.1.1)                         
 utf8           1.2.2      2021-07-24 [1] CRAN (R 4.1.1)                         
 vctrs          0.3.8      2021-04-29 [1] CRAN (R 4.1.1)                         
 viridisLite    0.4.0      2021-04-13 [1] CRAN (R 4.1.1)                         
 webshot        0.5.2      2019-11-22 [1] CRAN (R 4.1.1)                         
 withr          2.4.2      2021-04-18 [1] CRAN (R 4.1.1)                         
 xfun           0.25       2021-08-06 [1] CRAN (R 4.1.1)                         
 xml2           1.3.2      2020-04-23 [1] CRAN (R 4.1.1)                         
 xtable         1.8-4      2019-04-21 [1] CRAN (R 4.1.1)                         
 yaml           2.2.1      2020-02-01 [1] CRAN (R 4.1.1) 

@ale275
Copy link
Author

ale275 commented Feb 12, 2022

I updated the above example to make it work.

@ale275
Copy link
Author

ale275 commented Feb 12, 2022

I have done further digging into the issue, the problem seems to be when the tibble is converted to data.frame inside Kable function

So I started by creating a data.frame

ibrary(equatiomatic)
library(dplyr)
library(rmarkdown)
library(knitr)
library(modelr)

model.summary <- data.frame(
  lot           = c('Lot1', 'Lot1', 'Lot2'),
  defect_id     = c('Def1', 'Def2', 'Def3'),
  model_formula = c('value ~ I(interp_along)', 'value ~ I(interp_along)', 'value ~ I(interp_along + 10)')
)

data.meas.long <- data.frame(
  lot          = rep(c('Lot1', 'Lot1', 'Lot2'), each = 2),
  defect_id    = rep(c('Def1', 'Def2', 'Def3'), each = 2),
  interp_along = c(1, 10, 20, 30, 15, 17),
  value        = c(4,  7,  5,  2, 7 ,8)
)


def.models <- data.meas.long %>% 
  left_join(model.summary, by = c('lot', 'defect_id')) %>% 
  filter(is.na(value) == F) %>% 
  group_by(lot, defect_id) %>%
  summarise(
    model = list(lm(as.formula(as.character(model_formula)), data = across()))
  ) %>% 
  ungroup() %>% 
  mutate(has_model = 1)

equation.list <- apply(
  def.models,
  1,
  function(x) {
    extract_eq(x[["model"]])
  }
)

data.df <- data.frame(
  lot = def.models$lot,
  model.formula = unlist(equation.list),
  stringsAsFactors = F
)
> data.df
   lot                                                                                     model.formula
1 Lot1          \\operatorname{value} = \\alpha + \\beta_{1}(\\operatorname{interp\\_along}) + \\epsilon
2 Lot1          \\operatorname{value} = \\alpha + \\beta_{1}(\\operatorname{interp\\_along}) + \\epsilon
3 Lot2 \\operatorname{value} = \\alpha + \\beta_{1}(\\operatorname{interp\\_along\\ +\\ 10}) + \\epsilon
> class(data.df)
[1] "data.frame"
> class(data.df$model.formula)
[1] "character"

Calling Kable like this will fail the render since the class of the column is wrong, when i try to set the class to formula or formula, character is when the mess happens

class(data.df$model.formula) <- c("equation", "character")
> data.df
   lot
1 Lot1
2 Lot1
3 Lot2
                                                                                                                                                                                                                                                                                model.formula
1 $$\n\\operatorname{value} = \\alpha + \\beta_{1}(\\operatorname{interp\\_along}) + \\epsilon\\operatorname{value} = \\alpha + \\beta_{1}(\\operatorname{interp\\_along}) + \\epsilon\\operatorname{value} = \\alpha + \\beta_{1}(\\operatorname{interp\\_along\\ +\\ 10}) + \\epsilon\n$$\n
2                                                                                                                                                                                                                                                                                        <NA>
3                                                                                                                                                                                                                                                                                        <NA>
Warning message:
In format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,  :
  corrupt data frame: columns will be truncated or padded with NAs

If data.frame is composed by a single row, everything works as expected. My guess is that some kind of extension to as.data.frame for class formula must be done. Is my hunch correct?

@datalorax
Copy link
Owner

Thanks, I still haven't had a chance to look at this at all, but I appreciate the updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants