Unexpected error with flatline_forecaster() w/ quantile_by_key; unexpected "successes" w/ invalid cols #229

Open
Tracked by #318
brookslogan opened this issue Aug 29, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@brookslogan
Contributor

brookslogan commented Aug 29, 2023

  • quantile_by_key = "geo_value" seems to work with arx_forecaster(), but generates an error with flatline_forecaster(), and is accepted by arx_forecaster() with quantile_reg() despite not being applicable
  • quantile_by_key = "nonexistent_column" is accepted by both
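
The second bullet suggests that an unknown key should probably be rejected up front rather than silently accepted. A minimal sketch of such a check, in base R only: `check_quantile_by_key()` and the `cols` vector are hypothetical names made up for illustration, not part of epipredict, and the real fix may look quite different.

```r
# Hypothetical helper (NOT part of epipredict): fail fast when a requested
# grouping key is not among the columns that are actually available.
check_quantile_by_key <- function(quantile_by_key, available_cols) {
  stopifnot(is.character(quantile_by_key), is.character(available_cols))
  missing_keys <- setdiff(quantile_by_key, available_cols)
  if (length(missing_keys) > 0) {
    stop(
      "quantile_by_key contains column(s) not found in the data: ",
      paste(missing_keys, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(quantile_by_key)
}

# Toy column names standing in for the key columns of an epi_df (assumption).
cols <- c("geo_value", "time_value")

# A valid key passes silently.
check_quantile_by_key("geo_value", cols)

# An invalid key raises an error; capture its message instead of aborting.
res <- tryCatch(
  check_quantile_by_key("nonexistent_column", cols),
  error = function(e) conditionMessage(e)
)
print(res)
```

Whether this should be an error or a warning is a design choice for the maintainers; the sketch only shows that the check itself is cheap.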
library(epipredict)
#> Loading required package: epiprocess
#> 
#> Attaching package: 'epiprocess'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> Loading required package: parsnip
# trying residuals by geo_value results in error:
fcst1 <- flatline_forecaster(
  case_death_rate_subset, "death_rate",
  flatline_args_list(quantile_by_key = "geo_value")
)
#> New names:
#> • `geo_value` -> `geo_value...1`
#> • `geo_value` -> `geo_value...2`
#> Error in `dplyr::group_by()`:
#> ! Must group by variables found in `.data`.
#> ✖ Column `geo_value` is not found.
#> Backtrace:
#>      ▆
#>   1. ├─epipredict::flatline_forecaster(...)
#>   2. │ ├─... %>% dplyr::select(-time_value) at cmu-delphi-epipredict-2dd9e70/R/flatline_forecaster.R:73:2
#>   3. │ ├─base::suppressWarnings(predict(wf, new_data = latest))
#>   4. │ │ └─base::withCallingHandlers(...)
#>   5. │ ├─stats::predict(wf, new_data = latest)
#>   6. │ └─epipredict:::predict.epi_workflow(wf, new_data = latest)
#>   7. │   ├─epipredict::apply_frosting(object, components, new_data, ...) at cmu-delphi-epipredict-2dd9e70/R/epi_workflow.R:163:2
#>   8. │   └─epipredict:::apply_frosting.epi_workflow(...) at cmu-delphi-epipredict-2dd9e70/R/frosting.R:209:2
#>   9. │     ├─epipredict::slather(la, components, workflow, new_data) at cmu-delphi-epipredict-2dd9e70/R/frosting.R:265:6
#>  10. │     └─epipredict:::slather.layer_residual_quantiles(...) at cmu-delphi-epipredict-2dd9e70/R/layers.R:135:2
#>  11. │       └─dplyr::bind_cols(key_cols, r) %>% ... at cmu-delphi-epipredict-2dd9e70/R/layer_residual_quantiles.R:107:8
#>  12. ├─dplyr::select(., -time_value)
#>  13. ├─tibble::as_tibble(.)
#>  14. ├─dplyr::group_by(., !!!rlang::syms(common))
#>  15. └─dplyr:::group_by.data.frame(., !!!rlang::syms(common))
#>  16.   └─dplyr::group_by_prepare(.data, ..., .add = .add, error_call = current_env())
#>  17.     └─rlang::abort(bullets, call = error_call)
# invalid cols are accepted/ignored:
fcst2 <- flatline_forecaster(
  case_death_rate_subset, "death_rate",
  flatline_args_list(quantile_by_key = "nonexistent_column")
)
# quantile_reg + quantile_by_key is likely nonsensical, but accepted:
fcst3 <- arx_forecaster(
  case_death_rate_subset, "death_rate", c("death_rate"),
  trainer = quantile_reg(),
  args_list = arx_args_list(quantile_by_key = "geo_value")
)
#> Warning: The forecast_date is less than the most recent update date of the
#> data: forecast_date = 2021-12-31 while data is from 2022-05-31.
fcst4 <- arx_forecaster(
  case_death_rate_subset, "death_rate", c("death_rate"),
  trainer = quantile_reg(),
  args_list = arx_args_list(quantile_by_key = "nonexistent_column")
)
#> Warning: The forecast_date is less than the most recent update date of the
#> data: forecast_date = 2021-12-31 while data is from 2022-05-31.
# This successfully completes:
fcst5 <- arx_forecaster(
  case_death_rate_subset, "death_rate", c("death_rate"),
  args_list = arx_args_list(quantile_by_key = "geo_value")
)
#> Warning: Some grouping keys are not in data.frame returned by the
#> The forecast_date is less than the most recent update date of the data: forecast_date = 2021-12-31 while data is from 2022-05-31.
# But so does this:
fcst6 <- arx_forecaster(
  case_death_rate_subset, "death_rate", c("death_rate"),
  args_list = arx_args_list(quantile_by_key = "nonexistent_column")
)
#> Warning: Requested residual grouping key(s) {excess} are unavailable 
#> The forecast_date is less than the most recent update date of the data: forecast_date = 2021-12-31 while data is from 2022-05-31.

Created on 2023-08-29 with reprex v2.0.2
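
The "New names" message followed by the `group_by()` failure in the first example looks consistent with `dplyr::bind_cols()` repairing a duplicated `geo_value` column before the grouping step. The sketch below reproduces that mechanism on toy data. That `key_cols` and `r` both carry `geo_value` inside `slather.layer_residual_quantiles()` is an assumption read off the backtrace, not something confirmed against the package source, and the `.resid` column name is made up.

```r
library(dplyr)

# Toy stand-ins (assumption): both inputs contain a `geo_value` column.
key_cols <- tibble(geo_value = c("ca", "ny"))
r <- tibble(geo_value = c("ca", "ny"), .resid = c(0.1, -0.2))

# bind_cols() makes duplicated names unique by appending positions,
# emitting the same "New names:" message seen in the reprex.
combined <- suppressMessages(bind_cols(key_cols, r))
names(combined)
#> [1] "geo_value...1" "geo_value...2" ".resid"

# Grouping by the original name now fails because it no longer exists.
grouped <- tryCatch(
  group_by(combined, geo_value),
  error = function(e) "group_by failed"
)
grouped
#> [1] "group_by failed"

# One possible way to avoid the clash: drop overlapping columns before binding.
deduped <- bind_cols(key_cols, select(r, -any_of(names(key_cols))))
n_groups(group_by(deduped, geo_value))
#> [1] 2
```

Dropping the overlapping columns is only one option; joining on the key columns instead of binding would be another.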

@brookslogan brookslogan added the bug Something isn't working label Aug 29, 2023
@brookslogan
Contributor Author

I think my usage of quantile_by_key coupled with quantile_reg() is probably nonsensical. Is there a way to do grouped modeling within workflows, or does the grouping need to be done externally?
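
For reference, grouping done externally could look roughly like the sketch below: split the data by key and fit one model per group. It uses base R with `lm()` on made-up toy data purely to show the pattern. It does not use epipredict or `quantile_reg()`, and it says nothing about whether workflows support this internally.

```r
# Toy data (made up): two locations with different linear trends.
set.seed(1)
toy <- data.frame(
  geo_value = rep(c("ca", "ny"), each = 20),
  x = rep(1:20, times = 2)
)
toy$y <- ifelse(toy$geo_value == "ca", 2 * toy$x, -1 * toy$x) +
  rnorm(40, sd = 0.1)

# External grouping: split by the key, then fit one model per group.
fits <- lapply(split(toy, toy$geo_value), function(d) lm(y ~ x, data = d))

# Per-group slopes; these should land close to 2 for "ca" and -1 for "ny".
slopes <- vapply(fits, function(f) coef(f)[["x"]], numeric(1))
print(slopes)
```

The same split/apply pattern would in principle work with any per-group fitting function in place of `lm()`.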

@brookslogan
Contributor Author

Rounded out the tests above by comparing arx_forecaster() with quantile_reg() against arx_forecaster() with the default trainer.
