291 date period #297

Merged: 26 commits into dev, Mar 27, 2024

Conversation

Contributor
@dajmcdon dajmcdon commented Mar 18, 2024

Checklist

Please:

  • Make sure this PR is against "dev", not "main".
  • Request a review from one of the current epipredict main reviewers:
    dajmcdon.
  • Make sure to bump the version number in DESCRIPTION and NEWS.md.
    Always increment the patch version number (the third number), unless you are
    making a release PR from dev to main, in which case increment the minor
    version number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    0.7.2, then write your changes under the 0.8 heading).

Change explanations for reviewer

This should allow for weekly and annual forecasts.

  • Ahead/lag always operate on the time_type of the epi_df, but forecast_date / target_date were always dates. These can clash in non-daily forecast tasks.
  • We now validate these against the time_type of the epi_df. This should now work naturally. There's a test case for annual forecasts in the panel-data vignette in Panel Data Vignette (Issue 99) #115.
  • While here, I fixed a few minor things:
    • prefix head and tail with utils:: to avoid R CMD check notes
    • switch a few rlang::abort() calls to cli::cli_abort()
    • layer_population_scaling() was producing messages because the by argument could be empty
    • note the tolower() processing that is commented out
    • lag/ahead are silently coerced to integer
    • in some tests, prep() needed to be given the training data
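A rough sketch of the validation idea described above (the function name, the switch cases, and the error message are illustrative assumptions, not epipredict's actual API):

```r
# Illustrative sketch only; epipredict's real validation differs in detail.
# Check that a forecast_date is compatible with the epi_df's time_type,
# then compute target_date from an integer ahead in time_type units.
check_date_for_time_type <- function(date, time_type) {
  ok <- switch(time_type,
    day = ,
    week = inherits(date, "Date"),
    year = is.numeric(date) && all(date == round(date)),
    FALSE
  )
  if (!ok) {
    stop(sprintf("forecast_date is incompatible with time_type '%s'.", time_type))
  }
  invisible(date)
}

forecast_date <- as.Date("2024-03-18")
check_date_for_time_type(forecast_date, "day")
ahead <- 7L                            # in time_type units: 7 days here
target_date <- forecast_date + ahead   # 2024-03-25
```

Passing, say, the string "a" for a "day" time_type would error here rather than producing a nonsense target_date.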

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

@dajmcdon dajmcdon linked an issue Mar 18, 2024 that may be closed by this pull request
Comment on lines +166 to +167
# object$df <- object$df %>%
# dplyr::mutate(dplyr::across(tidyselect::where(is.character), tolower))
Contributor Author

I think this is wrong, but I'm not yet certain that it's safe to remove.

Contributor

Doing some time travel, it looks like this has been here in one form or another forever. What's confusing to me is that it looks like it does nothing. Was the point maybe to make all the character columns lowercase (such as geo_value) so they match correctly, but there was a parenthesis problem? Was dplyr::mutate(dplyr::across(tidyselect::where(is.character)), tolower) meant?
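For reference, the variant with the function inside across() does rewrite the character columns, while the misplaced-parenthesis variant selects columns but has no function to apply. A small sketch on made-up data:

```r
library(dplyr)

df <- tibble::tibble(geo_value = c("CA", "NY"), value = 1:2)

# Function inside across(): lowercases every character column.
out <- df %>% mutate(across(tidyselect::where(is.character), tolower))
out$geo_value  # "ca" "ny"

# Misplaced parenthesis: across(tidyselect::where(is.character)) selects
# the character columns but applies no function, leaving them unchanged,
# and tolower becomes a stray extra, unnamed argument to mutate().
```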

Contributor Author

Yeah, I'm not sure why it was there in the first place. I suspect that the included state population dataset used capitals while typical usage from the API gives geos in lower case. But that shouldn't have resulted in hardcoded workarounds here (that are prone to failure).

So I think this should go forever, but I wanted to be sure that if some downstream use errored out, I could find this and try to track it more carefully.

@dajmcdon dajmcdon requested a review from dsweber2 March 18, 2024 19:00
@dajmcdon dajmcdon marked this pull request as ready for review March 18, 2024 19:00
Contributor

@dsweber2 dsweber2 left a comment

Generally looks good to me. Might want to slightly restructure the validate_date error to reduce redundancy and improve readability, but that's not really a huge change.

I tried to clarify some of the tests a little bit, since I was struggling to follow them, but it's nothing essential.
If I were to suggest more tests, it would be a for loop over pairs of time types for layer_add_forecast_date() and latest, with the expectation that all but the matching ones would fail. Probably overkill.

I think it makes sense to merge this before #296, as that also involves date types and working with the ahead.


@@ -11,8 +11,9 @@ latest <- jhu %>%

test_that("layer validation works", {
f <- frosting()
expect_error(layer_add_forecast_date(f, "a"))
Contributor

Is this OK'd so we can add months, e.g. "May"?

Contributor Author

It's not that this is OK; rather, the layer_add_*() function can no longer validate the date format immediately. That has to happen later.

@dshemetov dshemetov self-requested a review March 21, 2024 23:55
Contributor

@dshemetov dshemetov left a comment

Read most of the code, but didn't have enough context to fully follow it. Dropped a few thoughts for now, hopefully they're helpful and not distracting.

}
forecast_date <- coerce_time_type(possible_fd, expected_time_type)
ahead <- extract_argument(the_recipe, "step_epi_ahead", "ahead")
target_date <- forecast_date + ahead
Contributor

Does having this go from max_time_value + ahead to forecast_date + ahead cause this function to behave differently? If yes and testing this PR becomes tough, then we could reduce scope here and punt these changes to another PR.

Contributor

Separate question: how do the units in ahead change depending on the time type? Is this primarily up to the user to make sure they specify aheads in the right units?

Contributor Author

On max_time_value + ahead: this was the default previously, and it remains the fallback (see the lines below). However, if there is a forecast_date, then I think the default for target_date should be to use the specified forecast_date first.

This PR is meant to allow for ahead to always be an integer. Then it should be in units relative to the time_type of the epi_df. It took a lot of effort to make that happen though.
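A concrete sketch of "ahead is an integer in the units of the time_type" (the dates here are made up):

```r
# Daily time_type: ahead counts days.
daily_target <- as.Date("2024-03-18") + 7L        # 2024-03-25

# Weekly time_type: ahead counts weeks, i.e. 7 days per unit on the Date scale.
weekly_target <- as.Date("2024-03-18") + 7L * 2L  # 2 weeks ahead: 2024-04-01

# Yearly time_type: time values are plain integers, so ahead adds directly.
yearly_target <- 2024L + 1L                       # 2025
```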

@dajmcdon dajmcdon merged commit c94d5f9 into dev Mar 27, 2024
3 checks passed
@dajmcdon dajmcdon deleted the 291-date-period branch March 27, 2024 20:57
Successfully merging this pull request may close these issues.

Date processing mismatch
3 participants