Check for amount of training data #106

Closed
dajmcdon opened this issue Jul 20, 2022 · 2 comments
Labels: P1 high priority

Comments

@dajmcdon
Contributor

dajmcdon commented Jul 20, 2022

This is related to #36 and #53.

We should add a check, named check_training_nobs() or something similar.

Desired behaviour:

  • Examine the recipe. At bake time, check whether we have at least nobs observations without NAs.
  • If we don't have enough data, adjust predict() so that it returns the right number of rows for the target and the right columns, but with every prediction set to NA. For example, if your frosting creates a .pred and a .pred_distn column, both columns would exist but contain only NAs.
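The two steps above could be sketched roughly as follows. This is a language-agnostic illustration in Python, not epipredict code: every name here (count_complete_rows, predict_or_na, fit_and_predict, and so on) is hypothetical, rows are modelled as plain dicts, and None stands in for R's NA.

```python
NA = None  # stand-in for R's NA in this sketch


def count_complete_rows(rows):
    """Count rows in which no value is missing."""
    return sum(1 for row in rows if all(v is not NA for v in row.values()))


def has_enough_training_data(rows, nobs):
    """True if at least `nobs` rows are free of missing values."""
    return count_complete_rows(rows) >= nobs


def na_predictions(n_rows, pred_columns):
    """Build `n_rows` prediction rows whose columns exist but hold only NA."""
    return [{col: NA for col in pred_columns} for _ in range(n_rows)]


def predict_or_na(train_rows, target_keys, nobs, pred_columns, fit_and_predict):
    """Return one output row per target key.

    `fit_and_predict` is a hypothetical callable standing in for the real
    model; it is only called when the training data passes the check.
    """
    if has_enough_training_data(train_rows, nobs):
        preds = fit_and_predict(train_rows, target_keys)
    else:
        preds = na_predictions(len(target_keys), pred_columns)
    # Attach the key columns so the output shape matches the target.
    return [{**key, **pred} for key, pred in zip(target_keys, preds)]
```

The point of the design is that the caller always gets back the same shape: one row per target with every prediction column present, whether or not a model could be fit.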

A complication here is that, if everything above works, calling layer_naomit() would result in an empty epi_df. This should NOT happen.

See also #107 .

@dsweber2
Contributor

dsweber2 commented Aug 5, 2023

Just want to note that figuring out that this was the problem I was having took about a day. The specific case I hit was that no training observation had both a value ahead days ahead and a value max(lags) days behind (caused by epix_slide rounding the number of days it gave us down rather than up, but that's a topic for somewhere else). This check would definitely be useful!
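To illustrate how a window that is one day too short can leave no usable training rows, here is a small sketch. It assumes consecutive daily observations with no gaps, and the function name and the numbers (lags of 0, 7 and 14 days, an ahead of 7) are made up for illustration; they are not taken from the case above.

```python
def count_usable_rows(n_days, lags, ahead):
    """Count days t in a window of n_days consecutive days for which both
    the largest lag (t - max(lags)) and the target (t + ahead) fall inside
    the window."""
    max_lag = max(lags)
    usable = [
        t for t in range(n_days)
        if t - max_lag >= 0 and t + ahead < n_days
    ]
    return len(usable)


# Illustrative numbers only: a 21-day window leaves nothing to train on,
# while one extra day yields a single complete row.
short_window = count_usable_rows(21, lags=(0, 7, 14), ahead=7)
longer_window = count_usable_rows(22, lags=(0, 7, 14), ahead=7)
```

Under these assumptions the count is max(0, n_days - max(lags) - ahead), so dropping a single day at the boundary is enough to take the training set from one row to none.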

@dshemetov
Contributor

Closed by #283. Can reopen if we find unaddressed edge cases.
