Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change fill_NA behaviour in prepare_data() #100

Merged
merged 6 commits into from
Nov 8, 2023
Merged

Conversation

pratikunterwegs
Copy link
Collaborator

This PR makes some small changes to prepare_data() to fix #82, and to allow mocked testing of an error message when {incidence2} is not installed.

R/prepare_data.R Outdated Show resolved Hide resolved
R/prepare_data.R Outdated Show resolved Hide resolved
Copy link
Member

@joshwlambert joshwlambert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I especially like the use of mock testing.

Changes look good to me. Would also be good to get a quick look from @jamesmbaazam as he is quoted in the issue being resolved.

Another point, not about the code, it would be good to have an informative PR title, even though it relates back to the issue which contains the details, having it also be self-describing is useful for time saving.

@pratikunterwegs pratikunterwegs changed the title Fix #82 Change fill_NA behaviour in prepare_data() Nov 8, 2023
@joshwlambert
Copy link
Member

Taking a closer look at the code around the diffs in this PR and building on what I interpret from issue #82, I'm wondering what role the fill_NA argument has in the function, and wonder whether it should be removed. The prepare_date() function is titled: "Prepare common epidemiological data formats for CFR estimation", but if fill_NA = FALSE then the data is not prepared for a CFR estimation function (as per L88 of the vignette). The default behaviour could be to check if NAs are present, and if they are found replace with 0 and only with replacement give a message or warning to the user. The only justification for using prepare_data() with fill_NA = FALSE is if it were part of tidyversesque pipeline that didn't terminate in a CFR calculation, but that seems unlikely.

@jamesmbaazam
Copy link
Member

jamesmbaazam commented Nov 8, 2023

I think prepare_data() here has a similar goal as EpiNow2::create_clean_reported_cases() in that it replaces implicit missing dates with 0 by default and only gives the user some flexibility with the zero_frequency argument. It's a similar thing I raise in #82 that fill_NA should be removed.

@pratikunterwegs
Copy link
Collaborator Author

Thanks for taking a look. The idea with this function is primarily to help with the aggregation and pivoting of incidence2 objects. People fall into two camps with zero filling, and I'm trying to cater to both. Users who are certain that their data are robust to zero filling can stick with the default option. Users who would prefer to be conservative and retain NAs can do that, although they would require intermediate steps (removing NAs and splitting into contiguous date blocks) before passing the data to cfr_*().

@pratikunterwegs
Copy link
Collaborator Author

Thanks both for taking a look, this will be merged once checks pass.

@pratikunterwegs pratikunterwegs merged commit d21e3de into main Nov 8, 2023
9 checks passed
@pratikunterwegs pratikunterwegs deleted the dev/fixes branch November 8, 2023 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fill NAs with 0s by default in prepare_data.incidence2
3 participants