Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements drop_na() #194

Merged
merged 9 commits into from
Feb 21, 2021
Merged

Implements drop_na() #194

merged 9 commits into from
Feb 21, 2021

Conversation

markfairbanks
Copy link
Collaborator

No description provided.

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

NEWS.md Outdated
with `pivot_wider()` (@markfairbanks, #189).
with:

* `drop_na()` (@markfairbanks, #194)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dtplyr just made it to CRAN so once you merge/rebase, you'll need to move this to a new bullet in the dev version.

R/step-call.R Outdated
drop_na(data, ...)
}

globalVariables("drop_na")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can instead add to dt_funs in tidyeval.R

Copy link
Collaborator Author

@markfairbanks markfairbanks Feb 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't dt_funs just for data.table functions? From what I could tell na.omit() doesn't need to be added to dt_funs since data.table is just exporting a method for stats::na.omit().

The globalVariables("drop_na") was to pass R CMD check - it throws this error otherwise:

drop_na.data.table: no visible global function definition fordrop_naUndefined global functions or variables:
    drop_na

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I totally misinterpreted the problem. You should use tidyr::drop_na() in the drop_na method.

Comment on lines 168 to 173
df <- lazy_dt(tibble(x = c(1, 2, NA), y = c("a", NA, "b")), "DT")
exp <- tibble(x = c(1), y = c("a"))
step <- drop_na(df)
res <- as_tibble(step)
expect_equal(show_query(step), expr(na.omit(DT)))
expect_equal(res, exp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a little easier to follow:

Suggested change
df <- lazy_dt(tibble(x = c(1, 2, NA), y = c("a", NA, "b")), "DT")
exp <- tibble(x = c(1), y = c("a"))
step <- drop_na(df)
res <- as_tibble(step)
expect_equal(show_query(step), expr(na.omit(DT)))
expect_equal(res, exp)
tb <- tibble(x = c(1, 2, NA), y = c("a", NA, "b")
dt <- drop_na(lazy_dt(tb, "DT"))
expect_equal(show_query(dt), expr(na.omit(DT)))
expect_equal(as_tibble(step), tb[1, ])

expect_equal(res, exp)
})

test_that("groups are preserved", {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to test this? It's step_call() that takes care of this, not your code.

expect_equal(dplyr::group_vars(res), dplyr::group_vars(gexp))
})

test_that("empty call drops every row", {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate of first test?

Comment on lines 213 to 214
expect_error(collect(drop_na(df, !! list())))
expect_error(collect(drop_na(df, "z")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be expect_snapshot(error = TRUE, ...) so we can see the precise errors

Comment on lines 217 to 229
test_that("single variable data.frame doesn't lose dimension", {
df <- lazy_dt(data.frame(x = c(1, 2, NA)))
res <- as.data.frame(drop_na(df, "x"))
exp <- data.frame(x = c(1, 2))
expect_equal(res, exp)
})

test_that("preserves attributes", {
df <- lazy_dt(tibble(x = structure(c(1, NA), attr = "!")))
rs <- collect(drop_na(df))

expect_equal(rs$x, structure(1, attr = "!"))
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are data.frame specific so can be dropped.

Merge remote-tracking branch 'upstream/master' into drop_na
# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
test_that("errors are raised", {
tb <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
dt <- lazy_dt(tb, "DT")
expect_snapshot(collect(drop_na(dt, !! list())), error = TRUE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this is supposed to test, and it looks like the !! is being evaluated too early. So I suggest just dropping it.

@markfairbanks
Copy link
Collaborator Author

All set for review

@hadley hadley merged commit 60eb063 into tidyverse:master Feb 21, 2021
@hadley
Copy link
Member

hadley commented Feb 21, 2021

Thanks!

@markfairbanks markfairbanks deleted the drop_na branch February 22, 2021 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants