-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bind_rows()
for database tables
#6905
Comments
I made this change in the source and it passes all bind_rows <- function(..., .id = NULL) {
dots <- list2(...)
is_flattenable <- function(x) !is_named(x)
if (length(dots) == 1 && is_bare_list(dots[[1]])) {
dots <- dots[[1]]
}
dots <- list_flatten(dots, fn = is_flattenable)
dots <- discard(dots, is.null)
if (length(dots) == 0) {
return(tibble())
}
UseMethod("bind_rows", dots[[1]])
}
#' @export
bind_rows.tbl_SQLiteConnection <- function(...) {
message("hello there")
}
#' @export
bind_rows.LazyFrame <- function(...) {
polars::pl$concat(..., how = "vertical")
}
#' @export
bind_rows.default <- function(..., .id = NULL) {
# Current code of `bind_rows()`
} Then, internally, each method could determine how to handle the Here, I check that library(dplyr, warn.conflicts = FALSE)
library(purrr)
library(polars)
### classic dplyr
bind_rows(tibble(mtcars), tibble(mtcars)) |>
head()
#> # A tibble: 6 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
### databases
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
tables <-
1:5 |>
map(function(n) {
tbl_name <- paste0("mtcars_", n)
copy_to(con, slice_sample(mtcars, n = 3), tbl_name)
tbl(con, tbl_name)
})
bind_rows(!!!tables)
#> hello there
### Polars LazyFrames
bind_rows(pl$LazyFrame(mtcars), pl$LazyFrame(mtcars))
#> [1] "polars LazyFrame naive plan: (run ldf$describe_optimized_plan() to see the optimized plan)"
#> UNION
#> PLAN 0:
#> DF ["mpg", "cyl", "disp", "hp"]; PROJECT */11 COLUMNS; SELECTION: "None"
#> PLAN 1:
#> DF ["mpg", "cyl", "disp", "hp"]; PROJECT */11 COLUMNS; SELECTION: "None"
#> END UNION Would this implementation be enough or am I missing something? |
@DavisVaughan it would be great to know whether my suggested implementation makes sense. If so, I can prepare a PR for this |
Any update on this one ? |
It would be nice if
bind_rows()
would work for database tables.Created on 2023-08-07 with reprex v2.0.2
I know there were issues with the double dispatch and what happens if local tables and database tables are mixed.
But I think it would be sufficient if either all inputs are local tables or database tables.
This was also requested again in dbplyr: tidyverse/dbplyr#1342.
The text was updated successfully, but these errors were encountered: