[R] dplyr::compute should convert from grouped arrow_dplyr_query to arrow Table #32973

@asfimport

Description

dplyr::compute() is expected to evaluate an arrow_dplyr_query and return the result as a Table. For a grouped arrow_dplyr_query, however, it returns an arrow_dplyr_query instead of a Table. Ungrouping first gives the expected result.

mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |> class()
#> [1] "arrow_dplyr_query"
mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::ungroup() |> dplyr::compute() |> class()
#> [1] "Table"        "ArrowTabular" "ArrowObject"  "R6"

arrow::as_arrow_table() works as expected, while dplyr::collect(FALSE) shows the same problem as dplyr::compute().

mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> class()
#> [1] "arrow_dplyr_query"
mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |> class()
#> [1] "arrow_dplyr_query"
mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::collect(FALSE) |> class()
#> [1] "arrow_dplyr_query"
mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> arrow::as_arrow_table() |> class()
#> [1] "Table"        "ArrowTabular" "ArrowObject"  "R6"
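Until this is fixed, one possible workaround is to drop the grouping before calling compute(), which the first example above shows to return a Table. The following is only a minimal sketch: compute_as_table() is a hypothetical helper name, not part of arrow or dplyr, and the grouping is lost in the result, so it has to be reapplied afterwards if it is still needed.

```r
library(arrow)
library(dplyr)

# Hypothetical helper (not part of arrow or dplyr): materialise a query as a
# Table by removing the grouping first, then calling compute().
# Note: the grouping information is discarded.
compute_as_table <- function(query) {
  query |>
    dplyr::ungroup() |>
    dplyr::compute()
}

grouped <- mtcars |>
  arrow::arrow_table() |>
  dplyr::group_by(cyl)

tbl <- compute_as_table(grouped)
class(tbl)
```

arrow::as_arrow_table() could be used in the same place, since it also returns a Table for a grouped query in the examples above.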

The result seems to be converted back to an arrow_dplyr_query by the following lines:

df <- as_adq(df)
df$group_by_vars <- query$group_by_vars
df$drop_empty_groups <- query$drop_empty_groups

 

Reporter: SHIMA Tatsuya / @eitsupi
Assignee: SHIMA Tatsuya / @eitsupi

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-17738. Please see the migration documentation for further details.
