Description
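As in the title: a dataset partitioned by a column whose name starts with a dot (e.g. ".cyl") is written correctly, but open_dataset() detects 0 files when reading it back.
This is probably because files and directories whose names start with a dot are treated as hidden and skipped during dataset discovery.
There is no issue if the dot appears elsewhere in the name.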
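To illustrate the suspected cause, here is a minimal base R sketch of a prefix-based ignore rule. It is not Arrow's actual implementation: the helper is_ignored() is hypothetical, and the rule it encodes (skip a path if any segment starts with "." or "_") is an assumption about how discovery behaves by default.

```r
# Hypothetical helper, NOT part of arrow: mimics a prefix-based ignore rule
# that a dataset discovery step might apply to relative paths.
# Assumption: a path is skipped if any of its segments starts with "." or "_".
is_ignored <- function(path, prefixes = c(".", "_")) {
  segments <- strsplit(path, "/", fixed = TRUE)[[1]]
  any(vapply(
    segments,
    function(seg) any(startsWith(seg, prefixes)),
    logical(1)
  ))
}

paths <- c(
  "cyl=4/part-0.parquet",   # regular partition directory
  ".cyl=4/part-0.parquet",  # partition directory starting with a dot
  "c.yl=4/part-0.parquet"   # dot elsewhere in the name
)
ignored <- vapply(paths, is_ignored, logical(1), USE.NAMES = FALSE)
```

Under this assumed rule only the second path is skipped, which would match the behaviour seen in the reprex below.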
Reprex:
library(dplyr)
library(arrow)
packageVersion("arrow")
#> [1] '8.0.0'
path_arrow_tmp <- tempfile()
mtcars %>%
  dplyr::group_by(cyl) %>%
  arrow::write_dataset(
    path = path_arrow_tmp
  )
base::list.files(path_arrow_tmp, recursive = TRUE, all.files = TRUE)
#> [1] "cyl=4/part-0.parquet" "cyl=6/part-0.parquet" "cyl=8/part-0.parquet"
mtcars_load <- path_arrow_tmp %>%
  arrow::open_dataset() %>%
  dplyr::collect()
setequal(mtcars$mpg, mtcars_load$mpg)
#> [1] TRUE
# Group by ".cyl" instead
path_arrow_tmp_grp <- tempfile()
mtcars %>%
  dplyr::mutate(.cyl = cyl) %>%
  dplyr::group_by(.cyl) %>%
  arrow::write_dataset(
    path = path_arrow_tmp_grp
  )
# the files are there
base::list.files(path_arrow_tmp_grp, recursive = TRUE, all.files = TRUE)
#> [1] ".cyl=4/part-0.parquet" ".cyl=6/part-0.parquet" ".cyl=8/part-0.parquet"
# 0 files detected
path_arrow_tmp_grp %>%
  arrow::open_dataset()
#> FileSystemDataset with 0 Parquet files
# Specify partitioning manually
# still no files
path_arrow_tmp_grp %>%
  arrow::open_dataset(
    partitioning = ".cyl",
    hive_style = TRUE
  )
#> FileSystemDataset with 0 Parquet files
#> .cyl: int32
Environment:
#> - Session info ---------------------------------------------------------------
#> setting  value
#> version  R version 4.1.1 (2021-08-10)
#> os       Windows 10 x64 (build 19044)
#> system   x86_64, mingw32
#> ui       RTerm
#> language (EN)
#> collate  English_Switzerland.1252
#> ctype    C
#> tz       Europe/Berlin
#> date     2022-06-02
#>
#> - Packages -------------------------------------------------------------------
#> package     * version date (UTC) lib source
#> backports     1.4.1   2021-12-13 [1] CRAN (R 4.1.2)
#> cli           3.2.0   2022-02-14 [1] CRAN (R 4.1.3)
#> crayon        1.5.0   2022-02-14 [1] CRAN (R 4.1.1)
#> digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.2)
#> ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
#> evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.1)
#> fansi         1.0.2   2022-01-14 [1] CRAN (R 4.1.2)
#> fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.1)
#> fs            1.5.2   2021-12-08 [1] CRAN (R 4.1.2)
#> glue          1.6.1   2022-01-22 [1] CRAN (R 4.1.2)
#> highr         0.9     2021-04-16 [1] CRAN (R 4.1.1)
#> htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.1)
#> knitr         1.37    2021-12-16 [1] CRAN (R 4.1.2)
#> lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.1.1)
#> magrittr      2.0.2   2022-01-26 [1] CRAN (R 4.1.2)
#> pillar        1.7.0   2022-02-01 [1] CRAN (R 4.1.2)
#> pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.1)
#> purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#> R.cache       0.15.0  2021-04-30 [1] CRAN (R 4.1.1)
#> R.methodsS3   1.8.1   2020-08-26 [1] CRAN (R 4.1.1)
#> R.oo          1.24.0  2020-08-26 [1] CRAN (R 4.1.1)
#> R.utils       2.11.0  2021-09-26 [1] CRAN (R 4.1.1)
#> reprex        2.0.1   2021-08-05 [1] CRAN (R 4.1.1)
#> rlang         1.0.2   2022-03-04 [1] CRAN (R 4.1.3)
#> rmarkdown     2.11    2021-09-14 [1] CRAN (R 4.1.0)
#> rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
#> sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
#> stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.2)
#> stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.1)
#> styler        1.6.2   2021-09-23 [1] CRAN (R 4.1.1)
#> tibble        3.1.6   2021-11-07 [1] CRAN (R 4.1.2)
#> utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.2)
#> vctrs         0.4.1   2022-04-13 [1] CRAN (R 4.1.1)
#> withr         2.5.0   2022-03-03 [1] CRAN (R 4.1.3)
#> xfun          0.29    2021-12-14 [1] CRAN (R 4.1.2)
#> yaml          2.2.2   2022-01-25 [1] CRAN (R 4.1.2)
Reporter: Lorenzo Gaborini
Related issues:
- [R] Expose FileSystemFactoryOptions (is fixed by)
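Once the factory options are exposed in R, a workaround along these lines might apply. This is only a sketch. It assumes an arrow release in which open_dataset() accepts a factory_options list with a selector_ignore_prefixes entry, which is not available in the 8.0.0 version used in the reprex above. The choice of "_" as the only ignored prefix is illustrative.

```r
library(arrow)
library(dplyr)

# Write a dataset partitioned by a column whose name starts with a dot,
# as in the reprex above.
path_arrow_tmp_grp <- tempfile()
mtcars %>%
  mutate(.cyl = cyl) %>%
  group_by(.cyl) %>%
  write_dataset(path = path_arrow_tmp_grp)

# Assumption: factory_options / selector_ignore_prefixes is supported by the
# installed arrow version. Ignoring only "_"-prefixed entries should allow
# the ".cyl=..." directories to be discovered.
ds <- open_dataset(
  path_arrow_tmp_grp,
  factory_options = list(selector_ignore_prefixes = "_")
)
mtcars_load <- collect(ds)
```

Note that dropping "." from the ignored prefixes also makes any other dot-prefixed files in the dataset directory visible to discovery, so this only makes sense when the directory contains nothing but data files.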
Note: This issue was originally created as ARROW-16720. Please see the migration documentation for further details.