-
Notifications
You must be signed in to change notification settings - Fork 6
/
README.Rmd
91 lines (65 loc) · 2.91 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# geoarrow
<!-- badges: start -->
[![Codecov test coverage](https://codecov.io/gh/geoarrow/geoarrow-r/branch/main/graph/badge.svg)](https://app.codecov.io/gh/geoarrow/geoarrow-r?branch=main)
<!-- badges: end -->
The goal of geoarrow is to leverage the features of the [arrow](https://arrow.apache.org/docs/r/) package and larger [Apache Arrow](https://arrow.apache.org/) ecosystem for geospatial data. The geoarrow package provides an R implementation of the [GeoParquet](https://github.com/opengeospatial/geoparquet) file format of and the draft [geoarrow data specification](https://geoarrow.org), defining extension array types for vector geospatial data.
## Installation
You can install the released version of geoarrow from [CRAN](https://cran.r-project.org/) with:
``` r
install.packages("geoarrow")
```
You can install the development version of geoarrow from [GitHub](https://github.com/) with:
``` r
# install.packages("pak")
pak::pak("geoarrow/geoarrow-r")
```
## Example
The geoarrow package implements conversions to/from various geospatial types (e.g., sf, sfc, s2, wk) with various Arrow representations (e.g., arrow, nanoarrow). The most useful conversions are between the **arrow** and **sf** packages, which in most cases allow sf objects to be passed to **arrow** functions directly after `library(geoarrow)` or `requireNamespace("geoarrow")` has been called.
```{r example}
library(geoarrow)
library(arrow, warn.conflicts = FALSE)
library(sf)
nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
tf <- tempfile(fileext = ".parquet")
nc |>
tibble::as_tibble() |>
write_parquet(tf)
open_dataset(tf) |>
dplyr::filter(startsWith(NAME, "A")) |>
dplyr::select(NAME, geom) |>
st_as_sf()
```
By default, arrow objects are converted to a neutral wrapper around chunked Arrow memory, which in turn implements conversions to most spatial types:
```{r}
df <- read_parquet(tf)
df$geom
st_as_sfc(df$geom)
```
The entry point to creating arrays is `as_geoarrow_vctr()`:
```{r}
as_geoarrow_vctr(c("POINT (0 1)", "POINT (2 3)"))
```
By default these do not attempt to create a new storage type; however, you can request a storage type or infer one from the data:
```{r}
as_geoarrow_vctr(c("POINT (0 1)", "POINT (2 3)"), schema = geoarrow_native("POINT"))
vctr <- as_geoarrow_vctr(c("POINT (0 1)", "POINT (2 3)"))
as_geoarrow_vctr(vctr, schema = infer_geoarrow_schema(vctr))
```
There are a number of files to use as examples at <https://geoarrow.org/data> that can be read with `arrow::read_ipc_file()`:
```{r}
url <- "https://github.com/geoarrow/geoarrow-data/releases/download/v0.1.0/ns-water-basin_point.arrow"
tab <- read_ipc_file(url, as_data_frame = FALSE)
tab$geometry$type
```