forked from geocompx/geocompr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
10-gis.Rmd
492 lines (403 loc) · 32.8 KB
/
10-gis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
# Bridges to GIS software {#gis}
An important feature of R is that users must use the command-line interface (CLI) to interact with the computer:
you type commands and hit `Enter` to execute them interactively.
The most popular open-source GIS software packages, by contrast, feature a prominent graphical user interface (GUI):
you *can* interact with QGIS, SAGA and gvSIG by typing into (dockable) command-lines, but most users interact with such programs by 'pointing and clicking their way through life' as Gary Sherman puts it below [@sherman_desktop_2008]:^[
<!-- yes, we should shorten the footnote or put it somewhere into the text. I just rewrote it to make clearer what was meant. At least this was what I gathered. -->
The mature GRASS GIS software package and PostGIS are quite popular in academia and industry, and could be seen as products which buck this trend as they are built around the command-line.
In [2008](http://gama.fsv.cvut.cz/~landa/publications/2008/gis-ostrava-08/paper/landa-grass-gui-wxpython.pdf) GRASS developers added a sophisticated GUI, shifting the emphasis slightly away from its CLI.
However, GRASS lacks a sophisticated and feature-rich IDE such as RStudio that supports 'CLI newbies'.
On the other hand, PostGIS is the spatial extension of the popular PostgreSQL open source database, and therefore not really a dedicated GIS.
This is also highlighted by the fact that PostGIS lacks any geovisualization capabilities and its description of 'legacy GIS' on its [website](http://workshops.boundlessgeo.com/postgis-intro/introduction.html).
Similar to GRASS, PostgreSQGL provides a (partial) GUI called [pgAdmin](https://www.pgadmin.org/) to facilitate the editing and administration of the database.
To make it clear, though PostGIS provides spatial functions, its main purpose is the handling of spatial objects in a relational database management system.
Therefore, frequently users store their spatial data in PostGIS, and interact with it through a dedicated GIS software such as QGIS. Of course, you can also use R to access data from PostGIS (**sf**, **rgdal**, **rpostgis**).
In summary, a typical workflow would be: perform a large spatial query with PostGIS, then load the result into a further application (QGIS, R) for further geoprocessing.
]
> With the advent of 'modern' GIS software, most people want to point and
click their way through life. That’s good, but there is a tremendous amount
of flexibility and power waiting for you with the command line. Many times
you can do something on the command line in a fraction of the time you
can do it with a GUI.
Gary Sherman is well-qualified to comment on this matter as the creator of QGIS, the world's premier open source GUI-based GIS!
The 'CLI vs GUI' debate is often adversarial and polarized but it does not have to be: both options are great if well chosen in accordance with your needs and tasks.
The advantages of a good CLI such as that provided by R are numerous. Among others, a CLI:
- Facilitates the automation of repetitive tasks. We strongly dislike the manual execution of iterations, and would recommend to always think about how to solve such tasks programmatically.
This way, you most likely discover new programming features and concepts, i.e., you learn and enhance your skill set.
By contrast, what are you learning from executing the same tasks a 1000 times?
As a nice side-effect, automation is surely more effective and less error-prone.
- Ensures transparency and reproducibility (which also is the backbone of good scientific practice).
In short, it is easier to share your R script than explain a sequence of 10+ 'points and clicks'.
- Encourages extending existing software by making it easy to modify and extend functions.
If you are missing an algorithm you need, the CLI gives you the freedom to write your own!
<!-- add link to most sexiest job in the 21st century -->
- Is the way to (geo-)data science.
Professional and advanced technical skills will certainly enhance your career prospects, and are in dire need across a wide range of disciplines.
- Is fun, but admittedly that is a subjective argument.
<!-- any other points that we have missed? -->
- ...
On the other hand, GUI-based GIS systems (particularly QGIS) are also advantageous.
For instance, think of:
- The GUI. The really user-friendly graphical interface spares the user from programming.
Though you probably wouldn't read the book if this were your main objective.
- Digitizing and all related tools (trace, snap, topological rules, etc.). Note that there is also the new **mapedit** package but its intention is to allow the quick editing of a few spatial features, and not professional, large-scale cartographic digitizing.
- Georeferencing
- Stereoscopic mapping (e.g., LiDAR and structure from motion)
- The built-in geodatabase management system often integrated in Desktop GIS (ArcGIS, GRASS GIS) and all related advantages such as object relational modeling, topology, fast (spatial) querying, etc.
- Map production, in case you only want to create a beautiful map once. If you have to produce it over and over again, then maybe the CLI approach is better.
- Zooming and dragging on WMS (though this is also possible with **mapview** and **leaflet**)
<!-- any other points that we have missed so far? -->
- ...
This general overview already points out the differences between R's CLI and desktop GIS. However, there is more: dedicated GIS software provides hundreds of geoalgorithms that are simply missing in R.
The good news is that 'GIS bridges' enable the access to these with the comfort of the R command line.^[
The term 'bridge' was probably first used in the R-spatial world for the coupling of R with GRASS [@neteler_open_2008].
Roger Bivand elaborates on this in his talk "Bridges between GIS and R", delivered at the 2016 GEOSTAT summer school.
The resulting slides can be found on Roger's personal website at [http://spatial.nhh.no/misc](http://spatial.nhh.no/misc/?C=M;O=D) in the file
`geostat_talk16.zip`.
]
The R language was originally designed as an interface to and extension of other languages, especially C and FORTRAN, to enable access to statistical algorithms in a user-friendly and intuitive read-evaluate-print loop (REPL) [@chambers_extending_2016].
R was not originally intended to be a GIS.
This makes the breadth of R's geospatial capabilities astonishing to many who are unaware of its ability to replicate established GIS software for many operations on spatial data.
There are some domains where R can now outperform desktop GIS including spatial statistical modeling, online interactive visualization and the generation of animated or faceted maps.
Instead of implementing existing GIS algorithms in R, it makes sense to avoid 'reinventing the wheel' by taking advantage of R's ability to interface with other languages (especially C++, which is used for much low-level and high-performance GIS work).
Using compiled code for new geoprocessing functionality (particularly with the help of the excellent **Rcpp** package) could form the basis of new R packages, building on the success of **sf**.
However, there is already a wide range of algorithms that can be accessed via R's interfaces to dedicated GIS software.
It makes sense to understand these before moving to develop your own optimized algorithms.
For this reason this chapter focuses on 'bridges' to the mature GIS products [QGIS](http://qgis.org/) (via the package **RQGIS**), [SAGA](http://www.saga-gis.org/) (**RSAGA**) and [GRASS](https://grass.osgeo.org/) (**rgrass7**) from within R.
Obviously, we here focus on open-source software solutions, however, there is also a bridge to the commercial GIS leader [ArcGIS](https://www.arcgis.com) through the **RPyGeo** package.
And the so-called [R-ArcGIS bridge](https://github.com/R-ArcGIS/r-bridge) allows to use R from within ArcGIS.
As a final note, we would like to point out that aside from interfaces to desktop GIS there are also interfaces to geospatial libraries such as [GDAL](www.gdal.org) (**gdalUtils**, **rgdal**, **sf**) and [GEOS](https://trac.osgeo.org/geos/) (**rgeos**, **sf**).
By the end of the chapter you should have a working knowledge of the functionality such packages open up, and a more nuanced understanding of the 'CLI vs GUI' debate.
As mentioned in chapter \@ref(intro), doing GIS at the command-line makes it more reproducible, in-line with the principles of Geographic Data Science.
## (R)QGIS
QGIS is one of the most popular open-source GIS [Table \@ref(tab:gis-comp); @graser_processing:_2015]. Its main advantage lies in the fact that it provides a unified interface to several other open-source GIS.
```{r gis-comp, echo=FALSE, message=FALSE}
library(tidyverse)
d = tibble("GIS" = c("GRASS", "QGIS", "SAGA"),
"first release" = c("1984", "2002", "2004"),
"no. functions" = c(">500", ">1000", ">600"),
"support" = c("hybrid", "hybrid", "hybrid"))
knitr::kable(x = d, caption = "Comparison between three open-source GIS. Hybrid refers to the support of vector and raster operations.") %>%
kableExtra::add_footnote(label = "Comparing downloads of different providers is rather difficult (see [http://spatialgalaxy.net/2011/12/19/qgis-users-around-the-world/](http://spatialgalaxy.net/2011/12/19/qgis-users-around-the-world/)), and here also useless since every Windows QGIS download automatically also downloads SAGA and GRASS.", notation = "alphabet")
```
First and foremost, this means that you have access to GDAL/OGR, GRASS and SAGA through QGIS but also to other third-party providers such as [TauDEM](http://hydrology.usu.edu/taudem/taudem5/index.html), [Orfeo Toolbox](https://www.orfeo-toolbox.org/) and [Lastools](https://rapidlasso.com/lastools/) (tools for LiDAR data) [@graser_processing:_2015].
To run all these geoalgorithms (frequently more than 1000 depending on your set up) outside of the QGIS GUI, QGIS provides a Python API.
**RQGIS** establishes a tunnel to this Python API through the **reticulate** package.
Basically, functions `set_env` and `open_app` are doing this.
Note that it is optional to run `set_env` and `open_app` since all functions depending on their output will run them automatically if needed.
Before running **RQGIS** you have to make sure to have installed QGIS and all its (third-party) dependencies such as SAGA and GRASS.
To help you with the installation process, please follow the steps as detailed in `vignette("install_guide", package = "RQGIS")` for several platforms (Windows, Linux, MacOS).
```{r qgis_setup, eval=FALSE}
library(RQGIS)
set_env()
open_app()
```
Leaving the `path`-argument of `set_env` unspecified will search the computer for a QGIS installation.
Hence, it is faster to specify explicitly the path to your QGIS installation.
Subsequently, `open_app` sets all paths necessary to run QGIS from within R, and finally creates a so-called QGIS custom application [http://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/intro.html#using-pyqgis-in-custom-applications](http://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/intro.html#using-pyqgis-in-custom-applications).
We are now ready for some QGIS geoprocessing from within R! First of all, we load some data from the **spData**-package, namely the boroughs of London (`lnd`) and cycle hire points in London (`cycle_hire`).
```{r}
library(spData)
```
In chapter \@ref(spatial-class), we already learned how to do a spatial overlay using the **sf**-package.
Of course, any GIS is also able to perform spatial overlays. Here, we would like to know how many cycle points we can find per borough.
First of all, we need to come up with the name of the function in QGIS. `find_algorithms` lets you search all QGIS geoalgorithms with the help of regular expressions.
Here, we assume that the short description of the function contains first the word "point" and secondly somewhere later also the word "poly".
If you have no clue at all what to look for you can leave the `search_term`-argument empty which will return a list of all available QGIS geoalgorithms.
If you also want to have a short description for each geoalgorithm, set the `name_only`-parameter to FALSE.
```{r, eval=FALSE}
find_algorithms("points.*poly", name_only = TRUE)
```
Now that we know the name of the function (`qgis:countpointsinpolygon`), we wonder how we can use it.
`get_usage` returns all function parameters and default values.
`open_help` lets you access the corresponding online help.
```{r, eval=FALSE}
alg = "qgis:countpointsinpolygon"
get_usage(alg)
```
```{r, eval=FALSE}
open_help(alg)
```
Finally, we can let QGIS do the work.
Note that the workhorse function `run_qgis` accepts R named arguments, i.e., you can specify the parameter names as returned by `get_usage` as you would do in any other regular R function.
Note also that `run_qgis` accepts spatial objects residing in R's global environment as input (here: `lnd` and `cycle_hire`).
But of course, you could also specify paths to shapefiles stored on disk.
Setting the `load_output` to `TRUE` automatically loads the QGIS output into R.
Since we only indicated the name of the output ("cycle.shp"), `run_qgis` saves the output to a temporary folder as returned by `tempdir()`, and loads it into R as an **sf**-object.
```{r, eval=FALSE}
bike_points = run_qgis(alg, POLYGONS = lnd, POINTS = cycle_hire, FIELD = "no_bikes",
OUTPUT = "cycle.shp", load_output = TRUE)
summary(bike_points$no_bikes)
sum(bike_points$no_bikes > 0)
```
In case you leave some parameters of a geoalgorithm unspecified, `run_qgis` will automatically use the default values as arguments if available.
To find out about the default values, run `get_args_man`.
```{r, eval=FALSE}
get_args_man(alg)
```
In this case the output tells us, had we left the `FIELDS`-parameter unspecified, our output (attribute) field would have been named "NUMPOINTS" (instead of "no_bikes").
<!--
"grass7:v.vect.stats" would achieve the same but is unavailable in QGIS
-->
Other notes:
- Leaving the output parameter(s) unspecified, saves the resulting QGIS output to a temporary folder created by QGIS.
`run_qgis` prints these paths to the console after successfully running the QGIS engine.
- If the output consists of multiple files and you have set `load_output` to `TRUE`, `run_qgis` will return a list with each element corresponding to one output file.
To learn more about **RQGIS** please refer to the (hopefully) forthcoming paper (cite).
## (R)SAGA
Similar to QGIS, the System for Automated Geoscientific Analyses (SAGA; Table \@ref(tab:gis-comp)) provides the possibility to run SAGA modules from Python (SAGA Python API).
In addition, there is also a command line interface (saga_cmd.exe) to execute SAGA modules (see also [https://sourceforge.net/p/saga-gis/wiki/Executing%20Modules%20with%20SAGA%20CMD/](https://sourceforge.net/p/saga-gis/wiki/Executing%20Modules%20with%20SAGA%20CMD/)).
**RSAGA** uses the latter to run SAGA from within R.
Though SAGA is a hybrid GIS, its main focus has been on raster processing, and here particularly on digital elevation models (soil properties, terrain attributes, climate parameters).
Hence, SAGA is especially good at the fast processing of large (high-resolution) rasters datasets [@conrad_system_2015].
Therefore, we will introduce **RSAGA** with a raster and use case from @muenchow_geomorphic_2012.
Specifically, we would like to compute the SAGA wetness index from a digital elevation model.
First of all, we need to make sure that **RSAGA** will find SAGA on the computer when called.
For this, all **RSAGA** functions using SAGA in the background make use of `rsaga.env()`.
Usually, `rsaga.env()` will detect SAGA automatically by searching several likely directories (see its help for more information). However, we have 'hidden' SAGA in the OSGEO4W-installation, a location `rsaga.env()` does not search automatically.
`linkSAGA` searches your computer for a valid SAGA installation.
If it finds one, it adds the newest version to the PATH environment variable thereby making sure that `rsaga.env` runs successfully.
```{r, warning=FALSE, message=FALSE, eval=FALSE}
library(RSAGA)
library(link2GI)
saga = linkSAGA()
rsaga.env()
```
Secondly, we need to write the digital elevation model to a SAGA-format.
Note that calling `data(landslides)` attaches two object to the global environment - `dem`, a digital elevation model in the form of a `list`, and `landslides`, a `data.frame` containing observations representing the presence or absence of a landslide:
<!-- The following examples only work with SAGA < 2.3. We have informed the package maintainer to update SAGA -->
```{r, eval=FALSE}
data(landslides)
write.sgrd(data = dem, file = file.path(tempdir(), "dem"), header = dem$header)
```
The organization of SAGA is modular.
Libraries contain so-called modules, i.e., geoalgorithms.
To find out which libraries are available, run:
```{r, eval=FALSE}
tail(rsaga.get.libraries(), 10)
```
Instead of presenting all available libraries we have shown only the last ten.
We choose the library `ta_hydrology` (`ta` is the abbreviation for terrain analysis).
Subsequently, we can access the available modules of a specific library (here: `ta_hydrology`) as follows:
```{r, eval=FALSE}
rsaga.get.modules(libs = "ta_hydrology")
```
Similarly to `RQGIS::get_usage()`, `rsaga.get.usage()` prints the function parameters of a specific geoalgorithm, e.g., the `SAGA Wetness Index`, to the console.
```{r, eval=FALSE}
rsaga.get.usage(lib = "ta_hydrology", module = "SAGA Wetness Index")
```
Finally, you can run SAGA from within R using **RSAGA**'s geoprocessing workhorse function `rsaga.geoprocessor()`.
The function expects a parameter-argument list in which you have specified all necessary parameters.
```{r, eval=FALSE}
params = list(DEM = file.path(tempdir(), "dem.sgrd"),
TWI = file.path(tempdir(), "twi.sdat"))
rsaga.geoprocessor(lib = "ta_hydrology", module = "SAGA Wetness Index",
param = params)
```
To facilitate the access to the SAGA interface, RSAGA frequently provides user-friendly wrapper-functions with meaningful default values (see RSAGA documentation for examples, e.g., `?rsaga.wetness.index`).
So the function call for calculating the 'SAGA Wetness Index' becomes as simple as:
```{r, eval=FALSE}
rsaga.wetness.index(in.dem = file.path(tempdir(), "dem"),
out.wetness.index = file.path(tempdir(), "twi"))
```
Of course, we would like to inspect our result visually (Figure \@ref(fig:saga-twi)). To load and plot the SAGA output file, we use the **raster** package.
```{r saga-twi, fig.cap="SAGA wetness index of Mount Mongón, Peru.", echo=FALSE}
knitr::include_graphics("https://user-images.githubusercontent.com/1825120/39205055-8e68a3ce-47f1-11e8-8874-0142d7f591e2.png")
```
```{r, eval=FALSE}
library(raster)
twi = raster::raster(file.path(tempdir(), "twi.sdat"))
plot(twi, col = RColorBrewer::brewer.pal(n = 9, name = "Blues"))
# or using mapview
# proj4string(twi) = paste0("+proj=utm +zone=17 +south +ellps=WGS84 +towgs84=",
# "0,0,0,0,0,0,0 +units=m +no_defs")
# mapview(twi, col.regions = RColorBrewer::brewer.pal(n = 9, "Blues"),
# at = seq(cellStats(twi, "min") - 0.01, cellStats(twi, "max") + 0.01,
# length.out = 9))
```
You can find a much more extended version of the here presented example in the RSAGA vignette `vignette("RSAGA-landslides")`.
This example includes statistical geocomputing, i.e., it uses a GIS to derive terrain attributes as predictors for a non-linear Generalized Linear Model (GAM) to predict spatially landslide susceptibility [@muenchow_geomorphic_2012].
The term statistical geocomputation emphasizes the strength of combining R's data science power with the geoprocessing power of a GIS which is at the very heart of building a bridge from R to GIS.
## GRASS through **rgrass7**
The U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created the core of the hybrid raster-vector Geographical Resources Analysis Support System (GRASS) [Table \@ref(tab:gis-comp); @neteler_open_2008] from 1982 to 1995.
Academia continued this work since 1997.
Similar to SAGA, GRASS focussed on raster processing in the beginning while only later, since GRASS 6.0, adding advanced vector functionality [@bivand_applied_2013].
We will introduce **rgrass7** with one of the most interesting problems in GIScience - the traveling salesman problem.
Suppose a traveling salesman would like to visit 24 customers while covering the shortest distance possible.
Additionally, the salesman would like to set out his journey at home, and come back to it after having finished all customer visits.
There is a single best solution to this problem, however, to find it, is even for modern computers (mostly) impossible [@longley_geographic_2015].
In our case, the number of possible solution correspond to `(25 - 1)! / 2` possible solutions, i.e. the factorial of 24 divided by 2 (since we do not differentiate between forward or backward direction).
Even if one iteration can be done in a nanosecond this still corresponds to `r (factorial(25 - 1) / (2 * 10^9 * 3600 * 24 * 365))` years.
Luckily, there are clever, almost optimal solutions which run in a tiny fraction of this inconceibable amount of time.
GRASS GIS provides one of these solutions (for more details see [v.net.salesman](https://grass.osgeo.org/grass72/manuals/v.net.salesman.html)).
In our use case, we would like to find the shortest path between the first 25 bicycle stations (instead of customers) on London's streets.
```{r}
library(spData)
data(cycle_hire)
points = cycle_hire[1:25, ]
```
<!--
```{r}
# plot(st_geometry(streets))
# plot(st_geometry(cycle_hire), col = "red", pch = 16, add = TRUE)
```
-->
Aside from the cycle hire points data, we will need the OpenStreetMap data of London.
We download it with the help of the **osmdata** package (see also section \@ref(retrieving-data)).
We constrain the download of the street network (in OSM language called highway) to the bounding box of the cycle hire data, and attach the corresponding data as an `sf`-object.
`osmdata_sf` returns a list with several spatial objects (points, lines, polygons, etc.).
Here, we will only keep the line objects.
To learn more about how to use the **osmdata**-package, please refer to [https://ropensci.github.io/osmdata/](https://ropensci.github.io/osmdata/).
```{r, eval=FALSE}
library(sf)
library(osmdata)
b_box = sf::st_bbox(cycle_hire)
streets = opq(b_box) %>%
add_osm_feature(key = "highway") %>%
osmdata_sf() %>%
`[[`(., "osm_lines")
```
Now that we have the data, we can go on and initiate a GRASS session, i.e., we have to create a GRASS geodatabase.
The GRASS geodatabase system is based on SQLite.
Consequently, different users can easily work on the same project, possibly with different read/write permissions.
However, one has to set up this geodatabase (also from within R), and users used to a GIS GUI popping up by one click might find this process a bit intimidating in the beginning.
First of all, the GRASS database requires its own directory, and contains a location (see the [GRASS GIS Quickstart](https://grass.osgeo.org/grass72/manuals/helptext.html) document and [GRASS GIS Database](https://grass.osgeo.org/grass75/manuals/grass_database.html) help pages at [grass.osgeo.org](https://grass.osgeo.org/grass75/manuals/index.html) for further information).
The location in turn simply contains the geodata for one project.
Within one location several mapsets can exist, and typically refer to different users.
PERMANENT is a mandatory mapset, and created automatically.
It stores the projection, the spatial extent and the default resolution for raster data.
In order to share spatial data with all users of a project, the database owner can add spatial data to the PERMANENT mapset.
Please refer to @neteler_open_2008 and the [GRASS GIS quick start](https://grass.osgeo.org/grass72/manuals/helptext.html) for more information on the GRASS geodatabase system.
You have to set up a location and a mapset if you want to use GRASS from within R.
First of all, we need to find out if and where GRASS7 is installed on the computer.
```{r}
library(link2GI)
link = findGRASS()
```
`link` is a `data.frame` which contains in its rows the GRASS 7 installations on your computer.
Here, we will use a GRASS7 installation.
If you have not installed GRASS7 on your computer, we recommend that you do so now.
Assuming that we have found a working installation on your computer, we use the corresponding path in `initGRASS`.
Additionally, we specify where to store the geodatabase (gisDbase), name the location `london`, and use the PERMANENT mapset.
```{r, eval=FALSE}
library(rgrass7)
# find a GRASS7 installation, and use the first one
ind = grep("7", link$version)[1]
# next line of code only necessary if we want to use GRASS as installed by
# OSGeo4W. Among others, this adds some paths to PATH, which are also needed
# for running GRASS.
link2GI::paramGRASSw(link[ind, ])
grass_path = ifelse(!is.null(link$installation_type) &&
link$installation_type[ind] == "osgeo4W",
file.path(link$instDir[ind], "apps/grass", link$version[ind]),
link$instDir)
initGRASS(gisBase = grass_path,
# home parameter necessary under UNIX-based systems
home = tempdir(),
gisDbase = tempdir(), location = "london",
mapset = "PERMANENT", override = TRUE)
```
Subsequently, we define the projection, the extent and the resolution.
```{r, eval=FALSE}
execGRASS("g.proj", flags = c("c", "quiet"),
proj4 = sf::st_crs(streets)$proj4string)
b_box = sf::st_bbox(streets)
execGRASS("g.region", flags = c("quiet"),
n = as.character(b_box["ymax"]), s = as.character(b_box["ymin"]),
e = as.character(b_box["xmax"]), w = as.character(b_box["xmin"]),
res = "1")
```
Once you are familiar how to set up the GRASS environment, it becomes tedious to do so over and over again.
Luckily, `linkGRASS7` of the **link2GI** packages lets you do it with one line of code.
The only thing you need to provide is a spatial object which determines the projection and the extent of the geodatabase.
First, `linkGRASS7` finds all GRASS installations on your computer.
Since we have set `ver_select` to `TRUE`, we can interactively choose one of the found GRASS-installations.
If there is just one installation, the `linkGRASS7` automatically chooses this one.
Secondly, `linkGRASS7` establishes a connection to GRASS7.
```{r, eval=FALSE}
link2GI::linkGRASS7(streets, ver_select = TRUE)
```
Before we can use GRASS geoalgorithms, we need to add data to GRASS's spatial database.
Luckily, the convenience function `writeVECT` does this for us.
(Use `writeRast` in the case of raster data.)
In our case we add the street and cycle hire point data while using only the first attribute column, and name them also `streets` and `points`.
<!-- in time we can also use sf- and raster-objects with rgrass7, Roger has already implemented this in the rgrass7 developer version -->
```{r, eval=FALSE}
writeVECT(as(streets[, 1], "Spatial"), vname = "streets")
writeVECT(SDF = as(points[, 1], "Spatial"), vname = "points")
```
To perform our network analysis, we need a topological clean street network.
GRASS's `v.clean` takes care of the removal of duplicates, small angles and dangles, among others.
Here, we break lines at each intersection to ensure that the subsequent routing algorithm can actually turn right or left at an intersection, and save the output in a GRASS object named `streets_clean`.
Probably a few of our cycling station points do not exactly lie on a street segment.
However, to find the shortest route between them, we need to connect them to the nearest streets segment.
`v.net`'s connect-operator does exactly this.
We save its output in `streets_points_con`.
```{r, eval=FALSE}
execGRASS(cmd = "v.clean", input = "streets", output = "streets_clean",
tool = "break", flags = "overwrite")
execGRASS(cmd = "v.net", input = "streets_clean", output = "streets_points_con",
points = "points", operation = "connect", threshold = 0.001,
flags = c("overwrite", "c"))
```
The resulting clean dataset serves as input for the `v.net.salesman`-algorithm, which finally finds the shortest route between all cycle hire stations.
`center_cats` requires a numeric range as input.
This range represents the points for which a shortest route should be calculated.
Since we would like to calculate the route for all cycle stations, we set it to `1-25`.
To access the GRASS help page of the traveling salesman algorithm, run `execGRASS("g.manual", entry = "v.net.salesman")`.
```{r, eval=FALSE}
execGRASS(cmd = "v.net.salesman", input = "streets_points_con",
output = "shortest_route", center_cats = paste0("1-", nrow(points)),
flags = c("overwrite"))
```
To visualize our result, we import the output layer into R, and visualize it with the help of the **mapview** package (Figure \@ref(fig:grass-mapview)).
```{r grass-mapview, fig.cap="Shortest route between 25 cycle hire station on the OSM street network of London.", echo=FALSE}
knitr::include_graphics("https://user-images.githubusercontent.com/1825120/39206067-544455c8-47f4-11e8-8725-e28299e01f52.png")
```
```{r, eval=FALSE}
library(mapview)
route = readVECT("shortest_route")
mapview(route, map.types = "OpenStreetMap.BlackAndWhite", lwd = 9) +
mapview(points)
```
Further notes:
- Please note that we have used GRASS's geodatabase (based on SQLite) which allows faster processing.
That means we have only exported spatial data in the beginning.
Then we created new objects but only imported the final result back into R.
To find out which datasets are currently available, run `execGRASS("g.list", type = "vector,raster", flags = "p")`.
- Of course, we could have also accessed an already existing GRASS geodatabase from within R.
Prior to importing data into R, you might want to perform some (spatial) subsetting.
Use `v.select` and `v.extract` for vector data.
`db.select` lets you select subsets of the attribute table of a vector layer without returning the correponding geometry.
- You can start R also from within a running GRASS session [for more information please refer to @bivand_applied_2013 and this [wiki](https://grasswiki.osgeo.org/wiki/R_statistics/rgrass7)].
- Refer to the excellent [GRASS online help](https://grass.osgeo.org/grass72/manuals/) or `execGRASS("g.manual", flags = "i")` for more information on each available GRASS geoalgorithm.
- If you would like to use GRASS 6 from within R, use the R package **spgrass6**.
## When to use what?
To recommend a single R-GIS interface is hard since the usage depends on personal preferences, the tasks at hand and your familiarity with different GIS.
The latter means if you have already preferred the GUI of a certain GIS, you are quite likely to use the corresponding interface.
Certainly, **RQGIS** is an appropriate choice for most use cases.
Its main advantages are:
- An unified access to several GIS, and therefore the provision of >1000 geoalgorithms.
Of course, this includes duplicated functionality, e.g., you can perform overlay-operations using QGIS-, SAGA- or GRASS-geoalgorithms.
- The automatic data format conversions. For instance, SAGA uses `.sdat` grid files and GRASS uses its own database format but QGIS will handle the corresponding conversions for you on the fly.
- **RQGIS** can also handle spatial objects residing in R as input for geoalgorithms, and loads QGIS output automatically back into R if desired.
- Its convenience functions to support the access of the online help, R named arguments and automatic default value retrieval.
Please note that **rgrass7** inspired the latter two features.
- Currently (but this might change), **RQGIS** supports newer SAGA (2.3.1) versions than **RSAGA** (2.2.3).
However, there are use cases when you certainly should use one of the other R-GIS bridges.
QGIS only provides access to a subset of GRASS and SAGA functionality.
Therefore, to use the complete set of SAGA and GRASS functions, stick with **RSAGA** and **rgrass7**.
When doing so, make advantage of **RSAGA**'s numerous user-friendly functions.
Note also, that **RSAGA** offers native R functions for geocomputation such as `multi.local.function`, `pick.from.grid` and many more.
Finally, if you need topological correct data and/or geodatabase-management functionality, we recommend the usage of GRASS.
In addition, if you would like to run simulations with the help of a geodatabase [@krug_clearing_2010], use **rgrass7** directly since **RQGIS** always starts a new GRASS session for each call.
## Exercises
1. Create two overlapping polygons (`poly_1` and `poly_2`) with the help of the **sf**-package (see chapter \@ref(spatial-class)).
Calculate the intersection using:
- **RQGIS**, **RSAGA** and **rgrass7**
- **sf**
2. Run `data(dem, package = "RQGIS")` and `data(random_points, package = "RQGIS")`.
Select randomly a point from `random_points` and find all `dem` pixels that can be seen from this point (hint: viewshed).
Visualize your result.
For example, plot a hillshade, and on top of it the digital elevation model, your viewshed output and the point.
Additionally, give `mapview` a try.