---
output: github_document
editor_options:
  chunk_output_type: console
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
devtools::load_all()
library(sf)
library(tidyverse)
options(width = 1000)
```
> :warning: **This package is not owned, run, or endorsed by the ABS.**
Data contained in this package are compressed, re-projected, renamed and stored as `sf` objects to be useful for making maps in R.
If conducting spatial analysis, or any analysis that requires precise area boundaries, please use the original shapefiles provided by the ABS and others.
# absmapsdata
<!-- badges: start -->
[![Lifecycle:
stable](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://www.tidyverse.org/lifecycle/#stable)
[![R build status](https://github.com/wfmackey/absmapsdata/workflows/R-CMD-check/badge.svg)](https://github.com/wfmackey/absmapsdata/actions)
<!-- badges: end -->
`absmapsdata` is a user-generated package to make it easier for R users to access ABS (and other) spatial structure names/codes and produce maps using this data. The package contains compressed (lossy), tidied, and lazily-loadable `sf` objects that hold geometric information about data structures in Australia.
It also contains correspondence files provided by the ABS.
> :white_check_mark: **It is now recommended that you use [`strayr::read_absmap`](https://github.com/runapp-aus/strayr) to access data stored in `absmapsdata`**. To download and read these data files without installing the whole `absmapsdata` package, please use `strayr::read_absmap`, for example:
```{r, eval=TRUE}
# remotes::install_github("runapp-aus/strayr")
strayr::read_absmap("sa42021")
```
## Installation
You probably don't need to install the full `absmapsdata` package (see above).
But if you want to, you can install `absmapsdata` from GitHub.
The package contains a lot of data, so installing with `remotes::install_github` may fail if the download times out. If this happens, set the `timeout` option to a large value (in seconds) and try again, e.g.:
```{r set_timeout, eval=FALSE}
options(timeout = 1000)
remotes::install_github("wfmackey/absmapsdata")
```
The `sf` package is required to handle the `sf` objects:
```{r, eval=FALSE}
library(sf)
```
## Data loaded with this package
Available maps are listed below. More will be added over time.
If you would like to request a map to be added, let me know via an issue on this GitHub repo.
**ASGS Main Structures**
* Statistical Area 1 2011: `sa12011`; 2016: `sa12016`; and 2021: `sa12021`.
* Statistical Area 2 2011: `sa22011`; 2016: `sa22016`; and 2021: `sa22021`.
* Statistical Area 3 2011: `sa32011`; 2016: `sa32016`; and 2021: `sa32021`.
* Statistical Area 4 2011: `sa42011`; 2016: `sa42016`; and 2021: `sa42021`.
* Greater Capital Cities 2011: `gcc2011`; 2016: `gcc2016`; and 2021: `gcc2021`.
* Remoteness Areas 2011: `ra2011`; and 2016: `ra2016`
* State 2011: `state2011`; 2016: `state2016`; and 2021: `state2021`.
**ASGS Indigenous Structures**
* Indigenous Locations 2021: `iloc2021`
* Indigenous Areas 2021: `iare2021`
* Indigenous Regions 2021: `ireg2021`
**Significant Urban Areas and Urban Centres and Localities**
* Significant Urban Areas 2016: `sua2016`; and 2021: `sua2021`
* Urban Centre and Locality 2016: `ucl2016`; and 2021: `ucl2021`
* Section of State Range 2016: `sosr2016`
* Section of State 2016: `sos2016`
**ASGS Non-ABS Structures**
* Commonwealth Electoral Divisions 2018: `ced2018`; and 2021: `ced2021`
* State Electoral Divisions 2018: `sed2018`; 2021: `sed2021`; and 2022: `sed2022`
* Local Government Areas 2016: `lga2016`; 2018: `lga2018`; 2021: `lga2021`; and 2022: `lga2022`
* Regions for the Internet Vacancy Index 2008: `regional_ivi2008`
* Postcodes 2016: `postcode2016`; and 2021: `postcode2021`
* Suburbs (SSC) 2016: `suburb2016`; and (SAL) 2021: `suburb2021`
* Census of Population and Housing Destination Zones 2011: `dz2011`; 2016: `dz2016`; and 2021: `dz2021`.
**Non-ABS Australian Government Structures**
* Employment Regions 2015-2020: `employment_regions2015`
* BITRE Working Zones 2016: `bitre_work_zones2016`
* NSW Local Health District 2023: `nsw_lhd2023`
* Regional Development Australia areas 2015-16: `rda2016`
**Correspondences**
This package also contains a number of 2016 population-weighted ABS correspondences (the most recent) that can be found on the [data.gov.au website](https://data.gov.au/data/dataset/asgs-geographic-correspondences-2016/resource/951e18c7-f187-4c86-a73f-fcabcd19af16).
> :white_check_mark: Use [`strayr::read_correspondence_tbl`](https://github.com/runapp-aus/strayr) to access this correspondence data, rather than loading the whole `absmapsdata` package, e.g.:
```{r, eval=TRUE, message=FALSE}
# remotes::install_github("runapp-aus/strayr")
strayr::read_correspondence_tbl(from_area = "sa2", from_year = 2011,
to_area = "sa2", to_year = 2016)
```
Within `absmapsdata`, you can retrieve correspondences with the `get_correspondence_absmaps` function.
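A population-weighted correspondence records, for each source area, the share of its population that falls into each target area. The sketch below shows one way such a table might be applied to move counts from old boundaries to new ones. It uses made-up toy data, and the column names `from_code`, `to_code` and `ratio` are assumptions for illustration; check the column names of the table you actually retrieve.

```r
library(dplyr)

# Hypothetical correspondence: area "A" is split 60/40 into "X" and "Y";
# area "B" maps entirely to "Y". Column names are illustrative only.
toy_corr <- tibble(
  from_code = c("A", "A", "B"),
  to_code   = c("X", "Y", "Y"),
  ratio     = c(0.6, 0.4, 1)
)

# Hypothetical counts recorded on the old boundaries
toy_counts <- tibble(
  from_code = c("A", "B"),
  people    = c(100, 50)
)

# Apportion each count across its target areas, then total by target area
converted <- toy_counts %>%
  left_join(toy_corr, by = "from_code") %>%
  group_by(to_code) %>%
  summarise(people = sum(people * ratio), .groups = "drop")

converted
```

This kind of apportioning suits counts and totals; rates and medians generally need more care than a weighted sum.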
## Just show me how to make a map with this package
### Using the package’s pre-loaded data
The `absmapsdata` package comes with pre-downloaded and pre-processed
data. To load a particular geospatial object: load the **package**, then
call the object (see list above for object names).
```{r}
library(tidyverse)
library(sf)
library(absmapsdata)
mapdata1 <- sa32021
glimpse(mapdata1)
```
Or
```{r}
mapdata2 <- sa22016
glimpse(mapdata2)
```
The resulting `sf` object contains one observation per area (in the
following examples, one observation per `sa3`). It stores the geometry
information in the `geometry` variable, which is a nested list
describing the area’s polygon. The object can be joined to a standard
`data.frame` or `tibble` and can be used with `dplyr` functions.
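As a minimal sketch of that last point, the example below builds a tiny, made-up `sf` object (two unit squares, not real ABS areas) and runs a few `dplyr` verbs over it. The names `square`, `toy_map` and `toy_values` are hypothetical and exist only for this illustration.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a unit square with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Toy sf object standing in for one of the package's maps
toy_map <- st_sf(
  area_name = c("Area A", "Area B"),
  geometry  = st_sfc(square(0, 0), square(1, 0), crs = 4326)
)

# Toy attribute data to attach
toy_values <- tibble(
  area_name = c("Area A", "Area B"),
  value     = c(10, 20)
)

toy_joined <- toy_map %>%
  left_join(toy_values, by = "area_name") %>% # sf object on the left keeps the sf class
  filter(value > 15) %>%                      # ordinary row filtering
  select(area_name, value)                    # the geometry column stays attached

class(toy_joined)
```

Putting the `sf` object on the left of the join keeps the result an `sf` object; with a plain `tibble` on the left, the result is a `tibble` that still carries the `geometry` column.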
### Creating maps with your `sf` object
We do all this so we can create gorgeous maps. And with the `sf` object
in hand, plotting a map via `ggplot` and `geom_sf` is simple.
```{r}
map <-
  sa32016 %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry)) # use the geometry variable
map
```
The data also include centroids of each area, and we can add these
points to the map with the `cent_lat` and `cent_long` variables using
`geom_point`.
```{r}
map <- sa32016 %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry)) + # use the geometry variable
  geom_point(aes(cent_long, cent_lat)) # use the centroid longitude (x) and latitude (y)
map
```
Cool. But this all looks a bit ugly. We can pretty it up
using `ggplot` tweaks. See the comments on each line for its objective.
Also note that we’re filling the areas by their `areasqkm_2016` size, another
variable included in the `sf` object (we’ll replace this with more
interesting data in the next section).
```{r}
map <- sa32016 %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry, # use the geometry variable
              fill = areasqkm_2016), # fill by area size
          lwd = 0, # remove borders
          show.legend = FALSE) + # remove legend
  geom_point(aes(cent_long,
                 cent_lat), # use the centroid longitude (x) and latitude (y)
             colour = "white") + # make the points white
  theme_void() + # clears other plot elements
  coord_sf()
map
```
## Joining with other datasets
At some point, we’ll want to join our spatial data with
data-of-interest. The variables in our mapping data, which state the numeric
code and name of each area and parent area, will make this *relatively*
easy.
For example: suppose we had a simple dataset of median income by SA3
over time.
```{r}
# Read in some data
income <- read_csv("https://raw.githubusercontent.com/wfmackey/absmapsdata/master/img/data/median_income_sa3.csv")
head(income)
```
This income data contains a variable `sa3_name_2016`, and we can use
`dplyr::left_join()` to combine with our mapping data.
```{r}
combined_data <- left_join(income,
                           sa32016,
                           by = "sa3_name_2016")
```
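A join on area names only matches rows whose names are spelled identically in both tables, so it can be worth checking for rows that failed to match. The sketch below uses `dplyr::anti_join()` on made-up toy tables; `toy_areas` and `toy_data` are hypothetical names, and the misspelling is deliberate.

```r
library(dplyr)

# Hypothetical toy tables: one name is spelled differently in the data
toy_areas <- tibble(area_name = c("Brunswick - Coburg", "Darebin - North"))
toy_data  <- tibble(
  area_name = c("Brunswick-Coburg", "Darebin - North"),
  value     = c(1, 2)
)

# Rows of toy_data with no matching area: these would end up without geometry
unmatched <- anti_join(toy_data, toy_areas, by = "area_name")
unmatched
```

Where your data includes area codes as well as names, joining on the code column is often the more robust choice.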
Now that we have a tidy dataset with 1) the income data we want to plot,
and 2) the geometry of the areas, we can plot income by area:
```{r}
map <- combined_data %>%
  filter(gcc_name_2016 == "Greater Melbourne") %>% # let's just look at Melbourne
  ggplot() +
  geom_sf(aes(geometry = geometry, # use the geometry variable
              fill = median_income), # fill by median income
          lwd = 0) + # remove borders
  theme_void() + # clears other plot elements
  labs(fill = "Median income")
map
```
## Get correspondence files
> :white_check_mark: Use [`strayr::read_correspondence_tbl`](https://github.com/runapp-aus/strayr) to access this correspondence data, rather than loading the whole `absmapsdata` package, e.g.:
```{r, eval=TRUE}
# remotes::install_github("runapp-aus/strayr")
strayr::read_correspondence_tbl(from_area = "sa2", from_year = 2011,
to_area = "sa2", to_year = 2016)
```
You can use the `absmapsdata::get_correspondence_absmaps` function to get population-weighted correspondence tables provided [by the ABS](https://data.gov.au/data/dataset/asgs-geographic-correspondences-2016/resource/951e18c7-f187-4c86-a73f-fcabcd19af16).
Note that while there are lots of correspondence tables, not every combination is available.
For example:
```{r}
get_correspondence_absmaps("cd", 2006,
"sa1", 2016)
```
## Why does this package exist?
The motivation for this package is that maps are cool and fun and are,
sometimes, the best way to communicate data. And making maps in `R` with
`ggplot` is relatively easy *when you have the right `object`*.
Getting the right `object` is not technically difficult, but requires
research into the best-thing-to-do at each of the following steps:
- Find the ASGS ABS spatial-data page and determine the right file to
download.
- Read the shapefile into `R` using one-of-many import tools.
- Convert the object into something usable.
- Clean up any inconsistencies and apply consistent variable
naming/values across areas and years.
- Find an appropriate compression function and level to optimise
output.
For me at least, finding the correct information and developing the
best set of steps was a little bit interesting but mostly tedious and
annoying. The `absmapsdata` package holds this data for you, so you can
spend more time making maps, and less time on Stack Overflow, the ABS
website, and [lovely-people’s wonderful
blogs](https://www.neonscience.org/dc-open-shapefiles-r).
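The steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used to build this package: the North Carolina shapefile bundled with `sf` stands in for an ABS download, `sf::st_simplify()` stands in for whichever compression function you settle on, and the output column names are made up.

```r
library(sf)
library(dplyr)

# Assumption: the shapefile shipped with sf stands in for an ABS shapefile
shp_path <- system.file("shape/nc.shp", package = "sf")

# Read the shapefile
raw <- st_read(shp_path, quiet = TRUE)

# Compress (lossy) in a projected CRS so the tolerance is in metres
projected <- raw %>%
  st_transform(32119) %>% # NAD83 / North Carolina, specific to this sample file
  st_simplify(preserveTopology = TRUE, dTolerance = 1000)

# Centroids, computed on the projected shapes and converted to long/lat
cents <- projected %>%
  st_geometry() %>%
  st_centroid() %>%
  st_transform(4326) %>%
  st_coordinates()

# Back to longitude/latitude, with consistent (hypothetical) variable names
tidy <- projected %>%
  st_transform(4326) %>%
  select(county_name = NAME, area_code = FIPS) %>%
  mutate(cent_long = cents[, "X"], cent_lat = cents[, "Y"])

tidy
```

The tolerance is the main trade-off: a larger `dTolerance` gives smaller objects and faster plots, at the cost of boundary precision.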
## Comments/complaints/requests
The best avenue is via a GitHub issue at
[wfmackey/absmapsdata/issues](https://github.com/wfmackey/absmapsdata/issues).
This is also the best place to request data that isn't yet available in the package.