Skip to content

Commit 3408a91

Browse files
authored
Merge pull request #195 from inbo/density_fv
'estimate densities from point patterns': PROJ>=6 compliance
2 parents aad7b19 + 1e79957 commit 3408a91

File tree

2 files changed

+10618
-0
lines changed

2 files changed

+10618
-0
lines changed
Lines changed: 167 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,167 @@
1+
---
2+
title: Estimating densities from a point pattern
3+
maintainer: Thierry Onkelinx
4+
date: '2017-06-28'
5+
output:
6+
html_document: default
7+
md_document:
8+
preserve_yaml: true
9+
variant: gfm
10+
params:
11+
cellsize: 20
12+
---
13+
14+
In this example we focus on a set of 10450 coordinates in a small area.
15+
The goal is to estimate the local density of points, expressed as the number of points per unit area.
16+
The raw coordinates are given in [WGS84 (EPSG:4326)](https://epsg.io/4326), which is a geodetic coordinate reference system.
17+
That is not suited for calculating distances, so we need to re-project the points into a local projected coordinate reference system.
18+
In this case we use [Belgian Lambert72 (EPSG:3170)](https://epsg.io/31370).
19+
Next we calculate the density.
20+
To visualise the density, we have to transform the results back to WGS84.
21+
22+
The data used in this example is real data but centred to a different location for privacy reasons. The dataset is available on [GitHub](https://github.com/ThierryO/my_blog/tree/master/data/20170628).
23+
24+
First we must read the data into R. Plotting the raw data helps to check errors in the data.
25+
26+
```{r read-data, fig.cap = "Raw data"}
27+
points <- read.delim("point-pattern/points.txt", sep = " ")
28+
library(ggmap)
29+
map <- get_map(
30+
location = c(lon = mean(points$lon), lat = mean(points$lat)),
31+
zoom = 17,
32+
maptype = "satellite",
33+
source = "google"
34+
)
35+
ggmap(map) +
36+
geom_point(data = points, alpha = 0.1, colour = "blue", shape = 4)
37+
```
38+
39+
The next step is to convert the dataset to a `SpatialPoints` object with WGS84 projection and re-project it to Belgian Lambert72.
40+
`sp::CRS()` defines the coordinate reference systems (CRS).
41+
`sp::coordinates()<-` is an easy way to convert a `data.frame` into a `SpatialPointsDataFrame`, but without specifying a CRS.
42+
Therefore we need to override the CRS with the correct one.
43+
44+
`sp` will derive a 'WKT2 string' representation from an EPSG-code [^epsg] that we provide, in order to represent the CRS.
45+
The WKT2 string (well known text) is a recent open standard by the Open Geospatial Consortium to represent a CRS, and it replaces the older (deprecated) 'PROJ.4 string'.
46+
Currently, the function to set the CRS is still called `proj4string()`.
47+
`sp::spTransform()` converts the spatial object from the current CRS to another CRS.
48+
49+
[^epsg]: Most coordinate reference systems have an [EPSG](https://en.wikipedia.org/wiki/International_Association_of_Oil_%26_Gas_Producers#European_Petroleum_Survey_Group) code which you can find at http://epsg.io/.
50+
51+
```{r crs-objects}
52+
library(sp)
53+
crs_wgs84 <- CRS(SRS_string = "EPSG:4326")
54+
crs_lambert <- CRS(SRS_string = "EPSG:31370")
55+
```
56+
57+
The warning above once again demonstrates that some PROJ.4 information is not supported anymore.
58+
59+
```{r reproject}
60+
coordinates(points) <- ~lon + lat
61+
# proj4string() - still the only available function to set the CRS - may in the
62+
# future get a more general name:
63+
proj4string(points) <- crs_wgs84
64+
points_lambert <- spTransform(points, crs_lambert)
65+
```
66+
67+
Once we have the points into a projected coordinate system, we can calculate the densities. We start by defining a grid. `cellsize` is the dimension of the square grid cell in the units of the projected coordinate system. Meters in case of Lambert72. The boundaries of the grid are defined using `pretty()`, which turns a vector of numbers into a "pretty" vector with rounded numbers. The combination of the boundaries and the cell size determine the number of grid cells `n` in each dimension. `diff()` calculates the difference between to adjacent numbers of a vector. The density is calculated with `MASS::kde2d()` based on the vectors with the longitude and latitude, the number of grid cells in each dimension and the boundaries of the grid. This returns the grid as a list with elements `x` (a vector of longitude coordinates of the centroids), `y` (a vector of latitude coordinates of the centroids) and `z` (a matrix with densities). The values in `z` are densities for the 'average' point per unit area. When we multiply the value `z` with the area of the grid cell and sum all of them we get 1. So if we multiple `z` with the number of points we get the density of the points per unit area.
68+
69+
We use [`dplyr::mutate()`](http://dplyr.tidyverse.org/) to convert it into a `data.frame`. The last two steps convert the centroids into a set of coordinates for square polygons.
70+
71+
```{r density}
72+
library(MASS)
73+
library(dplyr)
74+
xlim <- range(pretty(points_lambert$lon)) + c(-100, 100)
75+
ylim <- range(pretty(points_lambert$lat)) + c(-100, 100)
76+
n <- c(
77+
diff(xlim),
78+
diff(ylim)
79+
) / params$cellsize + 1
80+
dens <- kde2d(
81+
x = points_lambert$lon,
82+
y = points_lambert$lat,
83+
n = n,
84+
lims = c(xlim, ylim)
85+
)
86+
dx <- diff(dens$x[1:2])
87+
dy <- diff(dens$y[1:2])
88+
sum(dens$z * dx * dy)
89+
dens <- expand.grid(
90+
lon = dens$x,
91+
lat = dens$y
92+
) %>%
93+
mutate(
94+
density = as.vector(dens$z) * length(points_lambert),
95+
id = seq_along(density)
96+
) %>%
97+
merge(
98+
data.frame(
99+
x = dx * (c(0, 0, 1, 1, 0) - 0.5),
100+
y = dy * (c(0, 1, 1, 0, 0) - 0.5)
101+
)
102+
) %>%
103+
mutate(
104+
lon = lon + x,
105+
lat = lat + y
106+
)
107+
```
108+
109+
In order to visualise the result, we have to re-project the coordinates back to WGS84. Then we can display the raster with a web based background image.
110+
111+
```{r ggmap, fig.cap = "Static image of density"}
112+
coordinates(dens) <- ~lon + lat
113+
proj4string(dens) <- crs_lambert
114+
dens_wgs <- spTransform(dens, crs_wgs84) %>%
115+
as.data.frame()
116+
ggmap(map) +
117+
geom_polygon(data = dens_wgs, aes(group = id, fill = density), alpha = 0.5) +
118+
scale_fill_gradientn(
119+
"density\n(#/m²)",
120+
colours = rev(rainbow(100, start = 0, end = .7)),
121+
limits = c(0, NA)
122+
)
123+
```
124+
125+
Using `leaflet` to generate a map was a bit more laborious.
126+
Using the `data.frame dens_wgs`directly failed.
127+
So we converted the `data.frame` to a `SpatialPolygonsDataframe`, which is a combination of a `SpatialPolygons` and a `data.frame`.
128+
The `SpatialPolygons` consists of a list of `Polygons`, one for each row of the `data.frame`.
129+
A `Polygons` object consists of a list of one or more `Polygon` objects.
130+
In this example it is a single polygon which represents the grid cell.
131+
132+
```{r convert-to-Spatial-Polygons}
133+
dens_sp <- lapply(
134+
unique(dens_wgs$id),
135+
function(i){
136+
filter(dens_wgs, id == i) %>%
137+
select(lon, lat) %>%
138+
Polygon() %>%
139+
list() %>%
140+
Polygons(ID = i)
141+
}
142+
) %>%
143+
SpatialPolygons() %>%
144+
SpatialPolygonsDataFrame(
145+
data = dens_wgs %>%
146+
distinct(id, density),
147+
match.ID = FALSE
148+
)
149+
```
150+
151+
`leaflet` requires a predefined function with a colour palette.
152+
We use `leaflet::colorNumeric()` to get a continuous palette.
153+
Setting `stroke = FALSE` removes the borders of the polygon.
154+
`fillOpacity` sets the transparency of the polygons.
155+
156+
```{r leaflet, fig.cap = "Dynamic map of density"}
157+
library(leaflet)
158+
pal <- colorNumeric(
159+
palette = rev(rainbow(100, start = 0, end = .7)),
160+
domain = c(0, dens_sp$density)
161+
)
162+
leaflet(dens_sp) %>%
163+
addTiles() %>%
164+
addPolygons(color = ~pal(density), stroke = FALSE, fillOpacity = 0.5) %>%
165+
addLegend(pal = pal, values = ~density)
166+
```
167+

0 commit comments

Comments
 (0)