|
| 1 | +--- |
| 2 | +title: Estimating densities from a point pattern |
| 3 | +maintainer: Thierry Onkelinx |
| 4 | +date: '2017-06-28' |
| 5 | +output: |
| 6 | + html_document: default |
| 7 | + md_document: |
| 8 | + preserve_yaml: true |
| 9 | + variant: gfm |
| 10 | +params: |
| 11 | + cellsize: 20 |
| 12 | +--- |
| 13 | + |
| 14 | +In this example we focus on a set of 10450 coordinates in a small area. |
| 15 | +The goal is to estimate the local density of points, expressed as the number of points per unit area. |
| 16 | +The raw coordinates are given in [WGS84 (EPSG:4326)](https://epsg.io/4326), which is a geodetic coordinate reference system. |
| 17 | +That is not suited for calculating distances, so we need to re-project the points into a local projected coordinate reference system. |
| 18 | +In this case we use [Belgian Lambert72 (EPSG:3170)](https://epsg.io/31370). |
| 19 | +Next we calculate the density. |
| 20 | +To visualise the density, we have to transform the results back to WGS84. |
| 21 | + |
| 22 | +The data used in this example is real data but centred to a different location for privacy reasons. The dataset is available on [GitHub](https://github.com/ThierryO/my_blog/tree/master/data/20170628). |
| 23 | + |
| 24 | +First we must read the data into R. Plotting the raw data helps to check errors in the data. |
| 25 | + |
| 26 | +```{r read-data, fig.cap = "Raw data"} |
| 27 | +points <- read.delim("point-pattern/points.txt", sep = " ") |
| 28 | +library(ggmap) |
| 29 | +map <- get_map( |
| 30 | + location = c(lon = mean(points$lon), lat = mean(points$lat)), |
| 31 | + zoom = 17, |
| 32 | + maptype = "satellite", |
| 33 | + source = "google" |
| 34 | +) |
| 35 | +ggmap(map) + |
| 36 | + geom_point(data = points, alpha = 0.1, colour = "blue", shape = 4) |
| 37 | +``` |
| 38 | + |
| 39 | +The next step is to convert the dataset to a `SpatialPoints` object with WGS84 projection and re-project it to Belgian Lambert72. |
| 40 | +`sp::CRS()` defines the coordinate reference systems (CRS). |
| 41 | +`sp::coordinates()<-` is an easy way to convert a `data.frame` into a `SpatialPointsDataFrame`, but without specifying a CRS. |
| 42 | +Therefore we need to override the CRS with the correct one. |
| 43 | + |
| 44 | +`sp` will derive a 'WKT2 string' representation from an EPSG-code [^epsg] that we provide, in order to represent the CRS. |
| 45 | +The WKT2 string (well known text) is a recent open standard by the Open Geospatial Consortium to represent a CRS, and it replaces the older (deprecated) 'PROJ.4 string'. |
| 46 | +Currently, the function to set the CRS is still called `proj4string()`. |
| 47 | +`sp::spTransform()` converts the spatial object from the current CRS to another CRS. |
| 48 | + |
| 49 | +[^epsg]: Most coordinate reference systems have an [EPSG](https://en.wikipedia.org/wiki/International_Association_of_Oil_%26_Gas_Producers#European_Petroleum_Survey_Group) code which you can find at http://epsg.io/. |
| 50 | + |
| 51 | +```{r crs-objects} |
| 52 | +library(sp) |
| 53 | +crs_wgs84 <- CRS(SRS_string = "EPSG:4326") |
| 54 | +crs_lambert <- CRS(SRS_string = "EPSG:31370") |
| 55 | +``` |
| 56 | + |
| 57 | +The warning above once again demonstrates that some PROJ.4 information is not supported anymore. |
| 58 | + |
| 59 | +```{r reproject} |
| 60 | +coordinates(points) <- ~lon + lat |
| 61 | +# proj4string() - still the only available function to set the CRS - may in the |
| 62 | +# future get a more general name: |
| 63 | +proj4string(points) <- crs_wgs84 |
| 64 | +points_lambert <- spTransform(points, crs_lambert) |
| 65 | +``` |
| 66 | + |
| 67 | +Once we have the points into a projected coordinate system, we can calculate the densities. We start by defining a grid. `cellsize` is the dimension of the square grid cell in the units of the projected coordinate system. Meters in case of Lambert72. The boundaries of the grid are defined using `pretty()`, which turns a vector of numbers into a "pretty" vector with rounded numbers. The combination of the boundaries and the cell size determine the number of grid cells `n` in each dimension. `diff()` calculates the difference between to adjacent numbers of a vector. The density is calculated with `MASS::kde2d()` based on the vectors with the longitude and latitude, the number of grid cells in each dimension and the boundaries of the grid. This returns the grid as a list with elements `x` (a vector of longitude coordinates of the centroids), `y` (a vector of latitude coordinates of the centroids) and `z` (a matrix with densities). The values in `z` are densities for the 'average' point per unit area. When we multiply the value `z` with the area of the grid cell and sum all of them we get 1. So if we multiple `z` with the number of points we get the density of the points per unit area. |
| 68 | + |
| 69 | +We use [`dplyr::mutate()`](http://dplyr.tidyverse.org/) to convert it into a `data.frame`. The last two steps convert the centroids into a set of coordinates for square polygons. |
| 70 | + |
| 71 | +```{r density} |
| 72 | +library(MASS) |
| 73 | +library(dplyr) |
| 74 | +xlim <- range(pretty(points_lambert$lon)) + c(-100, 100) |
| 75 | +ylim <- range(pretty(points_lambert$lat)) + c(-100, 100) |
| 76 | +n <- c( |
| 77 | + diff(xlim), |
| 78 | + diff(ylim) |
| 79 | +) / params$cellsize + 1 |
| 80 | +dens <- kde2d( |
| 81 | + x = points_lambert$lon, |
| 82 | + y = points_lambert$lat, |
| 83 | + n = n, |
| 84 | + lims = c(xlim, ylim) |
| 85 | +) |
| 86 | +dx <- diff(dens$x[1:2]) |
| 87 | +dy <- diff(dens$y[1:2]) |
| 88 | +sum(dens$z * dx * dy) |
| 89 | +dens <- expand.grid( |
| 90 | + lon = dens$x, |
| 91 | + lat = dens$y |
| 92 | +) %>% |
| 93 | + mutate( |
| 94 | + density = as.vector(dens$z) * length(points_lambert), |
| 95 | + id = seq_along(density) |
| 96 | + ) %>% |
| 97 | + merge( |
| 98 | + data.frame( |
| 99 | + x = dx * (c(0, 0, 1, 1, 0) - 0.5), |
| 100 | + y = dy * (c(0, 1, 1, 0, 0) - 0.5) |
| 101 | + ) |
| 102 | + ) %>% |
| 103 | + mutate( |
| 104 | + lon = lon + x, |
| 105 | + lat = lat + y |
| 106 | + ) |
| 107 | +``` |
| 108 | + |
| 109 | +In order to visualise the result, we have to re-project the coordinates back to WGS84. Then we can display the raster with a web based background image. |
| 110 | + |
| 111 | +```{r ggmap, fig.cap = "Static image of density"} |
| 112 | +coordinates(dens) <- ~lon + lat |
| 113 | +proj4string(dens) <- crs_lambert |
| 114 | +dens_wgs <- spTransform(dens, crs_wgs84) %>% |
| 115 | + as.data.frame() |
| 116 | +ggmap(map) + |
| 117 | + geom_polygon(data = dens_wgs, aes(group = id, fill = density), alpha = 0.5) + |
| 118 | + scale_fill_gradientn( |
| 119 | + "density\n(#/m²)", |
| 120 | + colours = rev(rainbow(100, start = 0, end = .7)), |
| 121 | + limits = c(0, NA) |
| 122 | + ) |
| 123 | +``` |
| 124 | + |
| 125 | +Using `leaflet` to generate a map was a bit more laborious. |
| 126 | +Using the `data.frame dens_wgs`directly failed. |
| 127 | +So we converted the `data.frame` to a `SpatialPolygonsDataframe`, which is a combination of a `SpatialPolygons` and a `data.frame`. |
| 128 | +The `SpatialPolygons` consists of a list of `Polygons`, one for each row of the `data.frame`. |
| 129 | +A `Polygons` object consists of a list of one or more `Polygon` objects. |
| 130 | +In this example it is a single polygon which represents the grid cell. |
| 131 | + |
| 132 | +```{r convert-to-Spatial-Polygons} |
| 133 | +dens_sp <- lapply( |
| 134 | + unique(dens_wgs$id), |
| 135 | + function(i){ |
| 136 | + filter(dens_wgs, id == i) %>% |
| 137 | + select(lon, lat) %>% |
| 138 | + Polygon() %>% |
| 139 | + list() %>% |
| 140 | + Polygons(ID = i) |
| 141 | + } |
| 142 | +) %>% |
| 143 | + SpatialPolygons() %>% |
| 144 | + SpatialPolygonsDataFrame( |
| 145 | + data = dens_wgs %>% |
| 146 | + distinct(id, density), |
| 147 | + match.ID = FALSE |
| 148 | + ) |
| 149 | +``` |
| 150 | + |
| 151 | +`leaflet` requires a predefined function with a colour palette. |
| 152 | +We use `leaflet::colorNumeric()` to get a continuous palette. |
| 153 | +Setting `stroke = FALSE` removes the borders of the polygon. |
| 154 | +`fillOpacity` sets the transparency of the polygons. |
| 155 | + |
| 156 | +```{r leaflet, fig.cap = "Dynamic map of density"} |
| 157 | +library(leaflet) |
| 158 | +pal <- colorNumeric( |
| 159 | + palette = rev(rainbow(100, start = 0, end = .7)), |
| 160 | + domain = c(0, dens_sp$density) |
| 161 | +) |
| 162 | +leaflet(dens_sp) %>% |
| 163 | + addTiles() %>% |
| 164 | + addPolygons(color = ~pal(density), stroke = FALSE, fillOpacity = 0.5) %>% |
| 165 | + addLegend(pal = pal, values = ~density) |
| 166 | +``` |
| 167 | + |
0 commit comments