Managing and visualizing movement data with PostGIS and R

Mathieu Basille edited this page Mar 23, 2016 · 22 revisions

Background

Recent technological progress has allowed ecologists to collect a large number and diversity of animal movement data sets of increasing size and spatial and temporal resolution, together with complex associated information on the environmental context, such as habitat types based on remote sensing, population density, and weather. The most advanced movement data management now relies on an integrated database system based on PostGIS, an extension of the open-source database management system PostgreSQL that adds support for spatial data.

Storing spatial objects in a PostGIS-enabled database is particularly useful for movement data (usually from wildlife collars/sensors), which are often very large, regularly updated, and require cleaning and manipulation prior to being used in research. At the other end of the process, the development of a theoretical framework for movement ecology has led to an unprecedented growth of new analytical tools and methods, mostly available in the R statistical environment.

This project focuses on streamlining the workflow for biologists storing/processing movement data in PostGIS and analyzing it in R, and aims at providing the tools to transparently benefit from the power of the most advanced database and statistical systems available for movement data.

Related work

Four other packages are worth mentioning here:

  1. rgdal: rgdal provides bindings to the Geospatial Data Abstraction Library (GDAL), which allows R to import and export spatial data in the form of points, lines, polygons or rasters. While rgdal (and underlying GDAL) can really be considered as the Swiss Army knife of handling GIS data, its scope is very general and focuses on standard GIS spatial classes. Using it in a specialized context, with specific data such as movements, is fairly cumbersome and tedious, if not simply impossible. Note that rgdal provides limited communication capability to PostGIS.
  2. RPostgreSQL: RPostgreSQL is a database interface and PostgreSQL driver for R. In other words, RPostgreSQL allows for bidirectional communication between R and PostgreSQL, but does not provide PostGIS features to handle spatial data.
  3. rpostgis: rpostgis provides additional functions to RPostgreSQL to enable importing and exporting spatial data between PostGIS and R. The aim of this package is however generic, and it does not provide features to handle movement data.
  4. adehabitatLT: adehabitatLT is a collection of tools for the analysis of animal movements. In particular, it builds on a dedicated class for animal movement data (ltraj), which abstracts movement into a set of trajectories and their geometrical descriptors. However, adehabitatLT does not provide automated tools to import trajectories.

Details of your coding project

The core objective of this project is to create a new R package (rpostgisLT) which will streamline the processing of location datasets into trajectories, including full integration with the ltraj data type of the R package adehabitatLT. The main end product is thus the publication of rpostgisLT on CRAN, which involves the development of the functions, documentation and examples necessary for a full-fledged R package.

The basic workflow will involve:

  1. Definition of a new PostgreSQL data structure (schema/views/tables) to store a new PostGIS data type pgltraj (the PostGIS database version of the ltraj);
  2. Creation of a visualization tool for pgltraj objects using Shiny/leaflet/etc. to interactively process in-database pgltraj objects;
  3. Writing functions for seamless transitioning between in-database pgltraj and in-R ltraj (and vice versa), which will allow pgltraj objects to access the full functionality of the adehabitat suite, and ltraj objects to be consistently stored in PostGIS.

This package (rpostgisLT) will thus require R functions that perform a one-time “installation” on a PostgreSQL database from R, setting up the new PostgreSQL data structure for storing pgltraj objects and their ancillary information (either from in-database data or from an existing ltraj). This will involve significant SQL and PL/pgSQL programming in addition to R, and can take advantage of the PostGIS geography data type as the standard for pgltraj.

Incidentally, a first step of this project will involve consolidating the existing rpostgis package, by extending and improving the functions that import spatial datasets from PostGIS into R (as sp objects; these functions already exist but will be extended) and export sp objects back to the database. The package rpostgis will also be published on CRAN.

Expected impact

The R community will benefit from this project in two ways:

  1. The rpostgisLT package will provide a unique opportunity to unleash the combined power of PostGIS and R in the study of animal movement, one of the most dynamic fields in ecology.
  2. The additional development of rpostgis will also provide generic tools to allow bidirectional communication between R and PostGIS for all kinds of spatial data (points, lines, polygons and rasters), hence with a much broader focus.

Mentors

The student will be mentored by three experts:

  • Mathieu Basille (basille@ufl.edu) is an Assistant Professor at the University of Florida, whose main research program deals with animal movement and distribution. As a quantitative ecologist, he brings extensive knowledge of R, in particular in the context of movement ecology.
  • David Bucklin (dbucklin@ufl.edu) is a geographer specialized in spatial technologies (GIS, remote sensing, and GPS), with an emphasis on their application in conservation biology and ecology; he is an expert in spatial databases and spatial data management, and develops tools and techniques for geo-processing workflows.
  • Clement Calenge (clement.calenge@oncfs.gouv.fr) is a biometrician at the intersection of three scientific domains: biology, statistics and computer science. He developed the adehabitat suite to provide adequate mathematical models and statistical methods to analyse biological data structures, such as animal location or movement data.

Note that all three mentors are generally well versed in all aspects involved in the project (PostGIS, R, movement data).

Tests

We are looking for a motivated student who shows fluency in SQL and strong familiarity with R. In terms of advanced skills, the emphasis is on SQL, and particularly on indexes (for spatial and temporal data). Familiarity with R is also required, but the skills necessary for the completion of the project (such as building an R package) can be learned during the project.

We propose the following test:

  • From R, take a SpatialPointsDataFrame with a time column, and export it to PostGIS;
  • Build the necessary spatial and temporal indexes;
  • Write a SQL function that selects the points within a spatio-temporal window (i.e. given X and Y boundaries and time limits).

Here is a starting point that can be used as an example:

library("spacetime")
library("sp")

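## Load the example data set shipped with spacetime
data(fires)
## Rescale the coordinates
fires$X <- fires$X * 100000
fires$Y <- fires$Y * 100000
## Convert the day count to a date-time (day 1 is 1960-01-01)
fires$Time <- as.POSIXct(as.Date("1960-01-01")+(fires$Time-1))

## Promote the data frame to a SpatialPointsDataFrame and set its projection
coordinates(fires) <- c("X", "Y")
proj4string(fires) <- CRS("+init=epsg:2229 +ellps=GRS80")
plot(fires, pch = 3)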

Here is the expected outcome for points with X between 6400000 and 6500000, Y between 1950000 and 2050000, and dated during the 1990s:

(subfires <- subset(fires, coordinates(fires)[, 1] >= 6400000 
    & coordinates(fires)[, 1] <= 6500000 
    & coordinates(fires)[, 2] >= 1950000 
    & coordinates(fires)[, 2] <= 2050000 
    & fires$Time >= as.POSIXct("1990-01-01") 
    & fires$Time < as.POSIXct("2000-01-01")))

rect(6400000, 1950000, 6500000, 2050000, border = "red", lwd = 2)
points(subfires, col = "red")

The SQL function should be able to retrieve the points shown in red, making good use of indexes.
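As a rough illustration of the kind of SQL involved, the following sketch builds the index and window statements as strings from R, without connecting to a database. It is not a solution to the test, which asks for an in-database SQL function. The table name fires and the column names geom and acq_time are hypothetical, as are the three helper functions; the SRID 2229 is taken from the example above. It assumes the points are stored in a PostGIS geometry column, so that the bounding-box operator && combined with ST_MakeEnvelope can use a GiST index.

```r
## Sketch only: table and column names ("fires", "geom", "acq_time") and
## these helper functions are hypothetical, not part of any package.

## Statement creating a GiST index on the geometry column
spatial_index_sql <- function(tbl, geom_col = "geom") {
    sprintf("CREATE INDEX %s_%s_idx ON %s USING GIST (%s);",
            tbl, geom_col, tbl, geom_col)
}

## Statement creating a default (B-tree) index on the timestamp column
temporal_index_sql <- function(tbl, time_col = "acq_time") {
    sprintf("CREATE INDEX %s_%s_idx ON %s (%s);",
            tbl, time_col, tbl, time_col)
}

## Query selecting the points inside a spatio-temporal window:
## inclusive X/Y bounds, start time included, end time excluded
window_query_sql <- function(tbl, xmin, xmax, ymin, ymax, tmin, tmax,
                             srid = 2229, geom_col = "geom",
                             time_col = "acq_time") {
    sprintf(paste(
        "SELECT * FROM %s",
        "WHERE %s && ST_MakeEnvelope(%.0f, %.0f, %.0f, %.0f, %d)",
        "AND %s >= '%s' AND %s < '%s';"),
        tbl, geom_col, xmin, ymin, xmax, ymax, as.integer(srid),
        time_col, tmin, time_col, tmax)
}

## Statements matching the window used in the R example
cat(spatial_index_sql("fires"), "\n")
cat(temporal_index_sql("fires"), "\n")
cat(window_query_sql("fires", 6400000, 6500000, 1950000, 2050000,
                     "1990-01-01", "2000-01-01"), "\n")
```

Pasting values into a query string is only acceptable for a fixed illustration like this one; with user-supplied input, the values should be passed as bound parameters through the database driver.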

Solutions of tests

Students, please post a link to your test results here.

Balázs Dukai

https://github.com/balazsdukai/rpostgisLT_test/blob/master/test.R

Shivam Rana

https://github.com/TrigonaMinima/rpostgisLT-tests

Nistara Randhawa

https://github.com/nistara/gsoc2016_movement/blob/master/gsoc_nistara.R

Yujiao Li

https://github.com/liyujiao1026/gsoc2016/blob/master/Yujiao_gsoc.R
