PickShift

Cite this software

If you use PickShift for academic research or as part of any scientific publication, we ask you to cite it. Below is a sample citation:

Timothée Jautzy, Pierrick Freys, Valentin Chardon, Romain Wenger, Gilles Rixhon, Laurent Schmitt, Pierre-Alexis Herrault, PickShift: A user-friendly Python tool to assess the surficial uncertainties associated with polygons extracted from historical planimetric data, SoftwareX, Volume 27, 2024, 101866, ISSN 2352-7110, https://doi.org/10.1016/j.softx.2024.101866.

@article{JAUTZY2024101866,
title = {PickShift: A user-friendly Python tool to assess the surficial uncertainties associated with polygons extracted from historical planimetric data},
journal = {SoftwareX},
volume = {27},
pages = {101866},
year = {2024},
issn = {2352-7110},
doi = {https://doi.org/10.1016/j.softx.2024.101866},
url = {https://www.sciencedirect.com/science/article/pii/S235271102400236X},
author = {Timothée Jautzy and Pierrick Freys and Valentin Chardon and Romain Wenger and Gilles Rixhon and Laurent Schmitt and Pierre-Alexis Herrault}
}

Overview

PickShift performs geospatial analysis using Monte Carlo simulations to (1) compute the spatially variable error (SVE) affecting any historical planimetric data and (2) quantify the surficial uncertainty associated with digitized features. The script relies on Python libraries such as pandas, geopandas, and gdal for data manipulation, transformation, and spatial analysis.
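
To give an intuition for step (2), here is a minimal sketch of the Monte Carlo idea using only the Python standard library. It is not PickShift's implementation: the functions `shoelace_area` and `simulate_areas` are hypothetical, and a single Gaussian displacement `sigma` stands in, as an assumption, for the error that PickShift derives from the SVE and the digitization error.

```python
import random
import statistics

def shoelace_area(vertices):
    """Planar area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def simulate_areas(vertices, sigma, runs, seed=0):
    """Shift every vertex by Gaussian noise (standard deviation sigma, in
    meters) and return the area of each simulated polygon."""
    rng = random.Random(seed)
    areas = []
    for _ in range(runs):
        moved = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
                 for x, y in vertices]
        areas.append(shoelace_area(moved))
    return areas

# Illustrative 100 m x 100 m square; the 2 m error is an assumed value.
square = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
areas = simulate_areas(square, sigma=2.0, runs=1000)
mean_area = statistics.mean(areas)
std_area = statistics.stdev(areas)
print(f"mean = {mean_area:.1f} m2, std = {std_area:.1f} m2")
```

The spread of the simulated areas is what a surficial uncertainty summarizes: the larger the positional error relative to the polygon size, the wider the distribution.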

PickShift consists of three files:

  • environment_{linux|windows}.yml
    Lets the operator automatically build an Anaconda environment suited to their operating system.
  • config.conf
    Contains the locations of the required inputs and a few parameters that the operator can modify.
  • pickshift.py
    The main source code, to be run once the Anaconda environment has been created and the configuration file filled in.

Video tutorial

This video explains how to use PickShift. We recommend watching it. Watch the video

Test data

Here is a link to download the data presented in our article: Download test data

Installation

We recommend using Anaconda to run PickShift, as it lets you automatically build an environment suited to your operating system. Please refer to the official installation instructions for your operating system: https://docs.anaconda.com/free/anaconda/install/index.html

Once Anaconda is installed and the Anaconda environment has been created, you can run PickShift from the terminal (Linux/macOS) or the Anaconda Prompt (Windows).

Create PickShift environment

Set up the PickShift environment with the provided environment_{linux|windows}.yml file, using the one that corresponds to your operating system. Run the following command to create the PickShift environment, replacing {linux|windows} with linux or windows:

conda env create -f environment_{linux|windows}.yml

Run the following command to activate the PickShift environment:

conda activate envpickshift

Run PickShift

Once the PickShift environment is created and activated, you can run PickShift using the following command:

python pickshift.py -c config.conf

NB: The required inputs must have been prepared and the configuration file (config.conf) parameterized before running PickShift. See sections below.

Inputs required

PickShift requires three inputs that have to be prepared by the operator using any GIS software (e.g. QGIS, ArcGIS):

  • a vector layer of the spatial extent of the studied area (extent, GeoPackage format);
  • a vector layer of the features digitized from the historical planimetric data (polygons, GeoPackage format);
  • a set of independent ground control points (GCP, txt format).

The input files must share the same projected coordinate system.

| File name | File format | Description/Recommendations |
|---|---|---|
| GCP | .txt | Contains the coordinates (XY) of each pair of Ground Control Points, picked from the targeted and the reference planimetric data. The coordinates can be in any projected coordinate system, in meters. The file must be tab-separated and contain only the following fields: ‘Xref’, ‘Yref’, ‘Xinit’, ‘Yinit’. |
| polygons | .gpkg | Corresponds to the features of interest digitized from the targeted planimetric data. The file must include the following two fields: ‘id’ (the automatic QGIS ‘fid’ field is not enough) and ‘type’. |
| extent | .gpkg | Corresponds to the spatial extent of the studied area, where the spatially variable error is interpolated. We suggest delineating it slightly larger than the area covered by the polygons. |
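
As an illustration of the GCP file layout described above, the following sketch checks a tab-separated file before it is handed to PickShift. The helper `read_gcp` is hypothetical and not part of PickShift; it assumes the file starts with a header row naming the four fields, and the coordinates in `sample` are made up.

```python
import csv
import io

REQUIRED_FIELDS = {"Xref", "Yref", "Xinit", "Yinit"}

def read_gcp(stream):
    """Parse a tab-separated GCP table and return one dict of floats per
    GCP pair. Raises ValueError if the header is not exactly the four
    expected fields."""
    reader = csv.DictReader(stream, delimiter="\t")
    found = set(reader.fieldnames or [])
    if found != REQUIRED_FIELDS:
        raise ValueError(f"unexpected fields: {sorted(found)}")
    return [{key: float(row[key]) for key in REQUIRED_FIELDS} for row in reader]

# Made-up coordinates, in meters, for two GCP pairs.
sample = (
    "Xref\tYref\tXinit\tYinit\n"
    "1000.0\t2000.0\t1002.5\t1998.0\n"
    "1500.0\t2500.0\t1499.0\t2503.0\n"
)
gcps = read_gcp(io.StringIO(sample))
print(len(gcps), "GCP pairs read")
```

To check a real file, pass `open("GCP.txt", newline="")` instead of the `io.StringIO` object, where `GCP.txt` is a placeholder for your own file name.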

Configuration File

The script requires a configuration file (config.conf) for its parameters.

The following table describes how to fill it.

| Parameter name | Type | Units | Description | Comments/Recommendations |
|---|---|---|---|---|
| GCP | character | / | Location of the GCP file. | / |
| extent | character | / | Location of the extent file. | / |
| polygons | character | / | Location of the polygons file. | / |
| crs | integer | / | Projected coordinate system of the input files. | EPSG format. Must be the same for all input files. |
| resol_x | integer | meters | X resolution of the spatially variable error (SVE) raster. | Use the same resolution as the targeted data. |
| resol_y | integer | meters | Y resolution of the SVE raster. | Use the same resolution as the targeted data. |
| buffer | integer | meters | Radius of the buffer used to extract the SVE around each vertex. | Ten times the resolution of the targeted data. |
| runs | integer | / | Number of Monte Carlo simulations. | We recommend 1000 simulations, but suggest testing with 10 first. |
| digit_error | integer | meters | Digitization error. | At least the resolution of the targeted data. |
| douglas_peucker | boolean | / | ‘True’: simplify the polygons after reconstruction. ‘False’: no simplification. | Use only if the Monte Carlo simulations induce topological errors. |
| tolerance | integer | meters | Tolerance value used to simplify the polygons. For more details, refer to the geopandas documentation. | We recommend visually inspecting the results and adjusting the value if necessary. |
| outputf | character | / | Name of the output folder. | It is created if it doesn’t exist. |
| outputGCPb | boolean | / | Exports the punctual biases. ‘True’ or ‘False’. | / |
| outputCSV_point_sim | boolean | / | Exports the translated vertices. ‘True’ or ‘False’. | This file may be large. |
| outputCSV_poly_sim | boolean | / | Exports the reconstructed polygons. ‘True’ or ‘False’. | This file may be large. |
| output_SVE | boolean | / | Exports the interpolated SVE rasters. ‘True’ or ‘False’. | These files may be large. |
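
The recommendations in the table can be turned into a quick sanity check before a long run. The sketch below is only an illustration: `check_params` is a hypothetical helper, not part of PickShift, it does not read config.conf itself, and the values in `params` are made up for a targeted dataset assumed to have a 5 m resolution.

```python
def check_params(params, data_resolution):
    """Compare a few numeric parameters with the recommendations of the
    table and return a list of human-readable notes (empty if all match)."""
    notes = []
    for key in ("resol_x", "resol_y"):
        if params[key] != data_resolution:
            notes.append(f"{key} differs from the data resolution ({data_resolution} m)")
    if params["buffer"] != 10 * data_resolution:
        notes.append(f"buffer is not ten times the data resolution ({10 * data_resolution} m)")
    if params["digit_error"] < data_resolution:
        notes.append("digit_error is smaller than the data resolution")
    if params["runs"] < 1000:
        notes.append("runs is below 1000: fine for a first test, low for final results")
    return notes

# Made-up values for a first test run on 5 m data.
params = {"resol_x": 5, "resol_y": 5, "buffer": 50, "runs": 10, "digit_error": 5}
notes = check_params(params, data_resolution=5)
for note in notes:
    print(note)
```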

Outputs description

The following table lists the different output files that PickShift exports.

| File name | File format | Description |
|---|---|---|
| GCP_bias | .gpkg | Corresponds to the GCPs picked from the targeted planimetric data, to which the punctual planimetric bias values are joined. |
| SVE_XY | .tif | Spatially variable error in XY, i.e. spatial interpolation (IDW) of the punctual planimetric biases. |
| SVE_X | .tif | Spatially variable error in X, i.e. spatial interpolation (IDW) of the punctual planimetric biases. |
| SVE_Y | .tif | Spatially variable error in Y, i.e. spatial interpolation (IDW) of the punctual planimetric biases. |
| Buffer | .gpkg | Layer of the buffers from which the SVE values are extracted around each polygon vertex. |
| point_sim_MC | .csv | Translated vertices resulting from the Monte Carlo simulations. |
| poly_sim_MC | .csv | Reconstructed polygons from the translated vertices. |
| poly_MC | .gpkg | Aggregated results. Contains the initial polygon geometries, associated with the following statistics: mean surface, standard deviation, minimum, percentiles, maximum, total uncertainty (%), confidence interval, 95% uncertainty (%). |
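
To make the relation between the punctual biases and the SVE rasters more concrete, here is a minimal pure-Python sketch of inverse distance weighting (IDW). It rests on assumptions that the text above does not confirm: the bias is taken as the difference between the ‘init’ and ‘ref’ coordinates of each GCP pair, the weighting power is 2, and the functions `punctual_biases` and `idw` are hypothetical names, not PickShift's own.

```python
import math

def punctual_biases(gcps):
    """gcps: list of (xref, yref, xinit, yinit) tuples, in meters.
    Returns (x, y, bias_x, bias_y, bias_xy) located at the 'init' point.
    Assumed definition: bias = init coordinate minus ref coordinate."""
    out = []
    for xref, yref, xinit, yinit in gcps:
        bx, by = xinit - xref, yinit - yref
        out.append((xinit, yinit, bx, by, math.hypot(bx, by)))
    return out

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted value at (x, y) from [(xs, ys, value)]."""
    num = den = 0.0
    for xs, ys, value in samples:
        d = math.hypot(x - xs, y - ys)
        if d == 0.0:
            return value  # exactly on a sample: return its value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Two made-up GCP pairs: the first is shifted by (3, 4) m, the second is exact.
gcps = [(0.0, 0.0, 3.0, 4.0), (100.0, 0.0, 100.0, 0.0)]
biases = punctual_biases(gcps)
samples_xy = [(x, y, bxy) for x, y, _, _, bxy in biases]
print(idw(50.0, 0.0, samples_xy))
```

Evaluating `idw` at the center of every raster cell inside the extent would yield a grid comparable in spirit to SVE_XY; using the X or Y bias as the sample value would correspond to SVE_X and SVE_Y.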
