Generate watercourse_100mseg.gpkg #44

florisvdh · 2020-11-27T13:48:31Z

Initial code added.

I plan to add/do:

code necessary to uniquely identify corresponding segments and endpoints with a common, unique code
~~steps to write / read intermediate output from GRASS as gpkg on GDrive~~ (unneeded since data source has been entirely prepared in GRASS)
~~first steps to attach GRTSmaster_habitats address to points~~ (chosen to leave this out of this processed data source since it is specific to the handling of the 3260 sampling frame, it would not be the generally recommendable practice to make unique (c.q. spatially balanced) IDs for these segments and points)
~~code to further clean variable names and simplify the data in some standardized way~~ (unneeded since data source has been entirely prepared in GRASS)
code to write the result to disk (maybe as one GPKG)
first checks on the result
better reference to watercourses (consolidate raw data source) - see https://doi.org/10.5281/zenodo.4420905
provide a function read_watercourse_100mseg() in n2khab
consolidate the processed data source on Zenodo - see https://doi.org/10.5281/zenodo.4452578

@ToonHub I'll let you know when the processed data source is ready.

First impression of segments + points (from GRASS gui):

Screenshot


florisvdh · 2020-11-30T16:11:56Z

Compiled bookdown: generating_watercourse_100mseg.html.zip

@ToonHub I generated a first version of this processed data source; can you have a look at it? It is derived from the raw data source watercourse_segments (version watercourse_segments_20180601), which still has to be referenced online (ideally on Zenodo); it currently sits below this GDrive link. It represents the 'VHAS' subdataset ('waterloopsegmenten') and is identical to the 'Wlas' shapefile in the VHA_201806 GDrive folder that you provided, except for the filename standardization (and the WOR file, I think).

The dataset has two layers (100m segments and endpoints). The two attribute variables of both layers are explained in the text.

I propose not to add GRTSmaster_habitats addresses inside this data source. It would unnecessarily inflate the data source, while it is only done in the context of the 3260 sampling frame. Another reason is that we best keep processed data sources 'multi-purpose' and therefore more generic; in this case the GRTSmaster_habitats approach would not lead to unique IDs without further tricks, and spatially balanced addresses for lines can also be assigned in other ways (methods for lines exist).

ToonHub · 2020-12-02T08:46:47Z

I checked watercourse_100mseg.gpkg, both the points and lines. It looks fine to me, but there is one issue I am not sure about.
Currently, the selection of 100 m segments starts from the most upstream parts, and a segment stops where two watercourses come together. The end points of the created segments will be the starting points of the actual sampling units.
This sometimes results in very short segments where two watercourses come together. See example below.
Wouldn't it be better to start selecting segments from the most downstream parts and select the starting points?

Another thing.
What behaviour do we want when two watercourses come together? Do we want two new segments to start (in the upstream direction)? That is the way it is now.
Or do we want the segment in the 'main' watercourse (for example a 1st order watercourse) to continue, and a new segment to start for the 2nd order watercourse that flows into the main watercourse?
See example below. It shows a 1st order watercourse and two 2nd order watercourses. At one of the branches a very short segment starts.

florisvdh · 2020-12-02T18:35:44Z

A roundup of new plans (after discussion):

consider starting from the watercourses dataset (Vhag.shp; pink below) instead of watercourse_segments (blue below). This will result in fewer segments shorter than 100 m.
define 100 m segments in the other direction, i.e. from linestring end towards linestring beginning. This will place the shorter segments at the upstream ('source') side. This can be done by first flipping the linestrings (v.edit tool=flip in GRASS) before running v.split.
as we want to retain the same relationship between points and their corresponding (upstream) segment, startpoints have to be created instead of endpoints.
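To make the effect of the plan above concrete, here is a minimal pure-Python sketch. It is not the GRASS workflow itself (which would use `v.edit tool=flip` and `v.split`, as mentioned above); the function names `flip`, `split_line` and `length` are hypothetical, and the example line is assumed to be digitized from source to mouth.

```python
from math import hypot


def length(line):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(line, line[1:]))


def flip(line):
    """Reverse the vertex order, so the line runs in the opposite direction."""
    return line[::-1]


def split_line(line, seglen=100.0):
    """Cut a polyline into consecutive pieces of at most `seglen`,
    starting at its first vertex; any remainder ends up in the last piece."""
    segments = []
    current = [line[0]]
    remaining = seglen  # length still available in the current piece
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        edge = hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already consumed along this edge
        while edge - pos > remaining:
            pos += remaining
            t = pos / edge
            cut = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            current.append(cut)
            segments.append(current)
            current = [cut]
            remaining = seglen
        remaining -= edge - pos
        current.append((x1, y1))
    if len(current) > 1:
        segments.append(current)
    return segments


# Hypothetical 250 m watercourse: source at x = 0, mouth at x = 250.
watercourse = [(0.0, 0.0), (250.0, 0.0)]

# Flip first, then split: pieces are measured from the mouth upwards,
# so the 50 m remainder is the last piece, at the source side.
segments = split_line(flip(watercourse))
startpoints = [seg[0] for seg in segments]  # downstream end of each piece

for seg, pt in zip(segments, startpoints):
    print(round(length(seg), 1), pt)
```

Because the flipped line runs from mouth to source, the startpoint of each piece is its downstream end, which keeps the point at the downstream side of its corresponding segment, as intended in the plan.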

Notes: it appears that:

watercourses and watercourse_segments coincide with respect to line direction. However, there are clear differences as well, as seen below. Each has unique features that the other lacks, but watercourses has the most features. So their coverage differs, and that of watercourse_segments seems much worse.
many shorter watercourse segments still exist in watercourses - the total number of lines is still 20894. This is partly due to networked patterns, i.e. another watercourse that connects two locations of a watercourse, but also, it seems, to unfinished cases where merges would seem possible in one way or another (in order to achieve a longer 'main' watercourse).
- a consequence is that there will be a gain by using watercourses, but it will still result in a lot of <100m segments (20894)
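One way to read the figure in parentheses: when every line is split independently, each line whose length is not an exact multiple of 100 m leaves one remainder shorter than 100 m, so the number of short segments is bounded by the number of lines. A small sketch of that arithmetic, using a hypothetical function name and made-up line lengths:

```python
def segment_counts(line_lengths, seglen=100.0):
    """Return (number of full-length segments, number of shorter remainders)
    when each line is split independently into pieces of `seglen`."""
    full = sum(int(l // seglen) for l in line_lengths)
    short = sum(1 for l in line_lengths if l % seglen > 0)
    return full, short


# Made-up line lengths in metres, for illustration only.
lengths = [250.0, 80.0, 300.0, 1234.5]
full, short = segment_counts(lengths)
print(full, short)  # at most one short segment per line
```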

Screenshot 1: watercourses + watercourse_segments 2018

I also compared this version (1 Jun 2018) with the current version (7 Aug 2020):

the watercourses (26651 instead of 20894 lines) and watercourse_segments (61783 instead of 49465 lines) layers have been extended considerably. It also seems that the above-mentioned mismatches have been resolved in watercourse_segments.

Screenshot 2: watercourses + watercourse_segments 2020

My current conclusion is to proceed with watercourses (version 7 Aug 2020).


florisvdh · 2020-12-17T16:25:40Z

@ToonHub I updated the processed data source. Can you have a look?

Compiled bookdown: generating_watercourse_100mseg.html.zip

It is now derived from the raw data source watercourses (version watercourses_20200807), which is ~~still to be referred online (ideally Zenodo)~~ at https://doi.org/10.5281/zenodo.4420905 and which ~~currently~~ also sits below this GDrive link. It represents the 'VHAG' subdataset ('waterlopen') of the VHA of 7 Aug 2020 at Geopunt.

For the rest previous comments still apply:

The dataset has two layers (100m segments and endpoints). The two attribute variables of both layers are explained in the text.

I propose not to add GRTSmaster_habitats addresses inside this data source. It would unnecessarily inflate the data source, while it is only done in the context of the 3260 sampling frame. Another reason is that we best keep processed data sources 'multi-purpose' and therefore more generic; in this case the GRTSmaster_habitats approach would not lead to unique IDs without further tricks, and spatially balanced addresses for lines can also be assigned in other ways (methods for lines exist).

Screenshot

ToonHub

I checked both layers in watercourse_100mseg. Looks fine!

florisvdh changed the title ~~Generate watercourse_segments & watercourse_segmentpoints~~ Generate watercourse_100msegments & watercourse_100msegmentpoints Nov 27, 2020

florisvdh changed the title ~~Generate watercourse_100msegments & watercourse_100msegmentpoints~~ Generate watercourse_100mseg & watercourse_100msegpoints Nov 27, 2020

Generate watercourse_100mseg & watercourse_100msegpoints: initial code

d841380

florisvdh force-pushed the watercourse_segments branch from 1909d9c to d841380 Compare November 27, 2020 14:46

florisvdh added 6 commits November 27, 2020 16:24

Generate watercourse_100m...: minor updates & fixes

a6345ea

Generate watercourse_100m...: elaborate GRASS geoprocessing

9f4ad02

Generate watercourse_100m...: improve GRASS setup

40ab1e5

Generate watercourse_100m...: minor additions

3cc7615

Generate watercourse_100m...: add more narrative

5cc5343

Generate watercourse_100m...: add extra paragraph titles

afcb7ae

florisvdh force-pushed the watercourse_segments branch from 310ef6f to afcb7ae Compare November 27, 2020 18:57

florisvdh added 11 commits November 27, 2020 20:13

Generate watercourse_100m...: rename variable to 'grassdbase_exists'

2b525f3

Generate watercourse_100m...: textual fixes

5fc7da0

Generate watercourse_100m...: better attr table preparation directly within GRASS

59e43b9

Generate watercourse_100m...: export both layers into the final GPKG (from GRASS)

9f23b77

Generate watercourse_100m...: add some checks on the data source

288a637

Generate watercourse_100m...: add parameter grass_reexport

ea7281b

Generate watercourse_100m...: drop mapview, add stringr

7105062

Generate watercourse_100m...: compute checksums of data source

6c133bb

Generate watercourse_100m...: use renv

30faabf

Generate watercourse_100m...: more explanation

19b565c

Generate watercourse_100m...: section about attribute variables

f542d7c

florisvdh marked this pull request as ready for review November 30, 2020 16:12

florisvdh requested a review from ToonHub November 30, 2020 16:12

florisvdh changed the title ~~Generate watercourse_100mseg & watercourse_100msegpoints~~ Generate watercourse_100mseg.gpkg Nov 30, 2020

florisvdh mentioned this pull request Dec 16, 2020

Segmentize grass: add proof of concept v.to.point #43

Merged

Generate watercourse_100m...: uses watercourses, not watercourse_segments

749e3ef

florisvdh added 11 commits December 16, 2020 16:15

Generate watercourse_100m...: take watercourses from n2khab_data

4044c83

Generate watercourse_100m...: improved GRASS setup

fd75d91

Generate watercourse_100m...: enforce EPSG:31370 in GRASS location

5e18861

Generate watercourse_100m...: use execshell function

3bd1bfe

Generate watercourse_100m...: use watercourses_20200807

5e06b69

Generate watercourse_100m...: fix in code for v.out.ogr

71722b7

Generate watercourse_100m...: minor text fixes

5721395

Generate watercourse_100m...: updates in the execution

b7cd3a3

Main changes:
- start with version watercourses_20200807
- make a backup of it
- flip its direction before splitting into segments
- because of the flip, create startpoints instead of endpoints

Generate watercourse_100m...: read_sf: take EPSG:31370 from GPKG

a97d9cc

Generate watercourse_100m...: adjust cartographic extent

9df7895

Generate watercourse_100m...: improve PROJ in sessioninfo report

46c742b

Generate watercourse_100m...: add renv::restore()

dc4d4d4

ToonHub approved these changes Dec 21, 2020


florisvdh merged commit 6b1d8f7 into master Jan 4, 2021

florisvdh deleted the watercourse_segments branch January 4, 2021 18:08

florisvdh mentioned this pull request Jan 20, 2021

New function: read_watercourse_100mseg() inbo/n2khab#105

Merged

florisvdh mentioned this pull request Feb 10, 2021

Prepare version 0.4.0 inbo/n2khab#111

Merged
