Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEOS 3.7.1 to 3.7.2 tightening of validity needed for operations #1121

Closed
rsbivand opened this issue Aug 3, 2019 · 24 comments
Closed

GEOS 3.7.1 to 3.7.2 tightening of validity needed for operations #1121

rsbivand opened this issue Aug 3, 2019 · 24 comments

Comments

@rsbivand
Copy link
Member

rsbivand commented Aug 3, 2019

As rgeos maintainer (https://r-forge.r-project.org/projects/rgeos/), I was asked to look for causes of CRAN check failures for BayesX, birdring and inlmisc immediately following a system upgrade from GEOS 3.7.1 to 3.7.2 (the same problem is present in GEOS 3.8.0dev). I'm posting this issue to provide a reprex, and to document the resolution for rgeos 0.5-1 (rev. 603). The two WKT files (based on the error in inlmisc) are in this zipfile:
WKTs.zip

The initial script for CRAN releases of rgeos and sf is:
GEOS_3.7.2_3.7.1_test.zip

This gives the following output for the CRAN releases for GEOS 3.7.1 and GEOS 3.7.2:
script_output_3.7.1.txt
script_output_3.7.2.txt

As can be seen, GEOS 3.7.2 is stricter on topological operations than 3.7.1 was. This leads to failures which had not previously been seen for invalid geometries. On the hunch that a zero-width buffer might help, rgeos 0.5-1 (rev. 603):

install.packages("rgeos", repos="http://R-Forge.R-project.org")

and a modified script:
GEOS_3.7.2_3.7.1_test_2L.zip

now pass, informing that a geometry was invalid and that a zero-width buffer repair has been attempted; the issues in sf have not been addressed. I do not know whether other topology operations are affected, or whether predicates are affected (they do not seem to be so far).

@rsbivand
Copy link
Member Author

rsbivand commented Aug 4, 2019

Changes to packages using rgeos and affected here should probably use:

exists("get_RGEOS_CheckValidity") && (get_RGEOS_CheckValidity() == 1L) to test for rgeos 0.5-1 and GEOS >= 3.7.2. Then they may either use set_RGEOS_CheckValidity(2L) for global per session zero-width buffer repairs, or checkValidity=2L for per-function call repairs.

@rsbivand
Copy link
Member Author

rsbivand commented Aug 5, 2019

Rather:

if (requireNamespace("rgeos", quietly=TRUE) && packageVersion("rgeos") >= "0.5.1") {
if (rgeos::get_RGEOS_CheckValidity() == 1L) rgeos::set_RGEOS_CheckValidity(2L)
}

@rsbivand
Copy link
Member Author

rsbivand commented Aug 5, 2019

rgeos 0.5-1 submitted to CRAN (10:53 CEST), on CRAN 12:09 CEST - thanks to the CRAN team!

@dbaston
Copy link
Contributor

dbaston commented Aug 5, 2019

Most likely an unintended consequence of https://trac.osgeo.org/geos/ticket/789 or https://trac.osgeo.org/geos/ticket/838. GEOS doesn't make any claims about what happens when its inputs are invalid, so I'm not sure if the project would consider it a regression or not. It's also not clear to me without spending more time understanding the R code which geometries are actually making it into a failing GEOSIntersection call.

@rsbivand
Copy link
Member Author

rsbivand commented Aug 6, 2019

Thanks @dbaston ! I felt that the uncertainty about changes in GEOS, and whether they were intended or not, should be protected against. I agree that it is not easy to know which kinds of invalidity may have been digestable before 3.7.2. I did look at the commits but was no wiser from doing so. I suspect that perhaps before 3.7.2, things may have been too permissive. There are no reports that I can see on the GEOS mailing list of regressions, so I'm viewing this as GEOS tightening, probably to match its upstream JTS, so downstream should adapt - leading to the rgeos release a day ago.

@dbaston
Copy link
Contributor

dbaston commented Aug 6, 2019

FWIW, the two failing inputs are:

MULTIPOLYGON (((1.2 0.5, 2.5 5.1, 5.8 1.7, 1.2 0.5), (2.5 2.5, 3.4 1.8, 3.7 3.1, 2.5 2.5)), ((-0.3 3.3, -1 7, 1.7 5.1, -0.3 3.3)))

and

MULTIPOLYGON (((1 5, 2 5, 2 4, 2 3, 1 3, 1 2, 2 2, 2 3, 3 3, 3 2, 3 1, 2 1, 1 1, 0 1, 0 2, 0 3, 0 4, 0 5, 1 5)), ((5 2, 5 1, 5 0, 4 0, 4 1, 4 2, 5 2)))

This fails on recent JTS as well.

@rsbivand
Copy link
Member Author

rsbivand commented Aug 6, 2019

OK, thanks. So current GEOS is consistent with current JTS; is it worth seeing whether JTS has always found these failing, and GEOS has now come into line, or whether they've mostly been synchronized? Anyway, useful to know that this is unlikely to have been a regression since these inputs also fail on JTS.

@dbaston
Copy link
Contributor

dbaston commented Aug 6, 2019

Again, hard to call it a "regression" since the input is invalid, yet I'm curious what @dr-jts thinks.

@rsbivand
Copy link
Member Author

rsbivand commented Aug 6, 2019

I agree that "regression" is the wrong term, I used tightening, which maybe feels like moving closer to JTS in how operations treat invalidity.

@dr-jts
Copy link

dr-jts commented Aug 7, 2019

Are the invalid inputs to GEOS the result of prior computation in some other algorithm?

The GEOS/JTS requirement for valid geometry as input to operations is quite essential to provide efficiency and algorithmic simplicity. So if an upstream algorithm is producing invalid geometry, there needs to be an intermediate step to fix this. There's a spectrum of geometry invalid situations, which are more or less easy to fix. (For instance, self-touching rings have an obvious geometric interpretation, and are relatively simple to fix. Arbitrary self-intersection is harder to interpret and fix).

@rsbivand
Copy link
Member Author

rsbivand commented Aug 8, 2019

Yes, the affected R packages are generating geometries before getting to the calls to GEOS topological operations that fail in 3.7.2 but not in 3.7.1 or earlier. It feels as though the cut-off point in the spectrum of invalidity has tightened. It isn't a problem, and has been worked around in rgeos by enforcing validity checking where rgeos is built using GEOS >= 3.7.2, and offering zero-width buffering internally as a possible repair step. The difficulty was in not seeing a specific notice in the release notes that this might occur. Once CRAN had noticed the failures and notified me as rgeos maintainer, it was just a matter of bisecting back to a plausible cause, and adding mitigating steps.

By the way, it does show how effective CRAN's continuous testing is, contrasted with CI Travis-style - test on delta. CRAN tests everything against everything daily at the R and R package level, and has systems which update upstream packages (Debian, Fedora) like libcurl, etc., and including GEOS, GDAL, and PROJ. Then the CRAN team push notifications to package maintainers if they can identify the right person (here three failing packages all use rgeos built with the GEOS that had been updated).

For R geospatial packages, we do track PROJ, GDAL and GEOS masters, but do not check other packages using facilities provided by say rgeos or sf. Single-threaded on my desktop, reverse dependency checks for rgeos take over three hours, and then need manual collation. @edzer maybe we need to create something less manual to be up to CRAN speed?

@edzer
Copy link
Member

edzer commented Aug 8, 2019

It seems to be the requirement that the ring direction of the outer ring needs to be counter clockwise (CCW):

library(sf)
# Linking to GEOS 3.7.0, GDAL 2.4.0, PROJ 5.2.0
p = "MULTIPOLYGON (((1 5, 2 5, 2 4, 2 3, 1 3, 1 2, 2 2, 2 3, 3 3, 3 2, 3 1, 2 1, 1 1, 0 1, 0 2, 0 3, 0 4, 0 5, 1 5)), ((5 2, 5 1, 5 0, 4 0, 4 1, 4 2, 5 2)))"
st_as_sfc(p)[[1]]
# MULTIPOLYGON (((1 5, 2 5, 2 4, 2 3, 1 3, 1 2, 2 2, 2 3, 3 3, 3 2, 3 1, 2 1, 1 1, 0 1, 0 2, 0 3, 0 4, 0 5, 1 5)), ((5 2, 5 1, 5 0, 4 0, 4 1, 4 2, 5 2)))
sf:::check_ring_dir(st_as_sfc(p))[[1]]
# MULTIPOLYGON (((1 5, 0 5, 0 4, 0 3, 0 2, 0 1, 1 1, 2 1, 3 1, 3 2, 3 3, 2 3, 2 2, 1 2, 1 3, 2 3, 2 4, 2 5, 1 5)), ((5 2, 4 2, 4 1, 4 0, 5 0, 5 1, 5 2)))

In sf, it should be fairly trivial to do this check (and correct) by default - at a cost of course.

I believe I've looked for this several times in the simple feature access (part 1) standard document, but couldn't find it stated as a requirement, more as an implicit assumption, that outer rings are CCW; first ring being exterior seems sufficient for disambiguation. For rings on the sphere it is more important, as they divide the sphere in two parts rather than having a natural outside, as on the plane (though heuristics could assume the smaller part is inside).

@rsbivand
Copy link
Member Author

rsbivand commented Aug 8, 2019

The sp objects here are positive for sp::validObject(). And:

> st_is_valid(st_as_sfc(p)[[1]])
[1] FALSE
> st_is_valid(sf:::check_ring_dir(st_as_sfc(p))[[1]])
[1] FALSE

so as far as GEOS 3.7.2 is concerned, both are troubled.

@dbaston
Copy link
Contributor

dbaston commented Aug 8, 2019

SFA specifies CCW orientation for exterior rings; from section 6.1.1.1:

The exterior boundary LinearRing defines the “top” of the surface which is the side of the surface from which the exterior boundary appears to traverse the boundary in a counter clockwise direction. The interior LinearRings will have the opposite orientation, and appear as clockwise when viewed from the “top”,

GEOS actually uses the opposite convention for geometries that it produces, but is agnostic as to its inputs: http://www.tsusiatsoftware.net/jts/jts-faq/jts-faq.html#B6

All that to say that orientation isn't the issue here. The polygon is invalid because it has a self-touching exterior ring. This is the shapefile style for representing this geometry. GEOS expects the OGC representation, which would include a hole.

That the operation succeeds in 3.7.1 but not 3.7.2 is not the result of a change in validity standards; it's just that it happened to work in 3.7.1 and no longer happens to work in 3.7.2.

image

@edzer
Copy link
Member

edzer commented Aug 8, 2019

Thanks, that clarifies! But what then was the problem with

MULTIPOLYGON (((1.2 0.5, 2.5 5.1, 5.8 1.7, 1.2 0.5), (2.5 2.5, 3.4 1.8, 3.7 3.1, 2.5 2.5)), ((-0.3 3.3, -1 7, 1.7 5.1, -0.3 3.3)))

the failing input you reported above?

@dbaston
Copy link
Contributor

dbaston commented Aug 8, 2019

There is no problem with that one. GEOSIntersection is called with two arguments; one of them is valid, and one is not. The invalid one causes it to fail.

@edzer
Copy link
Member

edzer commented Aug 8, 2019

Thanks - I now understand your post.

@dr-jts
Copy link

dr-jts commented Aug 8, 2019

The requirement that rings not self-touch is actually just an optimization that allows skipping a scan of the input geometry to find self-nodes. It would be relatively easy to relax this, at the cost of somewhat reduced performance. Alternatively, the buffer(0) trick can be used as currently (although it might be better to provide an explicit function which just converts Shapefile-style semantics to SFA-style).

In fact, the forthcoming improved overlay algorithm will accept self-touching rings, due to the use of snap-rounding. So this problem may disappear again at some point.

@dbaston
Copy link
Contributor

dbaston commented Aug 8, 2019

@dr-jts I noticed that as of 2011, JTS/GEOS overlay worked on both ESRI-valid and OGC-valid polygons: https://sourceforge.net/p/jts-topo-suite/mailman/message/27048423/

Though you pointed out that the behavior isn't guaranteed, if the behavior held true up to 3.7.1, I wonder if we should attempt to preserve it in the 3.7.x line.

@dbaston
Copy link
Contributor

dbaston commented Aug 8, 2019

BTW, I would be happy to write a GEOSUnesrify() function provided that we can give it such a name.

@dr-jts
Copy link

dr-jts commented Aug 8, 2019

@dr-jts I noticed that as of 2011, JTS/GEOS overlay worked on both ESRI-valid and OGC-valid polygons: https://sourceforge.net/p/jts-topo-suite/mailman/message/27048423/

Yes, there was an optimization introduced at some point that caused self-touching rings to no longer work.

Though you pointed out that the behavior isn't guaranteed, if the behavior held true up to 3.7.1, I wonder if we should attempt to preserve it in the 3.7.x line.

That might be nice. Although depends on what change caused this "regression". If it was the optimization mentioned above that should be easy to revert. If it is a side-effect of something else that might not be so easy to fix.

@dbaston
Copy link
Contributor

dbaston commented Aug 8, 2019

If it is a side-effect of something else that might not be so easy to fix.

Pretty sure it must be a side-effect of libgeos/geos@609e764 or libgeos/geos@3528071, though I didn't do incremental builds to see which.

@dr-jts
Copy link

dr-jts commented Aug 9, 2019

I've reconstituted the history of the issue which lead to this change (in JTS and GEOS). The original issue was GEOS-838, which presented two valid geometries whose union was invalid. JTS-107 has more discussion about this. JTS-257 is the fix implementation.

The problem turned out to be a noding robustness issue, which caused the valid input linework to have a self-touch after noding. This caused the output to be invalid. The fix was to tighten up the internal overlay noding validation check to catch this situation. This has the side-effect of detecting (and failing) all self-touches in input geometry. Previously, vertex-vertex self-touches were not detected, and in many cases they would simply propagate through the overlay algorithm. (This made the output invalid as well, but since the inputs were already invalid this behaviour was considered acceptable).

Some conclusions are:

  • It would be possible to disable the additional noding vertex self-touch check, to revert to 3.7 behaviour. But this will have the effect of re-introducing the GEOS-838 failure issue.
  • To allow self-touching input and keep the fix for 838 it is necessary to fully node input geometry (which is not done currently, as a performance optimization). Perhaps this could be controlled by a flag provided by the client, to avoid performance degradation where not needed? * Actually, noding only vertex-vertex self-touches would be faster than full self-touch noding. This will require some additional code to implement, however
  • It might be worth thinking harder about whether the current noding algorithm can be improved to handling vertex-vertex self-touches
  • As noted previously, an alternative is to pre-process the input to convert the geometry to valid OGC format. This is essentially a subset of the functionality of MakeValid, but if restricted to handling only vertex-vertex self-touches it should be substantially more performant. This is also what the buffer(0) trick accomplishes. (Cleaning geometry using buffer(0) has a known issue in some cases, but the good news is that invalidity caused by self-touches is not affected by that problem.)

@rsbivand
Copy link
Member Author

Thanks very much for a comprehensive account. On the R side, we can take steps to convince package authors to check and correct validity before passing objects to rgeos and GEOS, and probably a blog linked from error messages in sf and rgeos. We can also advise use of the LWGEOM-based function lwgeom::st_make_valid() should zero-width buffering not work. Unless other downstream software using GEOS/JTS needs relaxation, I'd judge that the arguments for tightening are well-motivated, not least to avoid non-performant checking in GEOS/JTS; I feel that the downstream software should sort out geometry problems where possible, keeping GEOS efficient for compliant geometries.

Anyway, many thanks for this most helpful discussion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants