Improve Polygon.contains(LineString) predicate logic. #1186
…pes due to slice construction.
…h group as a single test.
Co-authored-by: Mark Harris <783069+harrism@users.noreply.github.com>
…erly.py Co-authored-by: Mark Harris <783069+harrism@users.noreply.github.com>
…polygon-linestring-contains
rhs_sizes_less_line_intersection_size[
    rhs_sizes_less_line_intersection_size <= 0
] = 1
- Why can some rows have fewer vertices in the linestring than intersection points?
- Why do you need to explicitly set these rows to 1?
Thanks for investigating this logic. This step is important to cover the tests of polygon.contains(linestring) where the LineString is contained only by the border of the polygon. In the above line, the number of points in the linestring is reduced by the number of points in the border-overlap region:
rhs_sizes_less_line_intersection_count = rhs.sizes - linestring_intersects_count
This is because LineString segments that overlap with the boundary of a polygon are not used in the counting of the vertices that may be contained in the polygon. Consider the following three examples:
Note that each shape has examples of the LineString overlapping the boundary of the polygon. In the first case, an edge overlaps from the corner, then goes inside to an interior point, then returns to the edge, then overlaps to the corner. The length of the LineString is 5. The first overlapping segment subtracts 2 from the length of the linestring, and the second overlapping segment subtracts another 2, leaving a linestring of length 1. One point is contained, and therefore the linestring is contained.
In the second example the same pattern occurs except the middle point goes out of the polygon instead of in. The same points are subtracted from the length of the linestring, but the number of contained points is 0, less than the (remaining) length of the linestring of 1, so it is not contained.
Finally, the LineString only overlaps with the edge of the polygon, on the left and on the bottom. This returns a LineString from pairwise-linestring-intersection of length 3, which is subtracted from the length of the LineString, also 3. rhs_sizes_less_line_intersection_size is now 0 for this example LineString, so contains + point_intersects_count == 0, which would return contains = True. Therefore I set the minimum LineString length to 1, so that some LineString point must be contained in order for .contains to evaluate to true.
Happy to chat this through with you in person if you can think of a more parsimonious approach.
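The three cases described in this comment can be sketched with plain Python lists standing in for the per-row series used in the PR. The function name `linestring_contained` and the list-based inputs are illustrative assumptions, not the cuspatial implementation; the counts are the ones quoted in the comment (sizes 5, 5, 3 and overlap counts 4, 4, 3).

```python
def linestring_contained(sizes, contains, point_intersects_count,
                         linestring_intersects_count, clamp=True):
    """Hypothetical per-row version of the counting rule discussed above."""
    results = []
    rows = zip(sizes, contains, point_intersects_count,
               linestring_intersects_count)
    for size, inside, point_hits, line_hits in rows:
        # Vertices on boundary-overlapping segments are not counted.
        remaining = size - line_hits
        # Mirrors the `<= 0 -> 1` assignment under review: at least one
        # vertex must still be accounted for.
        if clamp and remaining <= 0:
            remaining = 1
        results.append(inside + point_hits == remaining)
    return results


# Case 1: dips inside, case 2: dips outside, case 3: boundary only.
sizes = [5, 5, 3]
contains = [1, 0, 0]
point_hits = [0, 0, 0]
line_hits = [4, 4, 3]

with_clamp = linestring_contained(sizes, contains, point_hits, line_hits)
without_clamp = linestring_contained(sizes, contains, point_hits, line_hits,
                                     clamp=False)
```

Under these assumed counts, only the third row changes when the clamp is removed, which is the behaviour the clamp is meant to prevent.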
Note that each shape has examples of the LineString overlapping the boundary of the polygon. In the first case, an edge overlaps from the corner, then goes inside to an interior point, then returns to the edge, then overlaps to the corner. The length of the LineString is 5. The first overlapping segment subtracts 2 from the length of the linestring, and the second overlapping segment subtracts another 2, leaving a linestring of length 1. One point is contained, and therefore the linestring is contained.
In the second example the same pattern occurs except the middle point goes out of the polygon instead of in. The same points are subtracted from the length of the linestring, but the number of contained points is 0, less than the (remaining) length of the linestring of 1, so it is not contained.
Pardon me for not following the reasoning here. I took linestring_intersects_count to mean the number of vertices of the linestring that intersect with the ring of the polygon. Isn't the value 4 in both cases? The subtracted result is 1, right?
Finally, the LineString only overlaps with the edge of the polygon, on the left and on the bottom. This returns a LineString from pairwise-linestring-intersection of length 3, which is subtracted from the length of the LineString, also 3. rhs_sizes_less_line_intersection_size is now 0 for this example LineString, so contains + point_intersects_count == 0, which would return contains = True. Therefore I set the minimum LineString length to 1, so that some LineString point must be contained in order for .contains to evaluate to true.
This sounds a bit hacky to me. Also, at least the comparison should be == not <=, right?
final_result = contains + point_intersects_count == (
    rhs_sizes_less_line_intersection_size
This seems to differ from the statement in your PR description, which basically says:
final_result = contains == rhs.sizes - linestring_points_on_boundary
To make this equal to the formula here, it seems like we need:
linestring_points_on_boundary == linestring_intersects_count + point_intersects_count
Though I don't see the connection here. Can you give me some hints?
In the description I believe I misspoke about the number of points that are contained in the polygon, as demonstrated in the code:
final_result = contains + point_intersects_count == rhs.sizes - linestring_points_on_boundary
If I understand correctly, you are concerned that linestring_points_on_boundary in your example doesn't take into account the number of point intersections, which are ignored in the description of this PR but not in _compute_polygon_linestring_contains.
contains equals the number of LineString vertices that are contained by the polygon.
point_intersects_count equals the number of LineString vertices that intersect with the polygon: that is, the places where the LineString does not share an overlapping boundary with the polygon, but instead has a vertex that is colinear with the boundary of the polygon while each segment that shares the vertex is not colinear. When a point_intersects_count intersection occurs, there must exist either a point inside the polygon that will be counted by contains, or a point outside of the polygon that will not be counted.
When a vertex is colinear with the boundary of the polygon this is counted as "in" the polygon for a reason related to our above conversation, where the minimum LineString length must be 1. Consider the following example:
In the first case, point_intersects_count == 2, as the LineString vertices are colinear with the boundary of the polygon. contains + point_intersects_count == rhs.sizes and the LineString is contained.
In the second case, point_intersects_count == 0, as the LineString vertices are not colinear; it is a genuine intersection. contains + point_intersects_count < rhs.sizes and the LineString is not contained.
linestring_intersects_count is the number of LineString vertices that are from segments that are colinear with the polygon boundary. They are used only to reduce the number of points in the LineString that should be tested for .contains.
The final equation would be
final_result = contains + point_intersects_count == rhs.sizes - linestring_intersects_count
which is equivalent to
final_result = contains == rhs.sizes - linestring_intersects_count - point_intersects_count
as I think you are alluding to above, except that the LineString must maintain a size of at least 1, as you asked about above. Maybe I'm missing something here that would allow me to subtract point_intersects_count instead of using the line that sets the minimum rhs_sizes_less_line_intersection_size to 1. I'll be thinking about that.
Consider the final case in this response,
In this case point_intersects_count = 1, and due to the overlapping segment, rhs_sizes_less_line_intersection_count == 1, and the LineString is contained. I could do contains == rhs.sizes - linestring_intersects_count - point_intersects_count, which in this case would be (0 == 0), but that would break the third case in my earlier response, where the LineString is entirely colinear with the boundary of the polygon. In those cases the LineString is not contained, and as such it must be that rhs_sizes_less_line_intersection_count > 0.
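A small scalar sketch of the two forms compared in this reply. The helper names `code_form` and `rearranged_form` are hypothetical, and the LineString size of 3 used for the final case is an assumption chosen so that the quoted counts (one point intersection, one remaining vertex) hold.

```python
def code_form(size, contains, point_hits, line_hits):
    # Form used in the code under review, including the minimum of 1.
    remaining = max(size - line_hits, 1)
    return contains + point_hits == remaining


def rearranged_form(size, contains, point_hits, line_hits):
    # Algebraic rearrangement with no minimum applied.
    return contains == size - line_hits - point_hits


# Final case: one overlapping segment plus one point intersection
# (size=3 is an assumed value consistent with the counts in the reply).
final_case = dict(size=3, contains=0, point_hits=1, line_hits=2)

# Third case from the earlier reply: entirely colinear with the boundary.
boundary_only = dict(size=3, contains=0, point_hits=0, line_hits=3)

final_code = code_form(**final_case)
final_rearranged = rearranged_form(**final_case)
boundary_code = code_form(**boundary_only)
boundary_rearranged = rearranged_form(**boundary_only)
```

Both forms agree on the final case, but only the form with the minimum rejects the boundary-only LineString, which is the discrepancy described above.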
I see you arrive at the same equality as I did above.
final_result = contains == rhs.sizes - linestring_intersects_count - point_intersects_count
Great. I'm still confused by the discrepancy between this and the code. But perhaps focus on Mark's question below first.
I suggest we use stricter mathematical language here.
aka:
aka:
Right?
I don't think that's sufficient (but neither is the one written by Thomson that you quoted). Because of concavities and holes, even if all vertices of a linestring fall on the boundary or are contained by the polygon, intermediate points in the edges could fall outside of the polygon where it is concave or has holes. I think a more correct definition is:
This latter bit is tricky, because the linestring can pass completely outside and then back inside within a single segment. I think you must check whether all intersection points are also vertices of the linestring, and all intersection segments are colinear with polygon segments.
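The concavity concern can be illustrated with a pure-Python sketch. The U-shaped polygon, the two-vertex LineString, and the helper `point_in_ring` are all hypothetical, and sampling the segment midpoint is only a way to expose the problem here, not a general containment test.

```python
def point_in_ring(point, ring):
    """Ray-casting point-in-polygon test for a simple ring.

    Illustrative only: points lying exactly on the boundary are not
    handled specially.
    """
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A concave "U": the notch between x=1 and x=2 above y=1 is outside.
U_SHAPE = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]

# One vertex in each arm of the U; the segment spans the notch.
linestring = [(0.5, 2.0), (2.5, 2.0)]

vertices_inside = sum(point_in_ring(p, U_SHAPE) for p in linestring)
vertex_count_says_contained = vertices_inside == len(linestring)

midpoint = ((0.5 + 2.5) / 2, 2.0)
midpoint_inside = point_in_ring(midpoint, U_SHAPE)
```

Every vertex is inside the polygon, yet the midpoint of the only segment is not, so a vertex count alone would report containment for a LineString that leaves the polygon.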
python/cuspatial/cuspatial/tests/binpreds/binpred_test_dispatch.py
Resolves #1185
x x x
|\\ | /|
| xx- |
| | |
---x---
Still not right? Why the double backslash? Also missing x vertices on the bottom edge of the poly. I think this is more like it.
x o x
|\ | /|
| \ |/ |
| x/* |
| | |
x-----o---x
Double backslash is required for python inline comments like this, unfortunately. I like your diagram.
Right, this is similar to C style backslash escape.
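As a small illustration of the escaping point, the sketch below uses a made-up three-line diagram (not the one from the PR) in two hypothetical functions. In a regular docstring the backslash has to be doubled; a raw docstring is one alternative that lets the diagram be written with a single backslash.

```python
def diagram_escaped():
    """
    x
    |\\
    x-x
    """


def diagram_raw():
    r"""
    x
    |\
    x-x
    """


# Both docstrings hold the same text, with one literal backslash each.
escaped_doc = diagram_escaped.__doc__
raw_doc = diagram_raw.__doc__
backslash_count = escaped_doc.count("\\")
```

Without the doubling, a lone backslash at the end of a line in a regular string would be read as a line continuation and the diagram row would be joined with the next one.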
x------
| |
| ---x
| --- |
x------
ox------x
| |
| /-*--o
| _/ |
ox------x
Not ideal for ASCII...
See the docs for `_pli_points_to_multipoints` and
`_pli_lines_to_multipoints` for the rest of the explanation.
"""
in_sizes = (
Is this whole function needed because of the output format of pairwise_linestring_intersection? Perhaps we need to reconsider the output format of that function for efficiency?
That's correct.
/merge
Fixes #1185
Description
The previous logic for handling the case of a LineString that shares boundary points with a Polygon and has no interior points was wrong: it just found the center point of such a LineString and tested that for containment. The correct logic is as follows:
A LineString is contained in a polygon if:
The sum of the points of the LineString that are contained in the Polygon, plus the number of vertices of the linestring that are on the polygon boundary, is equal to the length of the LineString less the number of points in the LineString that are contained in the boundary of the Polygon.
The new logic counts the number of point intersections and linestring intersections that the LineString shares with the polygon. Point intersections occur when the LineString shares an edge point with the Polygon exterior, implying a LineString segment that approaches from either the exterior or the interior. An interior point will be counted; an exterior point will be left in the size of the LineString during comparison, with the final count being less than the size of the LineString.
The hardest part of this implementation was in writing _pli_features_rebuild_offsets, which takes the results of a pairwise_linestring_intersection result and splits them into row-appropriate points in one set and linestrings in another set. This is potentially a good place to move the logic into C++, though I don't think it will be a major profiling issue initially.
Also improves touches and crosses where the predicate had been written too tightly to the test cases.
In addition, a bug with pairwise_multipoint_equals_count is fixed when lhs contains more than 1 multipoints, but all multipoints are empty. The bug causes the API to raise a cuda invalid configuration error.
Checklist