Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistently validate polygon inputs for GeoArrow offset format #973

Merged
merged 10 commits into from
Mar 9, 2023

Conversation

harrism
Copy link
Member

@harrism harrism commented Mar 7, 2023

Description

Fixes #962. Discussion: #972.

Adds a new C++ macro to validate polygon offset array sizes. Updates all C++ APIs that accepts polygons to use this macro for consistent validation and requirements. All C++ functions that accept polygon data should accept and enforce GeoArrow offset format ("N + 1 offsets").

Updates C++ tests to correctly pass offsets.

Updates Python implementation to no longer slice off final offsets.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added libcuspatial Relates to the cuSpatial C++ library Python Related to Python code labels Mar 7, 2023
@harrism harrism self-assigned this Mar 7, 2023
@harrism harrism added bug Something isn't working 3 - Ready for Review Ready for review by team breaking Breaking change labels Mar 7, 2023
@harrism harrism marked this pull request as ready for review March 7, 2023 05:40
@harrism harrism requested review from a team as code owners March 7, 2023 05:40
@harrism harrism requested review from isVoid, zhangjianting, thomcom and trxcllnt and removed request for zhangjianting March 7, 2023 05:40
@harrism harrism removed the 3 - Ready for Review Ready for review by team label Mar 7, 2023
Copy link
Contributor

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good in general - my point-polygon distance can also benefit from this macro. Would you like to wait till the closedness discussion to close (no-pun intended) until we merge this PR?

cpp/include/cuspatial/detail/utility/validation.hpp Outdated Show resolved Hide resolved
@harrism
Copy link
Member Author

harrism commented Mar 8, 2023

I think this is good in general - my point-polygon distance can also benefit from this macro. Would you like to wait till the closedness discussion to close (no-pun intended) until we merge this PR?

I plan to update this based on that discussion, yes. Especially the change to the arithmetic for empty lists.

@harrism harrism requested a review from a team as a code owner March 8, 2023 11:41
@github-actions github-actions bot added the cmake Related to CMake code or build configuration label Mar 8, 2023
Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving CMake changes

@harrism
Copy link
Member Author

harrism commented Mar 8, 2023

I think this is good in general - my point-polygon distance can also benefit from this macro. Would you like to wait till the closedness discussion to close (no-pun intended) until we merge this PR?

I plan to update this based on that discussion, yes. Especially the change to the arithmetic for empty lists.

Updated

cpp/src/spatial/point_in_polygon.cu Show resolved Hide resolved
cpp/src/spatial/polygon_bounding_box.cu Show resolved Hide resolved
cpp/tests/spatial/point_in_polygon_test.cpp Outdated Show resolved Hide resolved
harrism and others added 2 commits March 9, 2023 09:23
Co-authored-by: H. Thomson Comer <thomcom@gmail.com>
"Each ring must have at least four vertices");

CUSPATIAL_EXPECTS(test_points_x.size() == poly_offsets.size(),
CUSPATIAL_EXPECTS(test_points_x.size() == std::max(poly_offsets.size() - 1, 0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since offsets must be greater or equal to 1, poly_offsets.size() - 1 >= 0. No need to wrap with std::max.

Copy link
Member Author

@harrism harrism Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to pass when an empty test_points column is passed, regardless of whether the polygon offsets are malformed. If an empty offsets array is also passed, this will be 0 == -1, and this will throw. Hence the need for std::max. Even though an empty polygon offsets column is not technically correct, I think we should consider it to mean "zero polygons" for this case where there are also zero test points.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok... Note that in header only API you added the test macro, if the offset is malformed the API will throw anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove std::max this test fails:

TYPED_TEST(PairwisePointInPolygonTest, Empty)

If I leave it in, it passes. I think we want this case to not throw.

@harrism
Copy link
Member Author

harrism commented Mar 9, 2023

/merge

@rapids-bot rapids-bot bot merged commit 25466ed into rapidsai:branch-23.04 Mar 9, 2023
rapids-bot bot pushed a commit that referenced this pull request Mar 13, 2023
Closes #958.  Replaces usage of GDAL `contains` in the quadtree point-in-polygon tests with a call to a different cuSpatial point-in-polygon function. This allows us to remove all current dependencies on GDAL in cuSpatial.

Depends on #973. This PR currently updates some expectations to make tests pass that are fixed in that PR.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - H. Thomson Comer (https://github.com/thomcom)
  - Paul Taylor (https://github.com/trxcllnt)
  - Jordan Jacobelli (https://github.com/jjacobelli)

URL: #974
trxcllnt pushed a commit to trxcllnt/cuspatial that referenced this pull request Mar 15, 2023
…dsai#973)

Fixes rapidsai#962. Discussion: rapidsai#972.

Adds a new C++ macro to validate polygon offset array sizes. Updates all C++ APIs that accepts polygons to use this macro for consistent validation and requirements.  All C++ functions that accept polygon data should accept and enforce GeoArrow offset format ("N + 1 offsets").

Updates C++ tests to correctly pass offsets.

Updates Python implementation to no longer slice off final offsets.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - H. Thomson Comer (https://github.com/thomcom)
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#973
trxcllnt pushed a commit to trxcllnt/cuspatial that referenced this pull request Mar 15, 2023
Closes rapidsai#958.  Replaces usage of GDAL `contains` in the quadtree point-in-polygon tests with a call to a different cuSpatial point-in-polygon function. This allows us to remove all current dependencies on GDAL in cuSpatial.

Depends on rapidsai#973. This PR currently updates some expectations to make tests pass that are fixed in that PR.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - H. Thomson Comer (https://github.com/thomcom)
  - Paul Taylor (https://github.com/trxcllnt)
  - Jordan Jacobelli (https://github.com/jjacobelli)

URL: rapidsai#974
rapids-bot bot pushed a commit that referenced this pull request Mar 17, 2023
…cts (#976)

- Adds `multipolygon_range`, `multipolygon_ref`, `polygon_ref` as non-owning objects for geoarrow compliant polygon types.

- Adds `pairwise_point_polygon_distance` to compute the shortest distances between two columns of multipoints and multipolygons.

- Refactors `is_point_in_polygon` with geometry object input. Dependent on #973 since geometry objects requires geoarrow input.

closes #703

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - H. Thomson Comer (https://github.com/thomcom)

URL: #976
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change bug Something isn't working cmake Related to CMake code or build configuration libcuspatial Relates to the cuSpatial C++ library Python Related to Python code
Projects
Status: Review
4 participants