Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve and consolidate header organization #1084

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
76e9f02
Move vec_2d, box and segment headers to cuspatial/geometry
harrism Apr 18, 2023
3b65a48
Move the rest of geometry and detail/geometry out of experimental
harrism Apr 18, 2023
75cf840
Move geometry_collections and detail/geometry_collections
harrism Apr 18, 2023
cf78b2b
Move ranges and detail/ranges out of experimental
harrism Apr 18, 2023
3c6c498
Move experimental/detail/algorithm
harrism Apr 19, 2023
d1cb4fc
Move experimental/detail/find/
harrism Apr 19, 2023
fae0745
Move the rest of experimental/detail
harrism Apr 19, 2023
5a0708f
Move the rest of experimental headers
harrism Apr 19, 2023
5f6a792
consolidate tests
harrism Apr 19, 2023
08a2e73
doc updates
harrism Apr 19, 2023
aa00e62
More doc updates
harrism Apr 19, 2023
3f9016c
style
harrism Apr 19, 2023
d01e36d
consolidate bounding boxes headers and name consistently
harrism Apr 19, 2023
b5eb42e
Consolidate iterator factories
harrism Apr 19, 2023
cde594d
Consolidate trajectory headers
harrism Apr 19, 2023
c4f4b4f
move distance headers to distance directory
harrism Apr 19, 2023
42e8d07
Consolidate and DRY up point_in_polygon APIs
harrism Apr 19, 2023
4fe218c
style
harrism Apr 19, 2023
5840bb9
Merge branch 'branch-23.06' into fea-improve-header-organization
harrism Apr 20, 2023
8a5ec0e
Merge branch 'fea-improve-header-organization' into feature/header-or…
harrism Apr 20, 2023
02e7e81
Move missed distance header
harrism Apr 20, 2023
9051fce
Merge branch 'branch-23.06' into fea-improve-header-organization
harrism Apr 26, 2023
f10bc93
Update linestring_polygon_distance with new header locations
harrism Apr 26, 2023
239860c
Merge branch 'fea-improve-header-organization' into feature/header-or…
harrism Apr 26, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 2 additions & 3 deletions cpp/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -122,10 +122,9 @@ add_library(cuspatial
src/join/quadtree_point_in_polygon.cu
src/join/quadtree_point_to_nearest_linestring.cu
src/join/quadtree_bbox_filtering.cu
src/spatial/polygon_bounding_box.cu
src/spatial/linestring_bounding_box.cu
src/spatial/linestring_bounding_boxes.cu
src/spatial/polygon_bounding_boxes.cu
src/spatial/point_in_polygon.cu
src/spatial/pairwise_point_in_polygon.cu
src/spatial_window/spatial_window.cu
src/spatial/haversine.cu
src/spatial/hausdorff.cu
Expand Down
9 changes: 4 additions & 5 deletions cpp/benchmarks/pairwise_linestring_distance.cu
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,10 @@
#include <benchmarks/fixture/rmm_pool_raii.hpp>
#include <nvbench/nvbench.cuh>

#include <cuspatial/detail/iterator.hpp>
#include <cuspatial/experimental/iterator_factory.cuh>
#include <cuspatial/experimental/linestring_distance.cuh>
#include <cuspatial/experimental/ranges/multilinestring_range.cuh>
#include <cuspatial/vec_2d.hpp>
#include <cuspatial/distance/linestring_distance.cuh>
#include <cuspatial/geometry/vec_2d.hpp>
#include <cuspatial/iterator_factory.cuh>
#include <cuspatial/range/multilinestring_range.cuh>

#include <rmm/device_vector.hpp>
#include <rmm/exec_policy.hpp>
Expand Down
4 changes: 2 additions & 2 deletions cpp/benchmarks/point_in_polygon.cu
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,8 @@
#include <benchmarks/fixture/rmm_pool_raii.hpp>
#include <nvbench/nvbench.cuh>

#include <cuspatial/experimental/point_in_polygon.cuh>
#include <cuspatial/vec_2d.hpp>
#include <cuspatial/geometry/vec_2d.hpp>
#include <cuspatial/point_in_polygon.cuh>

#include <rmm/device_vector.hpp>
#include <rmm/exec_policy.hpp>
Expand Down
6 changes: 3 additions & 3 deletions cpp/benchmarks/points_in_range.cu
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,10 @@

#include <cuspatial_test/random.cuh>

#include <cuspatial/detail/iterator.hpp>
#include <cuspatial/error.hpp>
#include <cuspatial/experimental/points_in_range.cuh>
#include <cuspatial/vec_2d.hpp>
#include <cuspatial/geometry/vec_2d.hpp>
#include <cuspatial/iterator_factory.cuh>
#include <cuspatial/points_in_range.cuh>

#include <rmm/device_uvector.hpp>
#include <rmm/exec_policy.hpp>
Expand Down
4 changes: 2 additions & 2 deletions cpp/benchmarks/quadtree_on_points.cu
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,8 @@
* limitations under the License.
*/

#include <cuspatial/experimental/point_quadtree.cuh>
#include <cuspatial/vec_2d.hpp>
#include <cuspatial/geometry/vec_2d.hpp>
#include <cuspatial/point_quadtree.cuh>

#include <benchmarks/fixture/rmm_pool_raii.hpp>
#include <nvbench/nvbench.cuh>
Expand Down
4 changes: 2 additions & 2 deletions cpp/doxygen/Doxyfile
Original file line number Diff line number Diff line change
Expand Up @@ -819,7 +819,7 @@ INPUT = main_page.md \
developer_guide/DOCUMENTATION.md \
developer_guide/DEVELOPER_GUIDE.md \
developer_guide/TESTING.md \
developer_guide/REFACTORING_GUIDE.md \
developer_guide/HEADER_ONLY_API_GUIDE.md \
../include

# This tag can be used to specify the character encoding of the source files
Expand Down Expand Up @@ -1155,7 +1155,7 @@ HTML_FOOTER =
# obsolete.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_STYLESHEET =
HTML_STYLESHEET =

# The HTML_EXTRA_STYLESHEET tag can be used to specify additional user-defined
# cascading style sheets that are included after the standard style sheets
Expand Down
25 changes: 12 additions & 13 deletions cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,8 @@ refer to these additional files for further documentation of libcuspatial best p
* [Documentation Guide](DOCUMENTATION.md) for guidelines on documenting libcuspatial code.
* [Testing Guide](TESTING.md) for guidelines on writing unit tests.
* [Benchmarking Guide](BENCHMARKING.md) for guidelines on writing unit benchmarks.
* [Refactoring Guide](REFACTORING_GUIDE.md) for guidelines on refactoring legacy column-based APIs
into header-only APIs.
* [Header-only API Guide](HEADER_ONLY_API_GUIDE.md) for guidelines on the header-only API and
column-based API.

# Overview

Expand Down Expand Up @@ -38,27 +38,26 @@ TODO: add terms

External/public libcuspatial APIs are grouped based on functionality into an appropriately titled
header file in `cuspatial/cpp/include/cuspatial/`. For example,
`cuspatial/cpp/include/cuspatial/coordinate_transform.hpp` contains the declarations of public API
functions related to transforming coordinates. Note the `.hpp` file extension used to indicate a
C++ header file that can be included from a `.cpp` source file.
`cuspatial/cpp/include/cuspatial/projection.hpp` contains the declarations of public API
functions related to coordinate projection transforms. Note the `.hpp` file extension used to
indicate a C++ header file that can be included from a `.cpp` source file.

Header files should use the `#pragma once` include guard.

The naming of public column-based cuSpatial API headers should be consistent with the name of the
folder that contains the source files that implement the API. For example, the implementation of the
APIs found in `cuspatial/cpp/include/cuspatial/trajectory.hpp` are located in
`cuspatial/src/trajectory`. This rule obviously does not apply to the header-only API, since the
headers are the source files.
The folder that contains the source files that implement an API should be named consistently with
the name of the of the header for the API. For example, the implementation of the APIs found in
`cuspatial/cpp/include/cuspatial/trajectory.hpp` are located in `cuspatial/src/trajectory`. This
rule obviously does not apply to the header-only API, since the headers are the source files.

Likewise, unit tests reside in folders corresponding to the names of the API headers, e.g.
trajectory.hpp tests are in `cuspatial/tests/trajectory/`.

Internal API headers containing `detail` namespace definitions that are used across translation
units inside libcuspatial should be placed in `include/cuspatial/detail`.

Note that (currently) header-only API files are in `include/cuspatial/experimental`, and their tests
are in `tests/experimental`. When the header-only refactoring is complete these should be renamed or
split into a separate library.
Header-only API files and column-based API headers are stored together in `include/cuspatial`. The
former use the `.cuh` extension because they almost universally require CUDA compilation. The latter
use the `.hpp` extension because they can be compiled with a standard C++ compiler.

## File extensions

Expand Down
Original file line number Diff line number Diff line change
@@ -1,33 +1,34 @@
# cuSpatial C++ API Refactoring Guide
# cuSpatial C++ header-only API Guide

The original cuSpatial C++ API (libcuspatial) was designed to depend on RAPIDS libcudf and use
its core data types, especially `cudf::column`. For users who do not also use libcudf or other
RAPIDS APIS, depending on libcudf could be a big barrier to adoption of libcuspatial. libcudf is
its core data types, especially `cudf::column`. For users who do not also use libcudf or other
RAPIDS APIS, depending on libcudf could be a big barrier to adoption of libcuspatial. libcudf is
a very large library and building it takes a lot of time.

Therefore, we are developing a standalone libcuspatial C++ API that does not depend on libcudf. This
is a header-only template API with an iterator-based interface. This has a number of advantages
Therefore, the core of cuSpatial is now implemented in a standalone C++ API that does not depend on
libcudf. This is a header-only template API with an iterator- and range-based interface. This has a
number of advantages.

1. With a header-only API, users can include and build exactly what they use.
2. With a templated API, the API can be flexible to support a variety of basic data types, such
2. With a templated API, the API can be flexible to support a variety of basic data types, such
as float and double for positional data, and different integer sizes for indices.
3. By templating on iterator types, cuSpatial algorithms can be fused with transformations of the
input data, by using "fancy" iterators. Examples include transform iterators and counting
iterators.
4. Memory resources only need to be part of APIs that allocate temporary intermediate storage.
Output storage is allocated outside the API and an output iterator is passed as an argument.
Output storage is allocated outside the API and an output iterator is passed as an argument.

The main disadvantages of this type of API are

1. Header-only APIs can increase compilation time for code that depends on them.
2. Some users (especially our Python API) may prefer a cuDF-based API.
2. Some users (especially the cuSpatial Python API) may prefer a cuDF-based API.

The good news is that by maintaining the existing libcudf-based C++ API as a layer above the header-
only libcuspatial API, we can avoid problem 1 and problem 2 for users of the legacy API.
The good news is that maintaining the existing libcudf-based C++ API as a layer above the header-
only libcuspatial API avoids problem 1 and problem 2 for users of the column-based API.

## Example API

Following is an example iterator-based API for `cuspatial::haversine_distance`. (See below for
Following is an example iterator-based API for `cuspatial::haversine_distance`. (See below for
discussion of API documentation.)

```c++
Expand All @@ -47,7 +48,7 @@ OutputIt haversine_distance(LonLatItA a_lonlat_first,
There are a few key points to notice.

1. The API is very similar to STL algorithms such as `std::transform`.
2. All array inputs and outputs are iterator type templates.
2. All array inputs and outputs are iterator type templates.
3. Longitude/Latitude data is passed as array of structures, using the `cuspatial::vec_2d`
type (include/cuspatial/vec_2d.hpp). This is enforced using a `static_assert` in the function
body (discussed later).
Expand All @@ -60,7 +61,7 @@ There are a few key points to notice.
7. The size of the input and output ranges in the example API are equal, so the start and end of
only the A range is provided (`a_lonlat_first` and `a_lonlat_last`). This mirrors STL APIs.
8. This API returns an iterator to the element past the last element written to the output. This
is inspired by `std::transform`, even though as with `transform`, many uses of
is inspired by `std::transform`, even though as with `transform`, many uses of
`haversine_distance` will not need this returned iterator.
9. All APIs that run CUDA device code (including Thrust algorithms) or allocate memory take a CUDA
stream on which to execute the device code and allocate memory.
Expand Down Expand Up @@ -103,7 +104,7 @@ Following is the (Doxygen) documentation for the above `cuspatial::haversine_dis
* otherwise.
* @pre `b_lonlat_first` may equal `distance_first`, but the range `[b_lonlat_first, b_lonlat_last)`
* shall not overlap the range `[distance_first, distance_first + (b_lonlat_last - b_lonlat_last))
* otherwise.
* otherwise.
* @pre All iterators must have the same `Location` type, with the same underlying floating-point
* coordinate type (e.g. `cuspatial::vec_2d<float>`).
*
Expand All @@ -119,13 +120,13 @@ Key points:
2. All parameters and all template parameters are documented.
3. States the C++ standard iterator concepts that must be implemented, and that iterators must be
device-accessible.
4. Documents requirements as preconditions using `@pre`.
4. Documents requirements as preconditions using `@pre`.
5. Uses preconditions to explicitly document what input ranges are allowed to overlap.
6. Documents the units of any inputs or outputs that have them.

## cuSpatial libcudf-based C++ API (legacy API)

This is the existing API, unchanged by refactoring. Here is the existing
This is the existing API, unchanged by refactoring. Here is the existing
`cuspatial::haversine_distance`:

```c++
Expand All @@ -142,47 +143,44 @@ OutputIt haversine_distance(LonLatItA a_lonlat_first,
```

key points:
1. All input data are `cudf::column_view`. This is a type-erased container so determining the
1. All input data are `cudf::column_view`. This is a type-erased container so determining the
type of data must be done at run time.
2. All inputs are arrays of scalars. Longitude and latitude are separate.
2. All inputs are arrays of scalars. Longitude and latitude are separate.
3. The output is a returned `unique_ptr<cudf::column>`.
4. The output is allocated inside the function using the passed memory resource.
5. The public API does not take a stream. There is a `detail` version of the API that takes a
stream. This follows libcudf, and may change in the future.

## File Structure

For now, libcuspatial APIs should be defined in a header file in the
`cpp/include/cuspatial/experimental/` directory. Later, as we adopt the new API, we will rename
the `experimental` directory. The API header should be named after the API. In the example,
`haversine.hpp` defines the `cuspatial::haversine_distance` API.
libcuspatial APIs should be defined in a header file in the `cpp/include/cuspatial/` directory.
The API header should be named after the API. In the example, `haversine.hpp` defines the `cuspatial::haversine_distance` API.

The implementation must also be in a header, but should be in the `cuspatial/experimental/detail`
directory. The implementation should be included from the API definition file, at the end of the
file. Example:
The implementation must also be in a header, but should be in the `cuspatial/detail` directory. The
implementation should be included from the API definition file, at the end of the file. Example:

```c++
... // declaration of API above this point
#include <cuspatial/experimental/detail/haversine.hpp>
#include <cuspatial/detail/haversine.hpp>
```

## Namespaces

Public APIs are in the `cuspatial` namespace. Note that both the header-only API and the libcudf-
based API can live in the same namespace, because they are non-ambiguous (very different
based API can live in the same namespace, because they are non-ambiguous (very different
parameters).

Implementation of the header-only API should be in a `cuspatial::detail` namespace.

## Implementation

The main implementation should be in detail headers.
The main implementation should be in detail headers.

### Header-only API Implementation

Because it is a statically typed API, the header-only implementation can be much simpler than the
libcudf-based API, which requires run-time type dispatching. In the case of `haversine_distance`, it is
a simple matter of a few static asserts and dynamic expectation checks, followed by a call to
Because it is a statically typed API, the header-only implementation can be much simpler than the
libcudf-based API, which requires run-time type dispatching. In the case of `haversine_distance`,
it is a simple matter of a few static asserts and dynamic expectation checks, followed by a call to
`thrust::transform` with a custom transform functor.

```c++
Expand Down Expand Up @@ -216,17 +214,17 @@ OutputIt haversine_distance(LonLatItA a_lonlat_first,
```

Note that we `static_assert` that the types of the iterator inputs match documented expectations.
We also do a runtime check that the radius is positive. Finally we just call `thrust::transform`,
We also do a runtime check that the radius is positive. Finally we just call `thrust::transform`,
passing it an instance of `haversine_distance_functor`, which is a function of two `vec_2d<T>`
inputs that implements the Haversine distance formula.

### libcudf-based API Implementation

The substance of the refactoring is making the libcudf-based API a wrapper around the header-only
API. This mostly involves replacing business logic implementation in the type-dispatched functor
with a call to the header-only API. We also need to convert disjoint latitude and longitude inputs
The substance of the refactoring is making the libcudf-based API a wrapper around the header-only
API. This mostly involves replacing business logic implementation in the type-dispatched functor
with a call to the header-only API. We also need to convert disjoint latitude and longitude inputs
into `vec_2d<T>` structs. This is easily done using the `cuspatial::make_vec_2d_iterator` utility
provided in `type_utils.hpp`.
provided in `type_utils.hpp`.

So, to refactor the libcudf-based API, we remove the following code.

Expand Down Expand Up @@ -271,6 +269,6 @@ cuspatial::haversine_distance(lonlat_a,
Existing libcudf-based API tests can mostly be left alone. New tests should be added to exercise
the header-only API separately in case the libcudf-based API is removed.

Note that tests, like the header-only API, should not depend on libcudf or libcudf_test. The
Note that tests, like the header-only API, should not depend on libcudf or libcudf_test. The
cuDF-based API made the mistake of depending on libcudf_test, which results in breakages
of cuSpatial sometimes when libcudf_test changes.
Loading