From 8dc1ecba592c166a4c677c597926042955abcc54 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 7 Sep 2022 16:16:07 +1000 Subject: [PATCH 01/20] Add DOCUMENTATION.md --- cpp/doc/DOCUMENTATION.md | 563 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 563 insertions(+) create mode 100644 cpp/doc/DOCUMENTATION.md diff --git a/cpp/doc/DOCUMENTATION.md b/cpp/doc/DOCUMENTATION.md new file mode 100644 index 000000000..a4911a925 --- /dev/null +++ b/cpp/doc/DOCUMENTATION.md @@ -0,0 +1,563 @@ +# libcuspatial C++ Documentation Guide + +These guidelines apply to documenting all libcuspatial C++ source files using doxygen style +formatting although only public APIs and classes are actually +[published](https://docs.rapids.ai/api/libcuspatial/stable/index.html). + +## Copyright License + +The following is the license header comment that should appear at the beginning of every C++ +source file. + +```c++ +/* + * Copyright (c) 2022, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +``` + +The comment should start with `/*` and not `/**` so it is not processed by doxygen. + +Also, here are the rules for the copyright year. + +- A new file should have the year in which it was created +- A modified file should span the year it was created and the year it was modified (e.g. + `2019-2021`) + +Changing the copyright year may not be necessary if no content has changed (e.g. reformatting only). + +## Doxygen + +The [doxygen tool](http://www.doxygen.nl/manual/index.html) is used to generate HTML pages from the +C++ comments in the source code. Doxygen recognizes and parses block comments and performs +specialized output formatting when it encounters +[doxygen commands](http://www.doxygen.nl/manual/commands.html). + +There are almost 200 commands (also called tags in this document) that doxygen recognizes in comment +blocks. This document provides guidance on which commands/tags to use and how to use them in the +libcuspatial C++ source code. + +The doxygen process can be customized using options in the [Doxyfile](../doxygen/Doxyfile). + +Here are some of the custom options in the Doxyfile for libcuspatial. +| Option | Setting | Description | +| ------ | ------- | ----------- | +| PROJECT_NAME | libcuspatial | Title used on the main page | +| PROJECT_NUMBER | 22.10.00 | Version number | +| EXTENSION_MAPPING | cu=C++ cuh=C++ | Process `cu` and `cuh` as C++ | +| INPUT | main_page.md ../include | Embedded markdown files and source code directories to process | +| FILE_PATTERNS | *.cpp *.hpp *.h *.c *.cu *.cuh | File extensions to process | + +## Block Comments + +Use the following style for block comments describing functions, classes and other types, groups, +and files. + +```c++ +/** + * description text and + * doxygen tags go here + */ +``` + +Doxygen comment blocks start with `/**` and end with `*/` only, and with nothing else on those +lines. Do not add dashes `-----` or extra asterisks `*****` to the first and last lines of a doxygen +block. The block must be placed immediately before the source code line to which it refers. The +block may be indented to line up vertically with the item it documents as appropriate. See the +[Example](#the-example) section below. + +Each line in the comment block between the `/**` and `*/` lines should start with a space followed +by an asterisk. Any text on these lines, including tag declarations, should start after a single +space after the asterisk. + +## Tag/Command names + +Use `@` to prefix doxygen commands (e.g. `@brief`, `@code`, etc.) + +## Markdown + +The doxygen tool supports a limited set of markdown format in the comment block including links, +tables, lists, etc. In some cases a trade-off may be required for readability in the source text +file versus the readability in the doxygen formatted web pages. For example, there are some +limitations on readability with '%' character and pipe character '|' within a markdown table. + +Avoid using direct HTML tags. Although doxygen supports markdown and markdown supports HTML tags, +the HTML support for doxygen's markdown is also limited. + +## The Example + +The following example covers most of the doxygen block comment and tag styles for documenting C++ +code in libcuspatial. + +```c++ +/** + * @file source_file.cpp + * @brief Description of source file contents + * + * Longer description of the source file contents. + */ + +/** + * @brief One sentence description of the class. + * + * @ingroup optional_predefined_group_id + * + * Longer, more detailed description of the class. + * + * @tparam T Short description of each template parameter + * @tparam U Short description of each template parameter + */ +template +class example_class { + + void get_my_int(); ///< Simple members can be documented like this + void set_my_int( int value ); ///< Try to use descriptive member names + + /** + * @brief Short, one sentence description of the member function. + * + * A more detailed description of what this function does and what + * its logic does. + * + * @code + * example_class inst; + * inst.set_my_int(5); + * int output = inst.complicated_function(1,dptr,fptr); + * @endcode + * + * @param[in] first This parameter is an input parameter to the function + * @param[in,out] second This parameter is used both as an input and output + * @param[out] third This parameter is an output of the function + * + * @return The result of the complex function + */ + T complicated_function(int first, double* second, float* third) + { + // Do not use doxygen-style block comments + // for code logic documentation. + } + + private: + int my_int; ///< An example private member variable +}; + +/** + * @brief Short, one sentence description of this free function. + * + * @ingroup optional_predefined_group_id + * + * A detailed description must start after a blank line. + * + * @code + * template + * struct myfunctor { + * bool operator()(T input) { return input % 2 > 0; } + * }; + * free_function(myfunctor{},12); + * @endcode + * + * @throw cuspatial::logic_error if `input_argument` is negative or zero + * + * @tparam functor_type The type of the functor + * @tparam input_type The datatype of the input argument + * + * @param[in] functor The functor to be called on the input argument + * @param[in] input_argument The input argument passed into the functor + * @return The result of calling the functor on the input argument + */ +template +bool free_function(functor_type functor, input_type input_argument) +{ + CUSPATIAL_EXPECTS( input_argument > 0, "input_argument must be positive"); + return functor(input_argument); +} + +/** + * @brief Short, one sentence description. + * + * @ingroup optional_predefined_group_id + * + * Optional, longer description. + */ +enum class example_enum { + first_enum, ///< Description of the first enum + second_enum, ///< Description of the second enum + third_enum ///< Description of the third enum +}; + +/** + * @internal + * @brief The "@internal" tag can be used to indicate rendered HTML docs for a function should not + * be generated. + */ +void internal_function() +{ + ... +} +``` + +## Descriptions + +The comment description should clearly detail how the output(s) are created from any inputs. Include +any performance and any boundary considerations. Also include any limits on parameter values and if +any default values are declared. Also, try to include a short [example](#inline-examples) if +possible. + +### @brief + +The `@brief` text should be a short, one sentence description. Doxygen does not provide much space +to show this text in the output pages. Always follow the `@brief` line with a blank comment line. + +The longer description is the rest of the comment text that is not tagged with any doxygen command. + +```c++ +/** + * @brief Short description. + * + * Long description. + * +``` + +### @copydoc + +Documentation for declarations in headers should be clear and complete. +You can use the `@copydoc` tag to avoid duplicating the comment block for a function definition. + +```c++ + /** + * @copydoc complicated_function(int,double*,float*) + * + * Any extra documentation. + */ +``` + +Also, `@copydoc` is useful when documenting a `detail` function that differs only by the `stream` parameter. + +```c++ +/** + * @copydoc cudf::segmented_count_set_bits(bitmask_type const*,std::vector const&) + * + * @param[in] stream Optional CUDA stream on which to execute kernels + */ +std::vector segmented_count_set_bits(bitmask_type const* bitmask, + std::vector const& indices, + rmm::cuda_stream_view stream = cudf::default_stream_value); +``` + +Note, you must specify the whole signature of the function, including optional parameters, so that +doxygen will be able to locate it. + +### Function parameters + +The following tags should appear near the end of function comment block in the order specified here: + +| Command | Description | +| ------- | ----------- | +| [@throw](#throw) | Specify the conditions in which the function may throw an exception | +| [@tparam](#tparam) | Description for each template parameter | +| [@param](#param) | Description for each function parameter | +| [@return](#return) | Short description of object or value returned | +| [@pre](#pre) | Specify any preconditions / requirements for the function | + +#### @throw + +Add an [@throw](http://www.doxygen.nl/manual/commands.html#cmdthrow) comment line in the doxygen +block for each exception that the function may throw. You only need to include exceptions thrown by +the function itself. If the function calls another function that may throw an exception, you do not +need to document those exceptions here. + +Include the name of the exception without backtick marks so doxygen can add reference links correctly. + +```c++ + /** + * ... + * @throw cuspatial::logic_error if `input_argument` is negative or zero + */ +``` + +Using `@throws` is also acceptable but vs-code and other tools only do syntax highlighting on +`@throw`. + +#### @tparam + +Add a [@tparam](http://www.doxygen.nl/manual/commands.html#cmdtparam) comment line for each template +parameter declared by this function. The name of the parameter specified after the doxygen tag must +match exactly to the template parameter name. + +```c++ + /** + * ... + * @tparam functor_type The type of the functor + * @tparam input_type The datatype of the input argument + */ +``` + +The definition should detail the requirements of the parameter. +For example, if the template is for a functor or predicate, then describe the expected input types +and output. + +#### @param + +Add a [@param](http://www.doxygen.nl/manual/commands.html#cmdparam) comment line for each function +parameter passed to this function. The name of the parameter specified after the doxygen tag must +match the function's parameter name. Also include append `[in]`, `[out]` or `[in,out]` to the +`@param` if it is not clear from the declaration and the parameter name itself. + +```c++ + /** + * ... + * @param[in] first This parameter is an input parameter to the function + * @param[in,out] second This parameter is used both as an input and output + * @param[out] third This parameter is an output of the function + */ +``` + +It is also recommended to vertically align the 3 columns of text if possible to make it easier to +read in a source code editor. + +#### @return + +Add a single [@return](http://www.doxygen.nl/manual/commands.html#cmdreturn) comment line at the end +of the comment block if the function returns an object or value. Include a brief description of what +is returned. + +```c++ +/** + * ... + * + * @return A new column of type INT32 and no nulls + */ +``` + +Do not include the type of the object returned with the `@return` comment. + +#### @pre + +Add a [@pre](https://www.doxygen.nl/manual/commands.html#cmdpre) comment line for each precondition +of the function. For example, assumptions about accessibility or range of iterators. + +```c++ +/** + * ... + * + * @pre All iterators must have the same `Location` type, with the same underlying floating-point + * coordinate type (e.g. `cuspatial::vec_2d`). + */ +``` + +### Inline Examples + +It is usually helpful to include a source code example inside your comment block when documenting a +function or other declaration. Use the [@code](http://www.doxygen.nl/manual/commands.html#cmdcode) +and [@endcode](http://www.doxygen.nl/manual/commands.html#cmdendcode) pair to include inline +examples. + +Doxygen supports syntax highlighting for C++ and several other programming languages (e.g. Python, +Java). By default, the `@code` tag uses syntax highlighting based on the source code in which it is +found. + +```c++ + /** + * ... + * + * @code + * auto result = rmm::device_uvector(...); + * @endcode + */ +``` + +You can specify a different language by indicating the file extension in the tag: + +```c++ + /** + * ... + * + * @code{.py} + * import cudf + * s = cudf.Series([1,2,3]) + * @endcode + */ +``` + +If you wish to use pseudocode in your example, use the following: + +```c++ + /** + * ... + * + * Sometimes pseudo-code is clearer. + * @code{.pseudo} + * s = int column of [ 1, 2, null, 4 ] + * r = fill( s, [1, 2], 0 ) + * r is now [ 1, 0, 0, 4 ] + * @endcode + */ +``` + +When writing example snippets, using fully qualified class names allows doxygen to add reference +links to the example. + +```c++ + /** + * ... + * + * @code + * auto result1 = make_column( ); // reference link will not be created + * auto result2 = cudf::make_column( ); // reference link will be created + * @endcode + */ +``` + +Although using three backtick marks `` ``` `` for example blocks will work too, they do not stand +out as well in vs-code and other source editors. + +Do not use the `@example` tag in the comments for a declaration, or doxygen will interpret the +entire source file as example source code. The source file is then published under a separate +_Examples_ page in the output. + +### Deprecations + +Add a single [@deprecated](https://www.doxygen.nl/manual/commands.html#cmddeprecated) comment line +to comment blocks for APIs that will be removed in future releases. Mention alternative / +replacement APIs in the deprecation comment. + +```c++ +/** + * ... + * + * @deprecated This function is deprecated. Use another new function instead. + */ +``` + +## Namespaces + +Doxygen output includes a _Namespaces_ page that shows all the namespaces declared with comment +blocks in the processed files. Here is an example of a doxygen description comment for a namespace +declaration. + +```c++ +/** + * @brief cuSpatial interfaces + * + * This is the top-level namespace which contains all cuSpatial functions and types. + */ +namespace cuspatial { +``` + +A description comment should be included only once for each unique namespace declaration. Otherwise, +if more than one description is found, doxygen aggregates the descriptions in an arbitrary order in +the output pages. + +If you introduce a new namespace, provide a description block for only one declaration and not for +every occurrence. + +## Groups/Modules + +Grouping declarations into modules helps users to find APIs in the doxygen pages. Generally, common +functions are already grouped logically into header files but doxygen does not automatically group +them this way in its output. + +The doxygen output includes a _Modules_ page that organizes items into groups specified using the +[Grouping doxygen commands](http://www.doxygen.nl/manual/grouping.html). These commands can group +common functions across header files, source files, and even namespaces. Groups can also be nested +by defining new groups within existing groups. + +For libcuspatial, all the group hierarchy is defined in the +[doxygen_groups.h](../include/doxygen_groups.h) header file. The `doxygen_groups.h` file does not +need to be included in any other source file, because the definitions in this file are used only by +the doxygen tool to generate groups in the _Modules_ page. Modify this file only to add or update +groups. The existing groups have been carefully structured and named, so new groups should be added +thoughtfully. + +When creating a new API, specify its group using the +[@ingroup](http://www.doxygen.nl/manual/commands.html#cmdingroup) tag and the group reference id +from the [doxygen_groups.h](../include/doxygen_groups.h) file. + +```c++ +namespace cuspatial { + +/** + * @brief ... + * + * @ingroup distance + * + * @tparam... + * @param ... + * @return ... + */ +template +OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); + +} // namespace cuspatial +``` + +You can also use the `@addtogroup` with a `@{ ... @}` pair to automatically include doxygen comment +blocks as part of a group. + +```c++ +namespace cuspatial { +/** + * @addtogroup coordinate_transform + * @{ + */ + +/** + * @brief ... + * + * @param ... + * @return ... + */ +template +OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); + +/** @} */ +} // namespace cuspatial +``` + +This just saves adding `@ingroup` to individual doxygen comment blocks within a file. Make sure a +blank line is included after the `@addtogroup` command block so doxygen knows it does not apply to +whatever follows in the source code. Note that doxygen will not assign groups to items if the +`@addtogroup` with `@{ ... @}` pair includes a namespace declaration. So include the `@addtogroup` +and `@{ ... @}` between the namespace declaration braces as shown in the example above. + +Summary of groups tags +| Tag/Command | Where to use | +| ----------- | ------------ | +| `@defgroup` | For use only in [doxygen_groups.h](../include/doxygen_groups.h) and should include the group's title. | +| `@ingroup` | Use inside individual doxygen block comments for declaration statements in a header file. | +| `@addtogroup` | Use instead of `@ingroup` for multiple declarations in the same file within a namespace declaration. Do not specify a group title. | +| `@{ ... @}` | Use only with `@addtogroup`. | + +## Build Doxygen Output + +We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux package manager +(`sudo apt install doxygen`). Alternatively you can +[build and install doxygen from source](http://www.doxygen.nl/manual/install.html). + +To build the libcuspatial HTML documentation simply run the `doxygen` command from the `cpp/doxygen` +directory containing the `Doxyfile`. The libcuspatial documentation can also be built using +`make docs_cuspatial` from the cmake build directory (e.g. `cpp/build`). Doxygen reads and processes +all appropriate source files under the `cpp/include/` directory. The output is generated in the +`cpp/doxygen/html/` directory. You can load the local `index.html` file generated there into any web +browser to view the result. + +To view docs built on a remote server, you can run a simple HTTP server using Python: +`cd html && python -m http.server`. Then open `http://:8000` in your local web browser, +inserting the IP address of the machine on which you ran the HTTP server. + +The doxygen output is intended for building documentation only for the public APIs and classes. For +example, the output should not include documentation for `detail` or `/src` files, and these +directories are excluded in the `Doxyfile` configuration. When published by the RAPIDS build/CI +system, the doxygen output appears on the +[RAPIDS web site](https://docs.rapids.ai/api/libcuspatial/stable/index.html). From 03eeb9d3075ff29ff042bffe3c96962992ac5083 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 17:53:36 +1000 Subject: [PATCH 02/20] Add testing.md --- cpp/doc/TESTING.md | 105 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) create mode 100644 cpp/doc/TESTING.md diff --git a/cpp/doc/TESTING.md b/cpp/doc/TESTING.md new file mode 100644 index 000000000..fcc60ecfc --- /dev/null +++ b/cpp/doc/TESTING.md @@ -0,0 +1,105 @@ +# Unit Testing in libcuspatial + +Unit tests in libcuspatial are written using +[Google Test](https://github.com/google/googletest/blob/master/docs/primer.md). + +## Best Practices: What Should We Test? + +In general we should test to make sure all code paths are covered. This is not always easy or +possible. But generally this means we test all supported combinations of algorithms and data types, +and the main iterator and container types supported by algorithms. Here are some other guidelines. + + * In general empty input is not an error in libcuspatial. Typically empty input results in empty + output. Tests should verify this. + + * Most algorithms should have one or more tests exercising inputs with a large enough number of + rows to require launching multiple thread blocks, especially when values are ultimately + communicated between blocks (e.g. reductions). This is especially important for custom kernels + but also applies to Thrust and CUB algorithm calls with lambdas / functors. + + * Test exceptional cases. For example, anything that causes the function to `throw`. + + * Test boundary cases. For example points that fall exactly on lines or boundaries. + +## Header-only and Column-based API tests + +libcuspatial currently has two C++ APIs: the column-based API uses libcudf data structures as +input and output. These tests live in `cpp/tests/` and can use libcudf features for constructing +columns and tables. The header-only API does not depend on libcudf at all and so tests of these +APIs should not include any libcudf headers. These tests currently live in `cpp/tests/experimental`. + +## Directory and File Naming + +The naming of unit test directories and source files should be consistent with the feature being +tested. For example, the tests for APIs in `point_in_polygon.hpp` should live in +`cuspatial/cpp/tests/point_in_polygon_test.cpp`. Each feature (or set of related features) should +have its own test source file named `_test.cu/cpp`. + +In the interest of improving compile time, whenever possible, test source files should be `.cpp` +files because `nvcc` is slower than `gcc` in compiling host code. Note that `thrust::device_vector` +includes device code, and so must only be used in `.cu` files. `rmm::device_uvector`, +`rmm::device_buffer` and the various `column_wrapper` types described later can be used in `.cpp` +files, and are therefore preferred in test code over `thrust::device_vector`. + +That said, testing header-only APIs requires CUDA compilation so should be done in `.cu` files. + +## Base Fixture + +All libcuspatial unit tests should make use of a GTest +["Test Fixture"](https://github.com/google/googletest/blob/master/docs/primer.md#test-fixtures-using-the-same-data-configuration-for-multiple-tests-same-data-multiple-tests). +Even if the fixture is empty, it should inherit from the base fixture `cuspatial::test::BaseFixture` +found in `cpp/tests/base_fixture.hpp`. This ensures that RMM is properly initialized and +finalized. `cuspatial::test::BaseFixture` already inherits from `testing::Test` and therefore it is +not necessary for your test fixtures to inherit from it. + +Example: + + class MyTestFixture : public cuspatial::test::BaseFixture {...}; + +## Typed Tests + +In general, libcuspatial features must work across all supported types (for cuspatial this +typically just means `float` and `double`). In order to automate the process of running +the same tests across multiple types, we use GTest's +[Typed Tests](https://github.com/google/googletest/blob/master/docs/advanced.md#typed-tests). +Typed tests allow you to write a test once and run it across a list of types. + +For example: + +```c++ +// Fixture must be a template +template +class TypedTestFixture : cuspatial::test::BaseFixture {...}; +using TestTypes = ::test::types; // Notice custom cudf type list type +TYPED_TEST_SUITE(TypedTestFixture, TestTypes); +TYPED_TEST(TypedTestFixture, FirstTest){ + // Access the current type using `TypeParam` + using T = TypeParam; +} +``` + +In this example, all tests using the `TypedTestFixture` fixture will run once for each type in the +list defined in `TestTypes` (`int, float, double`). + +## Utilities + +libcuspatial test utilities include `cuspatial::test::expect_vector_equivalent()` in +`cpp/tests/utility/vector_equality()`. This function compares two containers using Google Test's +approximate matching for floating-point values. It can handle vectors of `cuspatial::vec_2d`, +where `T` is `float` or `double`. It automatically copies data in device containers to host +containers before comparing, so you can pass it one host and one device vector, for example. + +Example: + +```c++ + auto h_expected = std::vector>{...}; // expected values + + auto d_actual = rmm::device_vector>{...}; // actual computed values + + cuspatial::test::expect_vector_equivalent(h_expected, d_actual); +``` + +Before creating your own test utilities, look to see if one already exists that does +what you need. If not, consider adding a new utility to do what you need. However, make sure that +the utility is generic enough to be useful for other tests and is not overly tailored to your +specific testing need. From f0031547bbe729ccbda7b09b1a1d70c57ddb1336 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 18:23:17 +1000 Subject: [PATCH 03/20] Simplify --- cpp/doc/TESTING.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/doc/TESTING.md b/cpp/doc/TESTING.md index fcc60ecfc..4c88f63ac 100644 --- a/cpp/doc/TESTING.md +++ b/cpp/doc/TESTING.md @@ -41,7 +41,7 @@ includes device code, and so must only be used in `.cu` files. `rmm::device_uvec `rmm::device_buffer` and the various `column_wrapper` types described later can be used in `.cpp` files, and are therefore preferred in test code over `thrust::device_vector`. -That said, testing header-only APIs requires CUDA compilation so should be done in `.cu` files. +Testing header-only APIs requires CUDA compilation so should be done in `.cu` files. ## Base Fixture From 994d4016213070f55d09a843d1be827cc5b8babc Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 18:23:25 +1000 Subject: [PATCH 04/20] Add BENCHMARKING.md --- cpp/doc/BENCHMARKING.md | 55 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 55 insertions(+) create mode 100644 cpp/doc/BENCHMARKING.md diff --git a/cpp/doc/BENCHMARKING.md b/cpp/doc/BENCHMARKING.md new file mode 100644 index 000000000..8bc159c43 --- /dev/null +++ b/cpp/doc/BENCHMARKING.md @@ -0,0 +1,55 @@ +# Unit Benchmarking in libcuspatial + +Unit benchmarks in libcuspatial are written using [NVBench](https://github.com/NVIDIA/nvbench). +While some existing benchmarks are written using +[Google Benchmark](https://github.com/google/benchmark), new benchmarks should use NVBench. + +The NVBench library is similar to Google Benchmark, but has several quality of life improvements +when doing GPU benchmarking such as displaying the fraction of peak memory bandwidth achieved and +details about the GPU hardware. + +Both NVBench and Google Benchmark provide many options for specifying ranges of parameters to +benchmark, as well as to control the time unit reported, among other options. Refer to existing +benchmarks in `cpp/benchmarks` to understand the options. + +## Directory and File Naming + +The naming of unit benchmark directories and source files should be consistent with the feature +being benchmarked. For example, the benchmarks for APIs in `point_in_polygon.hpp` should live in +`cpp/benchmarks/point_in_polygon.cu`. Each feature (or set of related features) should have its own +benchmark source file named `.cu/cpp`. + +In the interest of improving compile time, whenever possible, test source files should be `.cpp` +files because `nvcc` is slower than `gcc` in compiling host code. Note that `thrust::device_vector` +includes device code, and so must only be used in `.cu` files. `rmm::device_uvector`, +`rmm::device_buffer` and the various `column_wrapper` types described in [Testing](TESTING.md) +can be used in `.cpp` files, and are therefore preferred in test code over `thrust::device_vector`. + +Testing header-only APIs requires CUDA compilation so should be done in `.cu` files. + +## CUDA Asynchrony and benchmark accuracy + +CUDA computations and operations like copies are typically asynchronous with respect to host code, +so it is important to carefully synchronize in order to ensure the benchmark timing is not stopped +before the feature you are benchmarking has completed. An RAII helper class `cuda_event_timer` is +provided in `cpp/benchmarks/synchronization/synchronization.hpp` to help with this. This class +can also optionally clear the GPU L2 cache in order to ensure cache hits do not artificially inflate +performance in repeated iterations. + +## Data generation + +For generating benchmark input data, random data generation functions are provided in +`cpp/benchmarks/utility/random.cuh`. The input data generation happens on device. + +## What should we benchmark? + +In general, we should benchmark all features over a range of data sizes and types, so that we can +catch regressions across libcudf changes. However, running many benchmarks is expensive, so ideally +we should sample the parameter space in such a way to get good coverage without having to test +exhaustively. + +A rule of thumb is that we should benchmark with enough data to reach the point where the algorithm +reaches its saturation bottleneck, whether that bottleneck is bandwidth or computation. Using data +sets larger than this point is generally not helpful, except in specific cases where doing so +exercises different code and can therefore uncover regressions that smaller benchmarks will not +(this should be rare). From 5f7236d7daad7bd3ba5939e02e7e9ec14fcb659b Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 18:47:51 +1000 Subject: [PATCH 05/20] Move developer guide docs --- cpp/{doc => doxygen/developer_guide}/BENCHMARKING.md | 0 cpp/{doc => doxygen/developer_guide}/DEVELOPER_GUIDE.md | 0 cpp/{doc => doxygen/developer_guide}/TESTING.md | 0 .../developer_guide}/libcuspatial_refactoring_guide.md | 0 4 files changed, 0 insertions(+), 0 deletions(-) rename cpp/{doc => doxygen/developer_guide}/BENCHMARKING.md (100%) rename cpp/{doc => doxygen/developer_guide}/DEVELOPER_GUIDE.md (100%) rename cpp/{doc => doxygen/developer_guide}/TESTING.md (100%) rename cpp/{doc => doxygen/developer_guide}/libcuspatial_refactoring_guide.md (100%) diff --git a/cpp/doc/BENCHMARKING.md b/cpp/doxygen/developer_guide/BENCHMARKING.md similarity index 100% rename from cpp/doc/BENCHMARKING.md rename to cpp/doxygen/developer_guide/BENCHMARKING.md diff --git a/cpp/doc/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md similarity index 100% rename from cpp/doc/DEVELOPER_GUIDE.md rename to cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md diff --git a/cpp/doc/TESTING.md b/cpp/doxygen/developer_guide/TESTING.md similarity index 100% rename from cpp/doc/TESTING.md rename to cpp/doxygen/developer_guide/TESTING.md diff --git a/cpp/doc/libcuspatial_refactoring_guide.md b/cpp/doxygen/developer_guide/libcuspatial_refactoring_guide.md similarity index 100% rename from cpp/doc/libcuspatial_refactoring_guide.md rename to cpp/doxygen/developer_guide/libcuspatial_refactoring_guide.md From f1666e015ec92997bd3905c7afc6135116d06c1e Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 19:46:41 +1000 Subject: [PATCH 06/20] rename refactoring guide. --- .../{libcuspatial_refactoring_guide.md => REFACTORING_GUIDE.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename cpp/doxygen/developer_guide/{libcuspatial_refactoring_guide.md => REFACTORING_GUIDE.md} (100%) diff --git a/cpp/doxygen/developer_guide/libcuspatial_refactoring_guide.md b/cpp/doxygen/developer_guide/REFACTORING_GUIDE.md similarity index 100% rename from cpp/doxygen/developer_guide/libcuspatial_refactoring_guide.md rename to cpp/doxygen/developer_guide/REFACTORING_GUIDE.md From 814beef00ba576776c25f799af9305a283d9a937 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 19:47:25 +1000 Subject: [PATCH 07/20] Add layout that links to developer guide. --- cpp/doxygen/Doxyfile | 9 +- cpp/doxygen/DoxygenLayout.xml | 195 ++++++++++++++++++ .../developer_guide/DEVELOPER_GUIDE.md | 2 +- cpp/doxygen/main_page.md | 3 +- cpp/doxygen/modify_fences.sh | 9 + 5 files changed, 214 insertions(+), 4 deletions(-) create mode 100644 cpp/doxygen/DoxygenLayout.xml create mode 100755 cpp/doxygen/modify_fences.sh diff --git a/cpp/doxygen/Doxyfile b/cpp/doxygen/Doxyfile index c21d67546..65338f4a8 100644 --- a/cpp/doxygen/Doxyfile +++ b/cpp/doxygen/Doxyfile @@ -726,7 +726,7 @@ FILE_VERSION_FILTER = # DoxygenLayout.xml, doxygen will parse it automatically even if the LAYOUT_FILE # tag is left empty. -LAYOUT_FILE = +LAYOUT_FILE = DoxygenLayout.xml # The CITE_BIB_FILES tag can be used to specify one or more bib files containing # the reference definitions. This must be a list of .bib files. The .bib @@ -815,6 +815,11 @@ WARN_LOGFILE = # Note: If this tag is empty the current directory is searched. INPUT = main_page.md \ + developer_guide/BENCHMARKING.md \ + developer_guide/DOCUMENTATION.md \ + developer_guide/DEVELOPER_GUIDE.md \ + developer_guide/TESTING.md \ + developer_guide/REFACTORING_GUIDE.md \ ../include # This tag can be used to specify the character encoding of the source files @@ -950,7 +955,7 @@ INPUT_FILTER = # need to set EXTENSION_MAPPING for the extension otherwise the files are not # properly processed by doxygen. -FILTER_PATTERNS = +FILTER_PATTERNS = *.md=./modify_fences.sh # If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using # INPUT_FILTER) will also be used to filter the input files that are used for diff --git a/cpp/doxygen/DoxygenLayout.xml b/cpp/doxygen/DoxygenLayout.xml new file mode 100644 index 000000000..790be3755 --- /dev/null +++ b/cpp/doxygen/DoxygenLayout.xml @@ -0,0 +1,195 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md index f448446a1..e34cc5389 100644 --- a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md +++ b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md @@ -1,4 +1,4 @@ -# libcuspatial C++ Developer Guide +# libcuspatial C++ Developer Guide {#DEVELOPER_GUIDE} This document serves as a guide for contributors to libcuspatial C++ code. Developers should also refer to these additional files for further documentation of libcuspatial best practices. diff --git a/cpp/doxygen/main_page.md b/cpp/doxygen/main_page.md index ea2a448e3..c303e0199 100644 --- a/cpp/doxygen/main_page.md +++ b/cpp/doxygen/main_page.md @@ -1,3 +1,4 @@ # libcuspatial -libcuspatial is a GPU-accelerated C++ library for spatial data analysis including distance and trajectory computations, spatial data indexing and spatial join operations. +libcuspatial is a GPU-accelerated C++ library for spatial data analysis including distance and +trajectory computations, spatial data indexing and spatial join operations. diff --git a/cpp/doxygen/modify_fences.sh b/cpp/doxygen/modify_fences.sh new file mode 100755 index 000000000..195f60d8a --- /dev/null +++ b/cpp/doxygen/modify_fences.sh @@ -0,0 +1,9 @@ +#!/bin/bash + +# Copyright (c) 2022, NVIDIA CORPORATION. + +# This script modifies the GitHub Markdown style code fences in our MD files +# into the PHP style that Doxygen supports, allowing us to display code +# properly both on the GitHub GUI and in published Doxygen documentation. + +sed 's/```c++/```{.cpp}/g' "$@" From 84bfaa090f3106b31c5d6153214331d18897315d Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 19:49:19 +1000 Subject: [PATCH 08/20] Remove code font from headers --- cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md index e34cc5389..155833e5a 100644 --- a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md +++ b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md @@ -558,7 +558,7 @@ custom_memory_resource *mr...; rmm::device_buffer custom_buff(100, mr, stream); ``` -#### `rmm::device_scalar` +#### rmm::device_scalar Allocates a single element of the specified type initialized to the specified value. Use this for scalar input/outputs into device kernels, e.g., reduction results, null count, etc. This is effectively a convenience wrapper around a `rmm::device_vector` of length 1. @@ -576,7 +576,7 @@ kernel<<<...>>>(int_scalar.data(),...); int host_value = int_scalar.value(); ``` -#### `rmm::device_vector` +#### rmm::device_vector Allocates a specified number of elements of the specified type. If no initialization value is provided, all elements are default initialized (this incurs a kernel launch). @@ -590,7 +590,7 @@ utilities enable creation of `uvector`s from host-side vectors, or creating zero `uvector`s, so that they are as convenient to use as `device_vector`. Avoiding `device_vector` has a number of benefits, as described in the following section on `rmm::device_uvector`. -#### `rmm::device_uvector` +#### rmm::device_uvector Similar to a `device_vector`, allocates a contiguous set of elements in device memory but with key differences: @@ -632,7 +632,7 @@ group a broad set of functions, further namespaces may be used. Many functions are not meant for public use, so place them in either the `detail` or an *anonymous* namespace, depending on the situation. -#### `detail` namespace +#### detail namespace Functions or objects that will be used across *multiple* translation units (i.e., source files), should be exposed in an internal header file and placed in the `detail` namespace. Example: From dfc93ec509fe569a84ccb9d9e30c0a80c82b3536 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 19:50:32 +1000 Subject: [PATCH 09/20] Missed one. --- cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md index 155833e5a..93bdf922e 100644 --- a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md +++ b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md @@ -532,7 +532,7 @@ libcuspatial code eschews raw pointers and direct memory allocation. Use RMM cla use [`device_memory_resource`](https://github.com/rapidsai/rmm/#device_memory_resource) for device memory allocation with automated lifetime management. -#### `rmm::device_buffer` +#### rmm::device_buffer Allocates a specified number of bytes of untyped, uninitialized device memory using a `device_memory_resource`. If no resource is explicitly provided, uses `rmm::mr::get_current_device_resource()`. From a1b379d7b030679d594aceaea19d0d1e8ad86ef1 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 20:19:00 +1000 Subject: [PATCH 10/20] Move documentation and escape @ commands. --- cpp/doc/DOCUMENTATION.md | 563 ------------------- cpp/doxygen/developer_guide/DOCUMENTATION.md | 530 +++++++++++++++++ 2 files changed, 530 insertions(+), 563 deletions(-) delete mode 100644 cpp/doc/DOCUMENTATION.md create mode 100644 cpp/doxygen/developer_guide/DOCUMENTATION.md diff --git a/cpp/doc/DOCUMENTATION.md b/cpp/doc/DOCUMENTATION.md deleted file mode 100644 index a4911a925..000000000 --- a/cpp/doc/DOCUMENTATION.md +++ /dev/null @@ -1,563 +0,0 @@ -# libcuspatial C++ Documentation Guide - -These guidelines apply to documenting all libcuspatial C++ source files using doxygen style -formatting although only public APIs and classes are actually -[published](https://docs.rapids.ai/api/libcuspatial/stable/index.html). - -## Copyright License - -The following is the license header comment that should appear at the beginning of every C++ -source file. - -```c++ -/* - * Copyright (c) 2022, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -``` - -The comment should start with `/*` and not `/**` so it is not processed by doxygen. - -Also, here are the rules for the copyright year. - -- A new file should have the year in which it was created -- A modified file should span the year it was created and the year it was modified (e.g. - `2019-2021`) - -Changing the copyright year may not be necessary if no content has changed (e.g. reformatting only). - -## Doxygen - -The [doxygen tool](http://www.doxygen.nl/manual/index.html) is used to generate HTML pages from the -C++ comments in the source code. Doxygen recognizes and parses block comments and performs -specialized output formatting when it encounters -[doxygen commands](http://www.doxygen.nl/manual/commands.html). - -There are almost 200 commands (also called tags in this document) that doxygen recognizes in comment -blocks. This document provides guidance on which commands/tags to use and how to use them in the -libcuspatial C++ source code. - -The doxygen process can be customized using options in the [Doxyfile](../doxygen/Doxyfile). - -Here are some of the custom options in the Doxyfile for libcuspatial. -| Option | Setting | Description | -| ------ | ------- | ----------- | -| PROJECT_NAME | libcuspatial | Title used on the main page | -| PROJECT_NUMBER | 22.10.00 | Version number | -| EXTENSION_MAPPING | cu=C++ cuh=C++ | Process `cu` and `cuh` as C++ | -| INPUT | main_page.md ../include | Embedded markdown files and source code directories to process | -| FILE_PATTERNS | *.cpp *.hpp *.h *.c *.cu *.cuh | File extensions to process | - -## Block Comments - -Use the following style for block comments describing functions, classes and other types, groups, -and files. - -```c++ -/** - * description text and - * doxygen tags go here - */ -``` - -Doxygen comment blocks start with `/**` and end with `*/` only, and with nothing else on those -lines. Do not add dashes `-----` or extra asterisks `*****` to the first and last lines of a doxygen -block. The block must be placed immediately before the source code line to which it refers. The -block may be indented to line up vertically with the item it documents as appropriate. See the -[Example](#the-example) section below. - -Each line in the comment block between the `/**` and `*/` lines should start with a space followed -by an asterisk. Any text on these lines, including tag declarations, should start after a single -space after the asterisk. - -## Tag/Command names - -Use `@` to prefix doxygen commands (e.g. `@brief`, `@code`, etc.) - -## Markdown - -The doxygen tool supports a limited set of markdown format in the comment block including links, -tables, lists, etc. In some cases a trade-off may be required for readability in the source text -file versus the readability in the doxygen formatted web pages. For example, there are some -limitations on readability with '%' character and pipe character '|' within a markdown table. - -Avoid using direct HTML tags. Although doxygen supports markdown and markdown supports HTML tags, -the HTML support for doxygen's markdown is also limited. - -## The Example - -The following example covers most of the doxygen block comment and tag styles for documenting C++ -code in libcuspatial. - -```c++ -/** - * @file source_file.cpp - * @brief Description of source file contents - * - * Longer description of the source file contents. - */ - -/** - * @brief One sentence description of the class. - * - * @ingroup optional_predefined_group_id - * - * Longer, more detailed description of the class. - * - * @tparam T Short description of each template parameter - * @tparam U Short description of each template parameter - */ -template -class example_class { - - void get_my_int(); ///< Simple members can be documented like this - void set_my_int( int value ); ///< Try to use descriptive member names - - /** - * @brief Short, one sentence description of the member function. - * - * A more detailed description of what this function does and what - * its logic does. - * - * @code - * example_class inst; - * inst.set_my_int(5); - * int output = inst.complicated_function(1,dptr,fptr); - * @endcode - * - * @param[in] first This parameter is an input parameter to the function - * @param[in,out] second This parameter is used both as an input and output - * @param[out] third This parameter is an output of the function - * - * @return The result of the complex function - */ - T complicated_function(int first, double* second, float* third) - { - // Do not use doxygen-style block comments - // for code logic documentation. - } - - private: - int my_int; ///< An example private member variable -}; - -/** - * @brief Short, one sentence description of this free function. - * - * @ingroup optional_predefined_group_id - * - * A detailed description must start after a blank line. - * - * @code - * template - * struct myfunctor { - * bool operator()(T input) { return input % 2 > 0; } - * }; - * free_function(myfunctor{},12); - * @endcode - * - * @throw cuspatial::logic_error if `input_argument` is negative or zero - * - * @tparam functor_type The type of the functor - * @tparam input_type The datatype of the input argument - * - * @param[in] functor The functor to be called on the input argument - * @param[in] input_argument The input argument passed into the functor - * @return The result of calling the functor on the input argument - */ -template -bool free_function(functor_type functor, input_type input_argument) -{ - CUSPATIAL_EXPECTS( input_argument > 0, "input_argument must be positive"); - return functor(input_argument); -} - -/** - * @brief Short, one sentence description. - * - * @ingroup optional_predefined_group_id - * - * Optional, longer description. - */ -enum class example_enum { - first_enum, ///< Description of the first enum - second_enum, ///< Description of the second enum - third_enum ///< Description of the third enum -}; - -/** - * @internal - * @brief The "@internal" tag can be used to indicate rendered HTML docs for a function should not - * be generated. - */ -void internal_function() -{ - ... -} -``` - -## Descriptions - -The comment description should clearly detail how the output(s) are created from any inputs. Include -any performance and any boundary considerations. Also include any limits on parameter values and if -any default values are declared. Also, try to include a short [example](#inline-examples) if -possible. - -### @brief - -The `@brief` text should be a short, one sentence description. Doxygen does not provide much space -to show this text in the output pages. Always follow the `@brief` line with a blank comment line. - -The longer description is the rest of the comment text that is not tagged with any doxygen command. - -```c++ -/** - * @brief Short description. - * - * Long description. - * -``` - -### @copydoc - -Documentation for declarations in headers should be clear and complete. -You can use the `@copydoc` tag to avoid duplicating the comment block for a function definition. - -```c++ - /** - * @copydoc complicated_function(int,double*,float*) - * - * Any extra documentation. - */ -``` - -Also, `@copydoc` is useful when documenting a `detail` function that differs only by the `stream` parameter. - -```c++ -/** - * @copydoc cudf::segmented_count_set_bits(bitmask_type const*,std::vector const&) - * - * @param[in] stream Optional CUDA stream on which to execute kernels - */ -std::vector segmented_count_set_bits(bitmask_type const* bitmask, - std::vector const& indices, - rmm::cuda_stream_view stream = cudf::default_stream_value); -``` - -Note, you must specify the whole signature of the function, including optional parameters, so that -doxygen will be able to locate it. - -### Function parameters - -The following tags should appear near the end of function comment block in the order specified here: - -| Command | Description | -| ------- | ----------- | -| [@throw](#throw) | Specify the conditions in which the function may throw an exception | -| [@tparam](#tparam) | Description for each template parameter | -| [@param](#param) | Description for each function parameter | -| [@return](#return) | Short description of object or value returned | -| [@pre](#pre) | Specify any preconditions / requirements for the function | - -#### @throw - -Add an [@throw](http://www.doxygen.nl/manual/commands.html#cmdthrow) comment line in the doxygen -block for each exception that the function may throw. You only need to include exceptions thrown by -the function itself. If the function calls another function that may throw an exception, you do not -need to document those exceptions here. - -Include the name of the exception without backtick marks so doxygen can add reference links correctly. - -```c++ - /** - * ... - * @throw cuspatial::logic_error if `input_argument` is negative or zero - */ -``` - -Using `@throws` is also acceptable but vs-code and other tools only do syntax highlighting on -`@throw`. - -#### @tparam - -Add a [@tparam](http://www.doxygen.nl/manual/commands.html#cmdtparam) comment line for each template -parameter declared by this function. The name of the parameter specified after the doxygen tag must -match exactly to the template parameter name. - -```c++ - /** - * ... - * @tparam functor_type The type of the functor - * @tparam input_type The datatype of the input argument - */ -``` - -The definition should detail the requirements of the parameter. -For example, if the template is for a functor or predicate, then describe the expected input types -and output. - -#### @param - -Add a [@param](http://www.doxygen.nl/manual/commands.html#cmdparam) comment line for each function -parameter passed to this function. The name of the parameter specified after the doxygen tag must -match the function's parameter name. Also include append `[in]`, `[out]` or `[in,out]` to the -`@param` if it is not clear from the declaration and the parameter name itself. - -```c++ - /** - * ... - * @param[in] first This parameter is an input parameter to the function - * @param[in,out] second This parameter is used both as an input and output - * @param[out] third This parameter is an output of the function - */ -``` - -It is also recommended to vertically align the 3 columns of text if possible to make it easier to -read in a source code editor. - -#### @return - -Add a single [@return](http://www.doxygen.nl/manual/commands.html#cmdreturn) comment line at the end -of the comment block if the function returns an object or value. Include a brief description of what -is returned. - -```c++ -/** - * ... - * - * @return A new column of type INT32 and no nulls - */ -``` - -Do not include the type of the object returned with the `@return` comment. - -#### @pre - -Add a [@pre](https://www.doxygen.nl/manual/commands.html#cmdpre) comment line for each precondition -of the function. For example, assumptions about accessibility or range of iterators. - -```c++ -/** - * ... - * - * @pre All iterators must have the same `Location` type, with the same underlying floating-point - * coordinate type (e.g. `cuspatial::vec_2d`). - */ -``` - -### Inline Examples - -It is usually helpful to include a source code example inside your comment block when documenting a -function or other declaration. Use the [@code](http://www.doxygen.nl/manual/commands.html#cmdcode) -and [@endcode](http://www.doxygen.nl/manual/commands.html#cmdendcode) pair to include inline -examples. - -Doxygen supports syntax highlighting for C++ and several other programming languages (e.g. Python, -Java). By default, the `@code` tag uses syntax highlighting based on the source code in which it is -found. - -```c++ - /** - * ... - * - * @code - * auto result = rmm::device_uvector(...); - * @endcode - */ -``` - -You can specify a different language by indicating the file extension in the tag: - -```c++ - /** - * ... - * - * @code{.py} - * import cudf - * s = cudf.Series([1,2,3]) - * @endcode - */ -``` - -If you wish to use pseudocode in your example, use the following: - -```c++ - /** - * ... - * - * Sometimes pseudo-code is clearer. - * @code{.pseudo} - * s = int column of [ 1, 2, null, 4 ] - * r = fill( s, [1, 2], 0 ) - * r is now [ 1, 0, 0, 4 ] - * @endcode - */ -``` - -When writing example snippets, using fully qualified class names allows doxygen to add reference -links to the example. - -```c++ - /** - * ... - * - * @code - * auto result1 = make_column( ); // reference link will not be created - * auto result2 = cudf::make_column( ); // reference link will be created - * @endcode - */ -``` - -Although using three backtick marks `` ``` `` for example blocks will work too, they do not stand -out as well in vs-code and other source editors. - -Do not use the `@example` tag in the comments for a declaration, or doxygen will interpret the -entire source file as example source code. The source file is then published under a separate -_Examples_ page in the output. - -### Deprecations - -Add a single [@deprecated](https://www.doxygen.nl/manual/commands.html#cmddeprecated) comment line -to comment blocks for APIs that will be removed in future releases. Mention alternative / -replacement APIs in the deprecation comment. - -```c++ -/** - * ... - * - * @deprecated This function is deprecated. Use another new function instead. - */ -``` - -## Namespaces - -Doxygen output includes a _Namespaces_ page that shows all the namespaces declared with comment -blocks in the processed files. Here is an example of a doxygen description comment for a namespace -declaration. - -```c++ -/** - * @brief cuSpatial interfaces - * - * This is the top-level namespace which contains all cuSpatial functions and types. - */ -namespace cuspatial { -``` - -A description comment should be included only once for each unique namespace declaration. Otherwise, -if more than one description is found, doxygen aggregates the descriptions in an arbitrary order in -the output pages. - -If you introduce a new namespace, provide a description block for only one declaration and not for -every occurrence. - -## Groups/Modules - -Grouping declarations into modules helps users to find APIs in the doxygen pages. Generally, common -functions are already grouped logically into header files but doxygen does not automatically group -them this way in its output. - -The doxygen output includes a _Modules_ page that organizes items into groups specified using the -[Grouping doxygen commands](http://www.doxygen.nl/manual/grouping.html). These commands can group -common functions across header files, source files, and even namespaces. Groups can also be nested -by defining new groups within existing groups. - -For libcuspatial, all the group hierarchy is defined in the -[doxygen_groups.h](../include/doxygen_groups.h) header file. The `doxygen_groups.h` file does not -need to be included in any other source file, because the definitions in this file are used only by -the doxygen tool to generate groups in the _Modules_ page. Modify this file only to add or update -groups. The existing groups have been carefully structured and named, so new groups should be added -thoughtfully. - -When creating a new API, specify its group using the -[@ingroup](http://www.doxygen.nl/manual/commands.html#cmdingroup) tag and the group reference id -from the [doxygen_groups.h](../include/doxygen_groups.h) file. - -```c++ -namespace cuspatial { - -/** - * @brief ... - * - * @ingroup distance - * - * @tparam... - * @param ... - * @return ... - */ -template -OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); - -} // namespace cuspatial -``` - -You can also use the `@addtogroup` with a `@{ ... @}` pair to automatically include doxygen comment -blocks as part of a group. - -```c++ -namespace cuspatial { -/** - * @addtogroup coordinate_transform - * @{ - */ - -/** - * @brief ... - * - * @param ... - * @return ... - */ -template -OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); - -/** @} */ -} // namespace cuspatial -``` - -This just saves adding `@ingroup` to individual doxygen comment blocks within a file. Make sure a -blank line is included after the `@addtogroup` command block so doxygen knows it does not apply to -whatever follows in the source code. Note that doxygen will not assign groups to items if the -`@addtogroup` with `@{ ... @}` pair includes a namespace declaration. So include the `@addtogroup` -and `@{ ... @}` between the namespace declaration braces as shown in the example above. - -Summary of groups tags -| Tag/Command | Where to use | -| ----------- | ------------ | -| `@defgroup` | For use only in [doxygen_groups.h](../include/doxygen_groups.h) and should include the group's title. | -| `@ingroup` | Use inside individual doxygen block comments for declaration statements in a header file. | -| `@addtogroup` | Use instead of `@ingroup` for multiple declarations in the same file within a namespace declaration. Do not specify a group title. | -| `@{ ... @}` | Use only with `@addtogroup`. | - -## Build Doxygen Output - -We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux package manager -(`sudo apt install doxygen`). Alternatively you can -[build and install doxygen from source](http://www.doxygen.nl/manual/install.html). - -To build the libcuspatial HTML documentation simply run the `doxygen` command from the `cpp/doxygen` -directory containing the `Doxyfile`. The libcuspatial documentation can also be built using -`make docs_cuspatial` from the cmake build directory (e.g. `cpp/build`). Doxygen reads and processes -all appropriate source files under the `cpp/include/` directory. The output is generated in the -`cpp/doxygen/html/` directory. You can load the local `index.html` file generated there into any web -browser to view the result. - -To view docs built on a remote server, you can run a simple HTTP server using Python: -`cd html && python -m http.server`. Then open `http://:8000` in your local web browser, -inserting the IP address of the machine on which you ran the HTTP server. - -The doxygen output is intended for building documentation only for the public APIs and classes. For -example, the output should not include documentation for `detail` or `/src` files, and these -directories are excluded in the `Doxyfile` configuration. When published by the RAPIDS build/CI -system, the doxygen output appears on the -[RAPIDS web site](https://docs.rapids.ai/api/libcuspatial/stable/index.html). diff --git a/cpp/doxygen/developer_guide/DOCUMENTATION.md b/cpp/doxygen/developer_guide/DOCUMENTATION.md new file mode 100644 index 000000000..65a5fd044 --- /dev/null +++ b/cpp/doxygen/developer_guide/DOCUMENTATION.md @@ -0,0 +1,530 @@ +# libcuspatial C++ Documentation Guide + +These guidelines apply to documenting all libcuspatial C++ source files using doxygen style +formatting although only public APIs and classes are actually +[published](https://docs.rapids.ai/api/libcuspatial/stable/index.html). + +## Copyright License + +The following is the license header comment that should appear at the beginning of every C++ +source file. + + /* + * Copyright (c) 2022, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +The comment should start with `/*` and not `/**` so it is not processed by doxygen. + +Also, here are the rules for the copyright year. + +- A new file should have the year in which it was created +- A modified file should span the year it was created and the year it was modified (e.g. + `2019-2021`) + +Changing the copyright year may not be necessary if no content has changed (e.g. reformatting only). + +## Doxygen + +The [doxygen tool](https://www.doxygen.nl/manual/index.html) is used to generate HTML pages from the +C++ comments in the source code. Doxygen recognizes and parses block comments and performs +specialized output formatting when it encounters +[doxygen commands](https://www.doxygen.nl/manual/commands.html). + +There are almost 200 commands (also called tags in this document) that doxygen recognizes in comment +blocks. This document provides guidance on which commands/tags to use and how to use them in the +libcuspatial C++ source code. + +The doxygen process can be customized using options in the [Doxyfile](../doxygen/Doxyfile). + +Here are some of the custom options in the Doxyfile for libcuspatial. +| Option | Setting | Description | +| ------ | ------- | ----------- | +| PROJECT_NAME | libcuspatial | Title used on the main page | +| PROJECT_NUMBER | 22.10.00 | Version number | +| EXTENSION_MAPPING | cu=C++ cuh=C++ | Process `cu` and `cuh` as C++ | +| INPUT | main_page.md ../include | Embedded markdown files and source code directories to process | +| FILE_PATTERNS | *.cpp *.hpp *.h *.c *.cu *.cuh | File extensions to process | + +## Block Comments + +Use the following style for block comments describing functions, classes and other types, groups, +and files. + + /** + * description text and + * doxygen tags go here + */ + +Doxygen comment blocks start with `/**` and end with `*/` only, and with nothing else on those +lines. Do not add dashes `-----` or extra asterisks `*****` to the first and last lines of a doxygen +block. The block must be placed immediately before the source code line to which it refers. The +block may be indented to line up vertically with the item it documents as appropriate. See the +[Example](#the-example) section below. + +Each line in the comment block between the `/**` and `*/` lines should start with a space followed +by an asterisk. Any text on these lines, including tag declarations, should start after a single +space after the asterisk. + +## Tag/Command names + +Use @ to prefix doxygen commands (e.g. \@brief, \@code, etc.) + +## Markdown + +The doxygen tool supports a limited set of markdown format in the comment block including links, +tables, lists, etc. In some cases a trade-off may be required for readability in the source text +file versus the readability in the doxygen formatted web pages. For example, there are some +limitations on readability with '%' character and pipe character '|' within a markdown table. + +Avoid using direct HTML tags. Although doxygen supports markdown and markdown supports HTML tags, +the HTML support for doxygen's markdown is also limited. + +## The Example + +The following example covers most of the doxygen block comment and tag styles for documenting C++ +code in libcuspatial. + + /** + * @file source_file.cpp + * @brief Description of source file contents + * + * Longer description of the source file contents. + */ + + /** + * @brief One sentence description of the class. + * + * @ingroup optional_predefined_group_id + * + * Longer, more detailed description of the class. + * + * @tparam T Short description of each template parameter + * @tparam U Short description of each template parameter + */ + template + class example_class { + + void get_my_int(); ///< Simple members can be documented like this + void set_my_int( int value ); ///< Try to use descriptive member names + + /** + * @brief Short, one sentence description of the member function. + * + * A more detailed description of what this function does and what + * its logic does. + * + * @code + * example_class inst; + * inst.set_my_int(5); + * int output = inst.complicated_function(1,dptr,fptr); + * @endcode + * + * @param[in] first This parameter is an input parameter to the function + * @param[in,out] second This parameter is used both as an input and output + * @param[out] third This parameter is an output of the function + * + * @return The result of the complex function + */ + T complicated_function(int first, double* second, float* third) + { + // Do not use doxygen-style block comments + // for code logic documentation. + } + + private: + int my_int; ///< An example private member variable + }; + + /** + * @brief Short, one sentence description of this free function. + * + * @ingroup optional_predefined_group_id + * + * A detailed description must start after a blank line. + * + * @code + * template + * struct myfunctor { + * bool operator()(T input) { return input % 2 > 0; } + * }; + * free_function(myfunctor{},12); + * @endcode + * + * @throw cuspatial::logic_error if `input_argument` is negative or zero + * + * @tparam functor_type The type of the functor + * @tparam input_type The datatype of the input argument + * + * @param[in] functor The functor to be called on the input argument + * @param[in] input_argument The input argument passed into the functor + * @return The result of calling the functor on the input argument + */ + template + bool free_function(functor_type functor, input_type input_argument) + { + CUSPATIAL_EXPECTS( input_argument > 0, "input_argument must be positive"); + return functor(input_argument); + } + + /** + * @brief Short, one sentence description. + * + * @ingroup optional_predefined_group_id + * + * Optional, longer description. + */ + enum class example_enum { + first_enum, ///< Description of the first enum + second_enum, ///< Description of the second enum + third_enum ///< Description of the third enum + }; + + /** + * @internal + * @brief The "@internal" tag can be used to indicate rendered HTML docs for a function should not + * be generated. + */ + void internal_function() + { + ... + } + +## Descriptions + +The comment description should clearly detail how the output(s) are created from any inputs. Include +any performance and any boundary considerations. Also include any limits on parameter values and if +any default values are declared. Also, try to include a short [example](#inline-examples) if +possible. + +### \@brief + +The [\@brief](https://www.doxygen.nl/manual/commands.html#cmdbrief) text should be a short, one +sentence description. Doxygen does not provide much space to show this text in the output pages. +Always follow the \@brief line with a blank comment line. + +The longer description is the rest of the comment text that is not tagged with any doxygen command. + + /** + * @brief Short description. + * + * Long description. + * + + +### \@copydoc + +Documentation for declarations in headers should be clear and complete. +You can use the [\@copydoc](https://www.doxygen.nl/manual/commands.html#cmdcopydoc) tag to avoid +duplicating the comment block for a function definition. + + /** + * @copydoc complicated_function(int,double*,float*) + * + * Any extra documentation. + */ + + +Also, \@copydoc is useful when documenting a `detail` function that differs only by the `stream` parameter. + + /** + * @copydoc cudf::segmented_count_set_bits(bitmask_type const*,std::vector const&) + * + * @param[in] stream Optional CUDA stream on which to execute kernels + */ + std::vector segmented_count_set_bits(bitmask_type const* bitmask, + std::vector const& indices, + rmm::cuda_stream_view stream = cudf::default_stream_value); + +Note, you must specify the whole signature of the function, including optional parameters, so that +doxygen will be able to locate it. + +### Function parameters + +The following tags should appear near the end of function comment block in the order specified here: + +| Command | Description | +| ------- | ----------- | +| [\@throw](#throw) | Specify the conditions in which the function may throw an exception | +| [\@tparam](#tparam) | Description for each template parameter | +| [\@param](#param) | Description for each function parameter | +| [\@return](#return) | Short description of object or value returned | +| [\@pre](#pre) | Specify any preconditions / requirements for the function | + +#### \@throw + +Add an [\@throw](https://www.doxygen.nl/manual/commands.html#cmdthrow) comment line in the doxygen +block for each exception that the function may throw. You only need to include exceptions thrown by +the function itself. If the function calls another function that may throw an exception, you do not +need to document those exceptions here. + +Include the name of the exception without backtick marks so doxygen can add reference links correctly. + + /** + * ... + * @throw cuspatial::logic_error if `input_argument` is negative or zero + */ + +Using \@throws is also acceptable but VS code and other tools only do syntax highlighting on +\@throw. + +#### \@tparam + +Add a [\@tparam](https://www.doxygen.nl/manual/commands.html#cmdtparam) comment line for each +template parameter declared by this function. The name of the parameter specified after the doxygen +tag must match exactly to the template parameter name. + + /** + * ... + * @tparam functor_type The type of the functor + * @tparam input_type The datatype of the input argument + */ + +The definition should detail the requirements of the parameter. +For example, if the template is for a functor or predicate, then describe the expected input types +and output. + +#### \@param + +Add a [\@param](https://www.doxygen.nl/manual/commands.html#cmdparam) comment line for each function +parameter passed to this function. The name of the parameter specified after the doxygen tag must +match the function's parameter name. Also include append `[in]`, `[out]` or `[in,out]` to the +\@param if it is not clear from the declaration and the parameter name itself. + + /** + * ... + * @param[in] first This parameter is an input parameter to the function + * @param[in,out] second This parameter is used both as an input and output + * @param[out] third This parameter is an output of the function + */ + +It is also recommended to vertically align the 3 columns of text if possible to make it easier to +read in a source code editor. + +#### \@return + +Add a single [\@return](https://www.doxygen.nl/manual/commands.html#cmdreturn) comment line at the end +of the comment block if the function returns an object or value. Include a brief description of what +is returned. + + /** + * ... + * + * @return A new column of type INT32 and no nulls + */ + +Do not include the type of the object returned with the `@return` comment. + +#### \@pre + +Add a [\@pre](https://www.doxygen.nl/manual/commands.html#cmdpre) comment line for each precondition +of the function. For example, assumptions about accessibility or range of iterators. + + /** + * ... + * + * @pre All iterators must have the same `Location` type, with the same underlying floating-point + * coordinate type (e.g. `cuspatial::vec_2d`). + */ + +### Inline Examples + +It is usually helpful to include a source code example inside your comment block when documenting a +function or other declaration. Use the [\@code](https://www.doxygen.nl/manual/commands.html#cmdcode) +and [\@endcode](https://www.doxygen.nl/manual/commands.html#cmdendcode) pair to include inline +examples. + +Doxygen supports syntax highlighting for C++ and several other programming languages (e.g. Python, +Java). By default, the \@code tag uses syntax highlighting based on the source code in which it is +found. + + /** + * ... + * + * @code + * auto result = rmm::device_uvector(...); + * @endcode + */ + +You can specify a different language by indicating the file extension in the tag: + + /** + * ... + * + * @code{.py} + * import cudf + * s = cudf.Series([1,2,3]) + * @endcode + */ + +If you wish to use pseudocode in your example, use the following: + + /** + * ... + * + * Sometimes pseudo-code is clearer. + * @code{.pseudo} + * s = int column of [ 1, 2, null, 4 ] + * r = fill( s, [1, 2], 0 ) + * r is now [ 1, 0, 0, 4 ] + * @endcode + */ + +When writing example snippets, using fully qualified class names allows doxygen to add reference +links to the example. + + /** + * ... + * + * @code + * auto result1 = make_column( ); // reference link will not be created + * auto result2 = cudf::make_column( ); // reference link will be created + * @endcode + */ + +Although using three backtick marks \`\`\` for example blocks will work too, they do not stand +out as well in VS code and other source editors. + +Do not use the \@example tag in the comments for a declaration, or doxygen will interpret the +entire source file as example source code. The source file is then published under a separate +_Examples_ page in the output. + +### Deprecations + +Add a single [\@deprecated](https://www.doxygen.nl/manual/commands.html#cmddeprecated) comment line +to comment blocks for APIs that will be removed in future releases. Mention alternative / +replacement APIs in the deprecation comment. + + /** + * ... + * + * @deprecated This function is deprecated. Use another new function instead. + */ + +## Namespaces + +Doxygen output includes a _Namespaces_ page that shows all the namespaces declared with comment +blocks in the processed files. Here is an example of a doxygen description comment for a namespace +declaration. + + /** + * @brief cuSpatial interfaces + * + * This is the top-level namespace which contains all cuSpatial functions and types. + */ + namespace cuspatial { + +A description comment should be included only once for each unique namespace declaration. Otherwise, +if more than one description is found, doxygen aggregates the descriptions in an arbitrary order in +the output pages. + +If you introduce a new namespace, provide a description block for only one declaration and not for +every occurrence. + +## Groups/Modules + +Grouping declarations into modules helps users to find APIs in the doxygen pages. Generally, common +functions are already grouped logically into header files but doxygen does not automatically group +them this way in its output. + +The doxygen output includes a _Modules_ page that organizes items into groups specified using the +[Grouping doxygen commands](https://www.doxygen.nl/manual/grouping.html). These commands can group +common functions across header files, source files, and even namespaces. Groups can also be nested +by defining new groups within existing groups. + +For libcuspatial, all the group hierarchy is defined in the +[doxygen_groups.h](../include/doxygen_groups.h) header file. The `doxygen_groups.h` file does not +need to be included in any other source file, because the definitions in this file are used only by +the doxygen tool to generate groups in the _Modules_ page. Modify this file only to add or update +groups. The existing groups have been carefully structured and named, so new groups should be added +thoughtfully. + +When creating a new API, specify its group using the +[\@ingroup](https://www.doxygen.nl/manual/commands.html#cmdingroup) tag and the group reference id +from the [doxygen_groups.h](../include/doxygen_groups.h) file. + + namespace cuspatial { + + /** + * @brief ... + * + * @ingroup distance + * + * @tparam... + * @param ... + * @return ... + */ + template + OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); + + } // namespace cuspatial + +You can also use the \@addtogroup with a `@{ ... @}` pair to automatically include doxygen comment +blocks as part of a group. + + namespace cuspatial { + /** + * @addtogroup coordinate_transform + * @{ + */ + + /** + * @brief ... + * + * @param ... + * @return ... + */ + template + OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); + + /** @} */ + } // namespace cuspatial + +This just saves adding \@ingroup to individual doxygen comment blocks within a file. Make sure a +blank line is included after the \@addtogroup command block so doxygen knows it does not apply to +whatever follows in the source code. Note that doxygen will not assign groups to items if the +\@addtogroup with `@{ ... @}` pair includes a namespace declaration. So include the \@addtogroup +and `@{ ... @}` between the namespace declaration braces as shown in the example above. + +Summary of groups tags +| Tag/Command | Where to use | +| ----------- | ------------ | +| \@defgroup | For use only in [doxygen_groups.h](../include/doxygen_groups.h) and should include the group's title. | +| \@ingroup | Use inside individual doxygen block comments for declaration statements in a header file. | +| \@addtogroup | Use instead of \@ingroup for multiple declarations in the same file within a namespace declaration. Do not specify a group title. | +| `@{ ... @}` | Use only with \@addtogroup. | + +## Build Doxygen Output + +We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux package manager +(`sudo apt install doxygen`). Alternatively you can +[build and install doxygen from source](https://www.doxygen.nl/manual/install.html). + +To build the libcuspatial HTML documentation simply run the `doxygen` command from the `cpp/doxygen` +directory containing the `Doxyfile`. The libcuspatial documentation can also be built using +`make docs_cuspatial` from the cmake build directory (e.g. `cpp/build`). Doxygen reads and processes +all appropriate source files under the `cpp/include/` directory. The output is generated in the +`cpp/doxygen/html/` directory. You can load the local `index.html` file generated there into any web +browser to view the result. + +To view docs built on a remote server, you can run a simple HTTP server using Python: +`cd html && python -m http.server`. Then open `:8000` in your local web browser, +inserting the IP address of the machine on which you ran the HTTP server. + +The doxygen output is intended for building documentation only for the public APIs and classes. For +example, the output should not include documentation for `detail` or `/src` files, and these +directories are excluded in the `Doxyfile` configuration. When published by the RAPIDS build/CI +system, the doxygen output appears on the +[RAPIDS web site](https://docs.rapids.ai/api/libcuspatial/stable/index.html). From d354e0107c4ff0a1464f60e0b2b8f40c85f8aa68 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 20:21:18 +1000 Subject: [PATCH 11/20] Remove links --- cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md index 93bdf922e..b3e1b6b06 100644 --- a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md +++ b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md @@ -200,7 +200,7 @@ respectively. All memory resource parameters should be defaulted to use the retu This section provides specifics about the structure and implementation of cuSpatial API functions. -## Column-based cuSpatial API +## Column-based cuSpatial API libcuspatial's column-based API is designed to integrate seamlessly with other RAPIDS libraries, notably cuDF. To that end, this API uses `cudf::column` and `cudf::table` data structures as input @@ -321,7 +321,7 @@ auto foo = [&out0 = out0] { }; ``` -## Header-only cuSpatial API +## Header-only cuSpatial API For C++ users and developers who do not also use libcudf or other RAPIDS APIS, depending on libcudf could be a barrier to adoption of libcuspatial. libcudf is a very large library and building it From 97bcfa2fb020622e6abb2b2105750672b0d0c942 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 14 Sep 2022 20:29:00 +1000 Subject: [PATCH 12/20] Add REFACTORING_GUIDE --- .../developer_guide/DEVELOPER_GUIDE.md | 2 + .../developer_guide/REFACTORING_GUIDE.md | 100 +++++++++--------- 2 files changed, 51 insertions(+), 51 deletions(-) diff --git a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md index b3e1b6b06..8986a5908 100644 --- a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md +++ b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md @@ -6,6 +6,8 @@ refer to these additional files for further documentation of libcuspatial best p * [Documentation Guide](DOCUMENTATION.md) for guidelines on documenting libcuspatial code. * [Testing Guide](TESTING.md) for guidelines on writing unit tests. * [Benchmarking Guide](BENCHMARKING.md) for guidelines on writing unit benchmarks. +* [Refactoring Guide](REFACTORING_GUIDE.md) for guidelines on refactoring legacy column-based APIs into + header-only APIs. # Overview diff --git a/cpp/doxygen/developer_guide/REFACTORING_GUIDE.md b/cpp/doxygen/developer_guide/REFACTORING_GUIDE.md index e73573da5..3a7e92b4d 100644 --- a/cpp/doxygen/developer_guide/REFACTORING_GUIDE.md +++ b/cpp/doxygen/developer_guide/REFACTORING_GUIDE.md @@ -30,7 +30,7 @@ only libcuspatial API, we can avoid problem 1 and problem 2 for users of the leg Following is an example iterator-based API for `cuspatial::haversine_distance`. (See below for discussion of API documentation.) -```C++ +```c++ template `. - * @tparam T The underlying coordinate type. Must be a floating-point type. - * - * @pre `a_lonlat_first` may equal `distance_first`, but the range `[a_lonlat_first, a_lonlat_last)` - * shall not overlap the range `[distance_first, distance_first + (a_lonlat_last - a_lonlat_last)) - * otherwise. - * @pre `b_lonlat_first` may equal `distance_first`, but the range `[b_lonlat_first, b_lonlat_last)` - * shall not overlap the range `[distance_first, distance_first + (b_lonlat_last - b_lonlat_last)) - * otherwise. - * @pre All iterators must have the same `Location` type, with the same underlying floating-point - * coordinate type (e.g. `cuspatial::vec_2d`). - * - * @return Output iterator to the element past the last distance computed. - * - * [LinkLRAI]: https://en.cppreference.com/w/cpp/named_req/RandomAccessIterator - * "LegacyRandomAccessIterator" - */ -``` + /** + * @brief Compute haversine distances between points in set A to the corresponding points in set B. + * + * Computes N haversine distances, where N is `std::distance(a_lonlat_first, a_lonlat_last)`. + * The distance for each `a_lonlat[i]` and `b_lonlat[i]` point pair is assigned to + * `distance_first[i]`. `distance_first` must be an iterator to output storage allocated for N + * distances. + * + * Computed distances will have the same units as `radius`. + * + * https://en.wikipedia.org/wiki/Haversine_formula + * + * @param[in] a_lonlat_first: beginning of range of (longitude, latitude) locations in set A + * @param[in] a_lonlat_last: end of range of (longitude, latitude) locations in set A + * @param[in] b_lonlat_first: beginning of range of (longitude, latitude) locations in set B + * @param[out] distance_first: beginning of output range of haversine distances + * @param[in] radius: radius of the sphere on which the points reside. default: 6371.0 + * (approximate radius of Earth in km) + * @param[in] stream: The CUDA stream on which to perform computations and allocate memory. + * + * @tparam LonLatItA Iterator to input location set A. Must meet the requirements of + * [LegacyRandomAccessIterator][LinkLRAI] and be device-accessible. + * @tparam LonLatItB Iterator to input location set B. Must meet the requirements of + * [LegacyRandomAccessIterator][LinkLRAI] and be device-accessible. + * @tparam OutputIt Output iterator. Must meet the requirements of + * [LegacyRandomAccessIterator][LinkLRAI] and be device-accessible. + * @tparam Location The `value_type` of `LonLatItA` and `LonLatItB`. Must be `cuspatial::vec_2d`. + * @tparam T The underlying coordinate type. Must be a floating-point type. + * + * @pre `a_lonlat_first` may equal `distance_first`, but the range `[a_lonlat_first, a_lonlat_last)` + * shall not overlap the range `[distance_first, distance_first + (a_lonlat_last - a_lonlat_last)) + * otherwise. + * @pre `b_lonlat_first` may equal `distance_first`, but the range `[b_lonlat_first, b_lonlat_last)` + * shall not overlap the range `[distance_first, distance_first + (b_lonlat_last - b_lonlat_last)) + * otherwise. + * @pre All iterators must have the same `Location` type, with the same underlying floating-point + * coordinate type (e.g. `cuspatial::vec_2d`). + * + * @return Output iterator to the element past the last distance computed. + * + * [LinkLRAI]: https://en.cppreference.com/w/cpp/named_req/RandomAccessIterator + * "LegacyRandomAccessIterator" + */ Key points: @@ -130,7 +128,7 @@ Key points: This is the existing API, unchanged by refactoring. Here is the existing `cuspatial::haversine_distance`: -```C++ +```c++ template ``` @@ -187,7 +185,7 @@ libcudf-based API, which requires run-time type dispatching. In the case of `hav a simple matter of a few static asserts and dynamic expectation checks, followed by a call to `thrust::transform` with a custom transform functor. -```C++ +```c++ template OutputIt haversine_distance(LonLatItA a_lonlat_first, LonLatItA a_lonlat_last, @@ -232,7 +230,7 @@ provided in `type_utils.hpp`. So, to refactor the libcudf-based API, we remove the following code. -```C++ +```c++ auto input_tuple = thrust::make_tuple(thrust::make_constant_iterator(static_cast(radius)), a_lon.begin(), a_lat.begin(), @@ -256,7 +254,7 @@ thrust::transform(rmm::exec_policy(stream), And replace it with the following code. -```C++ +```c++ auto lonlat_a = cuspatial::make_vec_2d_iterator(a_lon.begin(), a_lat.begin()); auto lonlat_b = cuspatial::make_vec_2d_iterator(b_lon.begin(), b_lat.begin()); From a22769f53df6ec1197bfaceadb1e9cda393f1578 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Tue, 20 Sep 2022 14:23:55 +1000 Subject: [PATCH 13/20] Add sed command to update tagfile versions in Doxyfile. --- ci/release/update-version.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/ci/release/update-version.sh b/ci/release/update-version.sh index c7ea93a95..69887b71c 100755 --- a/ci/release/update-version.sh +++ b/ci/release/update-version.sh @@ -45,3 +45,7 @@ sed_runner 's/'"branch-.*\/RAPIDS.cmake"'/'"branch-${NEXT_SHORT_TAG}\/RAPIDS.cma for FILE in conda/environments/*.yml; do sed_runner "s/cudf=${CURRENT_SHORT_TAG}/cudf=${NEXT_SHORT_TAG}/g" ${FILE}; done + +# Doxyfile update +sed_runner "s|\(TAGFILES.*librmm/\).*|\1${NEXT_SHORT_TAG}|" cpp/doxygen/Doxyfile +sed_runner "s|\(TAGFILES.*libcudf/\).*|\1${NEXT_SHORT_TAG}|" cpp/doxygen/Doxyfile From 66ee41fbbc36396567f0aad5229714aab5b10d15 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 21 Sep 2022 11:36:14 +1000 Subject: [PATCH 14/20] Apply suggestions from code review Co-authored-by: H. Thomson Comer --- cpp/doxygen/developer_guide/BENCHMARKING.md | 2 +- cpp/doxygen/developer_guide/DOCUMENTATION.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/cpp/doxygen/developer_guide/BENCHMARKING.md b/cpp/doxygen/developer_guide/BENCHMARKING.md index 8bc159c43..06df8eac8 100644 --- a/cpp/doxygen/developer_guide/BENCHMARKING.md +++ b/cpp/doxygen/developer_guide/BENCHMARKING.md @@ -17,7 +17,7 @@ benchmarks in `cpp/benchmarks` to understand the options. The naming of unit benchmark directories and source files should be consistent with the feature being benchmarked. For example, the benchmarks for APIs in `point_in_polygon.hpp` should live in `cpp/benchmarks/point_in_polygon.cu`. Each feature (or set of related features) should have its own -benchmark source file named `.cu/cpp`. +benchmark source file named `{.cu,cpp}`. In the interest of improving compile time, whenever possible, test source files should be `.cpp` files because `nvcc` is slower than `gcc` in compiling host code. Note that `thrust::device_vector` diff --git a/cpp/doxygen/developer_guide/DOCUMENTATION.md b/cpp/doxygen/developer_guide/DOCUMENTATION.md index 65a5fd044..705131bac 100644 --- a/cpp/doxygen/developer_guide/DOCUMENTATION.md +++ b/cpp/doxygen/developer_guide/DOCUMENTATION.md @@ -221,7 +221,7 @@ The longer description is the rest of the comment text that is not tagged with a * @brief Short description. * * Long description. - * + */ ### \@copydoc From 93c1eea0fa43837eb4b5430a5fec2124984fd50b Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Tue, 27 Sep 2022 15:00:41 +1000 Subject: [PATCH 15/20] Fix update_version.sh --- ci/release/update-version.sh | 4 ++-- cpp/doxygen/Doxyfile | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/ci/release/update-version.sh b/ci/release/update-version.sh index 69887b71c..856ef3ae4 100755 --- a/ci/release/update-version.sh +++ b/ci/release/update-version.sh @@ -47,5 +47,5 @@ for FILE in conda/environments/*.yml; do done # Doxyfile update -sed_runner "s|\(TAGFILES.*librmm/\).*|\1${NEXT_SHORT_TAG}|" cpp/doxygen/Doxyfile -sed_runner "s|\(TAGFILES.*libcudf/\).*|\1${NEXT_SHORT_TAG}|" cpp/doxygen/Doxyfile +sed_runner "/PROJECT_NUMBER/ s|[0-9]\+.[0-9]\+|${NEXT_SHORT_TAG}|g" cpp/doxygen/Doxyfile +sed_runner "/TAGFILES/ s|[0-9]\+.[0-9]\+|${NEXT_SHORT_TAG}|g" cpp/doxygen/Doxyfile diff --git a/cpp/doxygen/Doxyfile b/cpp/doxygen/Doxyfile index 65338f4a8..1ac30f507 100644 --- a/cpp/doxygen/Doxyfile +++ b/cpp/doxygen/Doxyfile @@ -38,7 +38,7 @@ PROJECT_NAME = "libcuspatial" # could be handy for archiving the generated documentation or if some version # control system is used. -PROJECT_NUMBER = 22.06.00 +PROJECT_NUMBER = 22.10.00 # Using the PROJECT_BRIEF tag one can provide an optional one line description # for a project that appears at the top of each page and should give viewer a @@ -2171,7 +2171,7 @@ SKIP_FUNCTION_MACROS = YES # the path). If a tag file is not located in the directory in which doxygen is # run, you must also specify the path to the tagfile here. -TAGFILES = rmm.tag=https://docs.rapids.ai/api/librmm/22.06 "libcudf.tag=https://docs.rapids.ai/api/libcudf/22.06" +TAGFILES = rmm.tag=https://docs.rapids.ai/api/librmm/22.10 "libcudf.tag=https://docs.rapids.ai/api/libcudf/22.10" # When a file name is specified after GENERATE_TAGFILE, doxygen will create a # tag file that is based on the input files it reads. See section "Linking to From 9a4817051149421d077fe27ac67f3bc694c5e615 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Tue, 27 Sep 2022 15:02:47 +1000 Subject: [PATCH 16/20] Link to groups. --- cpp/doxygen/developer_guide/DOCUMENTATION.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/cpp/doxygen/developer_guide/DOCUMENTATION.md b/cpp/doxygen/developer_guide/DOCUMENTATION.md index 65a5fd044..5dfe275c9 100644 --- a/cpp/doxygen/developer_guide/DOCUMENTATION.md +++ b/cpp/doxygen/developer_guide/DOCUMENTATION.md @@ -506,6 +506,8 @@ Summary of groups tags | \@addtogroup | Use instead of \@ingroup for multiple declarations in the same file within a namespace declaration. Do not specify a group title. | | `@{ ... @}` | Use only with \@addtogroup. | +See [doxygen_groups.h](../include/doxygen_groups.h) for a list of existing groups. + ## Build Doxygen Output We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux package manager From 85b611ea7dca0ac173f5d2c2bf49bfad720d83aa Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Wed, 28 Sep 2022 07:58:56 +1000 Subject: [PATCH 17/20] Fix PROJECT_NUMBER update in update_version.sh --- ci/release/update-version.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ci/release/update-version.sh b/ci/release/update-version.sh index 856ef3ae4..d19ccaaae 100755 --- a/ci/release/update-version.sh +++ b/ci/release/update-version.sh @@ -47,5 +47,5 @@ for FILE in conda/environments/*.yml; do done # Doxyfile update -sed_runner "/PROJECT_NUMBER/ s|[0-9]\+.[0-9]\+|${NEXT_SHORT_TAG}|g" cpp/doxygen/Doxyfile +sed_runner "/PROJECT_NUMBER[ ]*=/ s|=.*|= ${NEXT_FULL_TAG}|g" cpp/doxygen/Doxyfile sed_runner "/TAGFILES/ s|[0-9]\+.[0-9]\+|${NEXT_SHORT_TAG}|g" cpp/doxygen/Doxyfile From f6434b83e1869a2ede69480ac33415f58ec0ebad Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Thu, 29 Sep 2022 15:17:22 +1000 Subject: [PATCH 18/20] Copy changes made to old files, and delete old files. --- cpp/doc/BENCHMARKING.md | 51 -- cpp/doc/DOCUMENTATION.md | 575 ------------------- cpp/doc/TESTING.md | 112 ---- cpp/doxygen/developer_guide/BENCHMARKING.md | 17 +- cpp/doxygen/developer_guide/DOCUMENTATION.md | 14 +- cpp/doxygen/developer_guide/TESTING.md | 17 +- 6 files changed, 28 insertions(+), 758 deletions(-) delete mode 100644 cpp/doc/BENCHMARKING.md delete mode 100644 cpp/doc/DOCUMENTATION.md delete mode 100644 cpp/doc/TESTING.md diff --git a/cpp/doc/BENCHMARKING.md b/cpp/doc/BENCHMARKING.md deleted file mode 100644 index fdae9e98a..000000000 --- a/cpp/doc/BENCHMARKING.md +++ /dev/null @@ -1,51 +0,0 @@ -# Unit Benchmarking in libcuspatial - -Unit benchmarks in libcuspatial are written using [NVBench](https://github.com/NVIDIA/nvbench). -While some existing benchmarks are written using -[Google Benchmark](https://github.com/google/benchmark), new benchmarks should use NVBench. - -The NVBench library is similar to Google Benchmark, but has several quality of life improvements -when doing GPU benchmarking such as displaying the fraction of peak memory bandwidth achieved and -details about the GPU hardware. - -Both NVBench and Google Benchmark provide many options for specifying ranges of parameters to -benchmark, as well as to control the time unit reported, among other options. Refer to existing -benchmarks in `cpp/benchmarks` to understand the options. - -## Directory and File Naming - -The naming of unit benchmark directories and source files should be consistent with the feature -being benchmarked. For example, the benchmarks for APIs in `point_in_polygon.hpp` should live in -`cpp/benchmarks/point_in_polygon.cu`. Each feature (or set of related features) should have its own -benchmark source file named `{.cu,cpp}`. - -## CUDA Asynchrony and benchmark accuracy - -CUDA computations and operations like copies are typically asynchronous with respect to host code, -so it is important to carefully synchronize in order to ensure the benchmark timing is not stopped -before the feature you are benchmarking has completed. An RAII helper class `cuda_event_timer` is -provided in `cpp/benchmarks/synchronization/synchronization.hpp` to help with this. This class -can also optionally clear the GPU L2 cache in order to ensure cache hits do not artificially inflate -performance in repeated iterations. - -## Data generation - -For generating benchmark input data, random data generation functions are provided in -`cpp/benchmarks/utility/random.cuh`. The input data generation happens on device. - -## What should we benchmark? - -In general, we should benchmark all features over a range of data sizes and types, so that we can -catch regressions across libcudf changes. However, running many benchmarks is expensive, so ideally -we should sample the parameter space in such a way to get good coverage without having to test -exhaustively. - -A rule of thumb is that we should benchmark with enough data to reach the point where the algorithm -reaches its saturation bottleneck, whether that bottleneck is bandwidth or computation. Using data -sets larger than this point is generally not helpful, except in specific cases where doing so -exercises different code and can therefore uncover regressions that smaller benchmarks will not -(this should be rare). - -Generally we should benchmark public APIs. Benchmarking detail functions and/or internal utilities should -only be done if detecting regressions in them would be sufficiently difficult to do from public API -benchmarks. diff --git a/cpp/doc/DOCUMENTATION.md b/cpp/doc/DOCUMENTATION.md deleted file mode 100644 index a767591b8..000000000 --- a/cpp/doc/DOCUMENTATION.md +++ /dev/null @@ -1,575 +0,0 @@ -# libcuspatial C++ Documentation Guide - -These guidelines apply to documenting all libcuspatial C++ source files using doxygen style -formatting although only public APIs and classes are actually -[published](https://docs.rapids.ai/api/libcuspatial/stable/index.html). - -## Copyright License - -The following is the license header comment that should appear at the beginning of every C++ -source file. - -```c++ -/* - * Copyright (c) 2022, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -``` - -The comment should start with `/*` and not `/**` so it is not processed by doxygen. - -Also, here are the rules for the copyright year. - -- A new file should have the year in which it was created -- A modified file should span the year it was created and the year it was modified (e.g. - `2019-2021`) - -Changing the copyright year may not be necessary if no content has changed (e.g. reformatting only). - -## Doxygen - -The [doxygen tool](http://www.doxygen.nl/manual/index.html) is used to generate HTML pages from the -C++ comments in the source code. Doxygen recognizes and parses block comments and performs -specialized output formatting when it encounters -[doxygen commands](http://www.doxygen.nl/manual/commands.html). - -There are almost 200 commands (also called tags in this document) that doxygen recognizes in comment -blocks. This document provides guidance on which commands/tags to use and how to use them in the -libcuspatial C++ source code. - -Doxygen can be customized using options in the [Doxyfile](../doxygen/Doxyfile). - -Here are some of the custom options in the Doxyfile for libcuspatial. -| Option | Setting | Description | -| ------ | ------- | ----------- | -| PROJECT_NAME | libcuspatial | Title used on the main page | -| PROJECT_NUMBER | 22.10.00 | Version number | -| EXTENSION_MAPPING | cu=C++ cuh=C++ | Process `cu` and `cuh` as C++ | -| INPUT | main_page.md ../include | Embedded markdown files and source code directories to process | -| FILE_PATTERNS | *.cpp *.hpp *.h *.c *.cu *.cuh | File extensions to process | -| EXCLUDE_PATTERNS | */detail/* */nvtx/* | Wildcard pattern to exclude paths / filenames | -| TAGFILES | rmm.tag=https://docs.rapids.ai/api/librmm/22.10 "libcudf.tag=https://docs.rapids.ai/api/libcudf/22.10" | Links to external documentation tagfiles. Versions are updated automatically at each release | -| PREDEFINED | __device__= \ __host__= | Predefined macros. Helps with CUDA declaration specifiers. | - - -## Block Comments - -Use the following style for block comments describing functions, classes and other types, groups, -and files. - -```c++ -/** - * description text and - * doxygen tags go here - */ -``` - -Doxygen comment blocks start with `/**` and end with `*/` only, and with nothing else on those -lines. Do not add dashes `-----` or extra asterisks `*****` to the first and last lines of a doxygen -block. The block must be placed immediately before the source code line to which it refers. The -block may be indented to line up vertically with the item it documents as appropriate. See the -[Example](#the-example) section below. - -Each line in the comment block between the `/**` and `*/` lines should start with a space followed -by an asterisk. Any text on these lines, including tag declarations, should start after a single -space after the asterisk. - -## Tag/Command names - -Use `@` to prefix doxygen commands (e.g. `@brief`, `@code`, etc.) - -## Markdown - -The doxygen tool supports a limited set of markdown format in the comment block including links, -tables, lists, etc. In some cases a trade-off may be required for readability in the source text -file versus the readability in the doxygen formatted web pages. For example, there are some -limitations on readability with '%' character and pipe character '|' within a markdown table. - -Avoid using direct HTML tags. Although doxygen supports markdown and markdown supports HTML tags, -the HTML support for doxygen's markdown is also limited. - -## The Example - -The following example covers most of the doxygen block comment and tag styles for documenting C++ -code in libcuspatial. - -```c++ -/** - * @file source_file.cpp - * @brief Description of source file contents - * - * Longer description of the source file contents. - */ - -/** - * @brief One sentence description of the class. - * - * @ingroup optional_predefined_group_id - * - * Longer, more detailed description of the class. - * - * @tparam T Short description of each template parameter - * @tparam U Short description of each template parameter - */ -template -class example_class { - - void get_my_int(); ///< Simple members can be documented like this - void set_my_int( int value ); ///< Try to use descriptive member names - - /** - * @brief Short, one sentence description of the member function. - * - * A more detailed description of what this function does and what - * its logic does. - * - * @code - * example_class inst; - * inst.set_my_int(5); - * int output = inst.complicated_function(1,dptr,fptr); - * @endcode - * - * @param[in] first This parameter is an input parameter to the function - * @param[in,out] second This parameter is used both as an input and output - * @param[out] third This parameter is an output of the function - * - * @return The result of the complex function - */ - T complicated_function(int first, double* second, float* third) - { - // Do not use doxygen-style block comments - // for code logic documentation. - } - - private: - int my_int; ///< An example private member variable - /// Another example of private member variable - float my_float; -}; - -/** - * @brief Short, one sentence description of this free function. - * - * @ingroup optional_predefined_group_id - * - * A detailed description must start after a blank line. - * - * @code - * template - * struct myfunctor { - * bool operator()(T input) { return input % 2 > 0; } - * }; - * free_function(myfunctor{},12); - * @endcode - * - * @throw cuspatial::logic_error if `input_argument` is negative or zero - * - * @tparam functor_type The type of the functor - * @tparam input_type The datatype of the input argument - * - * @param[in] functor The functor to be called on the input argument - * @param[in] input_argument The input argument passed into the functor - * @return The result of calling the functor on the input argument - */ -template -bool free_function(functor_type functor, input_type input_argument) -{ - CUSPATIAL_EXPECTS( input_argument > 0, "input_argument must be positive"); - return functor(input_argument); -} - -/** - * @brief Short, one sentence description. - * - * @ingroup optional_predefined_group_id - * - * Optional, longer description. - */ -enum class example_enum { - first_enum, ///< Description of the first enum - second_enum, ///< Description of the second enum - third_enum ///< Description of the third enum -}; - -/** - * @internal - * @brief The "@internal" tag can be used to indicate rendered HTML docs for a function should not - * be generated. - */ -void internal_function() -{ - ... -} -``` - -## Descriptions - -The comment description should clearly detail how the output(s) are created from any inputs. Include -any performance and any boundary considerations. Also include any limits on parameter values and if -any default values are declared. Also, try to include a short [example](#inline-examples) if -possible. - -### @brief - -The `@brief` text should be a short, one sentence description. Doxygen does not provide much space -to show this text in the output pages. Always follow the `@brief` line with a blank comment line. - -The longer description is the rest of the comment text that is not tagged with any doxygen command. - -```c++ -/** - * @brief Short description. - * - * Long description. - * -``` - -### @copydoc - -Documentation for declarations in headers should be clear and complete. -You can use the `@copydoc` tag to avoid duplicating the comment block for a function definition. - -```c++ - /** - * @copydoc complicated_function(int,double*,float*) - * - * Any extra documentation. - */ -``` - -Also, `@copydoc` is useful when documenting a `detail` function that differs only by the `stream` parameter. - -```c++ -/** - * @copydoc cudf::segmented_count_set_bits(bitmask_type const*,std::vector const&) - * - * @param[in] stream Optional CUDA stream on which to execute kernels - */ -std::vector segmented_count_set_bits(bitmask_type const* bitmask, - std::vector const& indices, - rmm::cuda_stream_view stream = cudf::default_stream_value); -``` - -Note, you must specify the whole signature of the function, including optional parameters, so that -doxygen will be able to locate it. - -### Function parameters - -The following tags should appear near the end of function comment block in the order specified here: - -| Command | Description | -| ------- | ----------- | -| [@throw](#throw) | Specify the conditions in which the function may throw an exception | -| [@tparam](#tparam) | Description for each template parameter | -| [@param](#param) | Description for each function parameter | -| [@return](#return) | Short description of object or value returned | -| [@pre](#pre) | Specify any preconditions / requirements for the function | - -#### @throw - -Add an [@throw](http://www.doxygen.nl/manual/commands.html#cmdthrow) comment line in the doxygen -block for each exception that the function may throw. You only need to include exceptions thrown by -the function itself. If the function calls another function that may throw an exception, you do not -need to document those exceptions here. - -Include the name of the exception without backtick marks so doxygen can add reference links correctly. - -```c++ - /** - * ... - * @throw cuspatial::logic_error if `input_argument` is negative or zero - */ -``` - -Using `@throws` is also acceptable but vs-code and other tools only do syntax highlighting on -`@throw`. - -#### @tparam - -Add a [@tparam](http://www.doxygen.nl/manual/commands.html#cmdtparam) comment line for each template -parameter declared by this function. The name of the parameter specified after the doxygen tag must -match exactly to the template parameter name. - -```c++ - /** - * ... - * @tparam functor_type The type of the functor - * @tparam input_type The datatype of the input argument - */ -``` - -The definition should detail the requirements of the parameter. -For example, if the template is for a functor or predicate, then describe the expected input types -and output. - -#### @param - -Add a [@param](http://www.doxygen.nl/manual/commands.html#cmdparam) comment line for each function -parameter passed to this function. The name of the parameter specified after the doxygen tag must -match the function's parameter name. Optionally, you may append `[in]`, `[out]` or `[in,out]` to the -`@param` if it is not clear from the declaration and the parameter name whether the paremeter is an -input parameter or an output parameter. This is especially helpful for the header-only API where -it may be hard to distinguish input and output iterators. - -```c++ - /** - * ... - * @param[in] first This parameter is an input parameter to the function - * @param[in,out] second This parameter is used both as an input and output - * @param[out] third This parameter is an output of the function - */ -``` - -It is also recommended to vertically align the 3 columns of text if possible to make it easier to -read in a source code editor. - -#### @return - -Add a single [@return](http://www.doxygen.nl/manual/commands.html#cmdreturn) comment line at the end -of the comment block if the function returns an object or value. Include a brief description of what -is returned. - -```c++ -/** - * ... - * - * @return A new column of type INT32 and no nulls - */ -``` - -Do not include the type of the object returned with the `@return` comment. - -#### @pre - -Add a [@pre](https://www.doxygen.nl/manual/commands.html#cmdpre) comment line for each precondition -of the function. For example, assumptions about device accessibility or range of iterators. One -common example is a precondition that input and output iterator ranges do not partially overlap. -Often these are requirements that it is impossible or expensive to enforce at run time. Violations -of preconditions often result in undefined behavior rather than exceptions. - -```c++ -/** - * ... - * - * @pre All iterators must have the same `Location` type, with the same underlying floating-point - * coordinate type (e.g. `cuspatial::vec_2d`). - */ -``` - -### Inline Examples - -It is usually helpful to include a source code example inside your comment block when documenting a -function or other declaration. Use the [@code](http://www.doxygen.nl/manual/commands.html#cmdcode) -and [@endcode](http://www.doxygen.nl/manual/commands.html#cmdendcode) pair to include inline -examples. - -Doxygen supports syntax highlighting for C++ and several other programming languages (e.g. Python, -Java). By default, the `@code` tag uses syntax highlighting based on the source code in which it is -found. - -```c++ - /** - * ... - * - * @code - * auto result = rmm::device_uvector(...); - * @endcode - */ -``` - -You can specify a different language by indicating the file extension in the tag: - -```c++ - /** - * ... - * - * @code{.py} - * import cudf - * s = cudf.Series([1,2,3]) - * @endcode - */ -``` - -If you wish to use pseudocode in your example, use the following: - -```c++ - /** - * ... - * - * Sometimes pseudo-code is clearer. - * @code{.pseudo} - * s = int column of [ 1, 2, null, 4 ] - * r = fill( s, [1, 2], 0 ) - * r is now [ 1, 0, 0, 4 ] - * @endcode - */ -``` - -When writing example snippets, using fully qualified class names allows doxygen to add reference -links to the example. - -```c++ - /** - * ... - * - * @code - * auto result1 = make_column( ); // reference link will not be created - * auto result2 = cudf::make_column( ); // reference link will be created - * @endcode - */ -``` - -Although using three backtick marks `` ``` `` for example blocks will work too, they do not stand -out as well in vs-code and other source editors. - -Do not use the `@example` tag in the comments for a declaration, or doxygen will interpret the -entire source file as example source code. The source file is then published under a separate -_Examples_ page in the output. - -### Deprecations - -Add a single [@deprecated](https://www.doxygen.nl/manual/commands.html#cmddeprecated) comment line -to comment blocks for APIs that will be removed in future releases. Mention alternative / -replacement APIs in the deprecation comment. - -```c++ -/** - * ... - * - * @deprecated This function is deprecated. Use another new function instead. - */ -``` - -## Namespaces - -Doxygen output includes a _Namespaces_ page that shows all the namespaces declared with comment -blocks in the processed files. Here is an example of a doxygen description comment for a namespace -declaration. - -```c++ -/** - * @brief cuSpatial interfaces - * - * This is the top-level namespace which contains all cuSpatial functions and types. - */ -namespace cuspatial { -``` - -A description comment should be included only once for each unique namespace declaration. Otherwise, -if more than one description is found, doxygen aggregates the descriptions in an arbitrary order in -the output pages. - -If you introduce a new namespace, provide a description block for only one declaration and not for -every occurrence. - -## Groups/Modules - -Grouping declarations into modules helps users to find APIs in the doxygen pages. Generally, common -functions are already grouped logically into header files but doxygen does not automatically group -them this way in its output. - -The doxygen output includes a _Modules_ page that organizes items into groups specified using the -[Grouping doxygen commands](http://www.doxygen.nl/manual/grouping.html). These commands can group -common functions across header files, source files, and even namespaces. Groups can also be nested -by defining new groups within existing groups. - -For libcuspatial, all the group hierarchy is defined in the -[doxygen_groups.h](../include/doxygen_groups.h) header file. The `doxygen_groups.h` file does not -need to be included in any other source file, because the definitions in this file are used only by -the doxygen tool to generate groups in the _Modules_ page. Modify this file only to add or update -groups. The existing groups have been carefully structured and named, so new groups should be added -thoughtfully. - -When creating a new API, specify its group using the -[@ingroup](http://www.doxygen.nl/manual/commands.html#cmdingroup) tag and the group reference id -from the [doxygen_groups.h](../include/doxygen_groups.h) file. - -```c++ -namespace cuspatial { - -/** - * @brief ... - * - * @ingroup distance - * - * @tparam... - * @param ... - * @return ... - */ -template -OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); - -} // namespace cuspatial -``` - -You can also use the `@addtogroup` with a `@{ ... @}` pair to automatically include doxygen comment -blocks as part of a group. - -```c++ -namespace cuspatial { -/** - * @addtogroup coordinate_transform - * @{ - */ - -/** - * @brief ... - * - * @param ... - * @return ... - */ -template -OutputIt pairwise_point_distance(Cart2dItA points1_first, ...); - -/** @} */ -} // namespace cuspatial -``` - -This just saves adding `@ingroup` to individual doxygen comment blocks within a file. Make sure a -blank line is included after the `@addtogroup` command block so doxygen knows it does not apply to -whatever follows in the source code. Note that doxygen will not assign groups to items if the -`@addtogroup` with `@{ ... @}` pair includes a namespace declaration. So include the `@addtogroup` -and `@{ ... @}` between the namespace declaration braces as shown in the example above. - -Summary of groups tags -| Tag/Command | Where to use | -| ----------- | ------------ | -| `@defgroup` | For use only in [doxygen_groups.h](../include/doxygen_groups.h) and should include the group's title. | -| `@ingroup` | Use inside individual doxygen block comments for declaration statements in a header file. | -| `@addtogroup` | Use instead of `@ingroup` for multiple declarations in the same file within a namespace declaration. Do not specify a group title. | -| `@{ ... @}` | Use only with `@addtogroup`. | - -## Build Doxygen Output - -We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux package manager -(`sudo apt install doxygen`). Alternatively you can -[build and install doxygen from source](http://www.doxygen.nl/manual/install.html). - -To build the libcuspatial HTML documentation simply run the `doxygen` command from the `cpp/doxygen` -directory containing the `Doxyfile`. The libcudf documentation can also be built using -`cmake --build . --target docs_cudf` from the cmake build directory (e.g. `cpp/build/release`). - -Doxygen reads and processes all appropriate source files under the `cpp/include/` directory. The -output is generated in the `cpp/doxygen/html/` directory. You can load the local `index.html` file -generated there into any web browser to view the result. - -To view docs built on a remote server, you can run a simple HTTP server using Python: -`cd html && python -m http.server`. Then open `http://:8000` in your local web browser, -inserting the IP address of the machine on which you ran the HTTP server. - -The doxygen output is intended for building documentation only for the public APIs and classes. For -example, the output should not include documentation for `detail` or `/src` files, and these -directories are excluded in the `Doxyfile` configuration. When published by the RAPIDS build/CI -system, the doxygen output appears on the -[RAPIDS web site](https://docs.rapids.ai/api/libcuspatial/stable/index.html). diff --git a/cpp/doc/TESTING.md b/cpp/doc/TESTING.md deleted file mode 100644 index 2cb7f7839..000000000 --- a/cpp/doc/TESTING.md +++ /dev/null @@ -1,112 +0,0 @@ -# Unit Testing in libcuspatial - -Unit tests in libcuspatial are written using -[Google Test](https://github.com/google/googletest/blob/master/docs/primer.md). - -## Best Practices: What Should We Test? - -In general we should test to make sure all code paths are covered. This is not always easy or -possible. But generally this means we test all supported combinations of algorithms and data types, -and the main iterator and container types supported by algorithms. Here are some other guidelines. - - * Test public APIs. Try to ensure that public API tests result in 100% coverage of libcuspatial - code (including internal details and utilities). - - * Test exceptional cases. For example, anything that causes the function to `throw`. - - * Test boundary cases. For example points that fall exactly on lines or boundaries. - - * In general empty input is not an error in libcuspatial. Typically empty input results in empty - output. Tests should verify this. - - * Most algorithms should have one or more tests exercising inputs with a large enough number of - rows to require launching multiple thread blocks, especially when values are ultimately - communicated between blocks (e.g. reductions). This is especially important for custom kernels - but also applies to Thrust and CUB algorithm calls with lambdas / functors. - -## Header-only and Column-based API tests - -libcuspatial currently has two C++ APIs: the column-based API uses libcudf data structures as -input and output. These tests live in `cpp/tests/` and can use libcudf features for constructing -columns and tables. The header-only API does not depend on libcudf at all and so tests of these -APIs should not include any libcudf headers. These tests currently live in `cpp/tests/experimental`. - -Generally we test algorithms and business logic in the header-only API's unit tests. Column-based -API tests should only cover specifics of the column-based API, such as type handling, -input validation, and exceptions that are only thrown by that API. - -## Directory and File Naming - -The naming of unit test directories and source files should be consistent with the feature being -tested. For example, the tests for APIs in `point_in_polygon.hpp` should live in -`cuspatial/cpp/tests/point_in_polygon_test.cpp`. Each feature (or set of related features) should -have its own test source file named `_test.cu/cpp`. - -In the interest of improving compile time, whenever possible, test source files should be `.cpp` -files because `nvcc` is slower than `gcc` in compiling host code. Note that `thrust::device_vector` -includes device code, and so must only be used in `.cu` files. `rmm::device_uvector`, -`rmm::device_buffer` and the various `column_wrapper` types described later can be used in `.cpp` -files, and are therefore preferred in test code over `thrust::device_vector`. - -Testing header-only APIs requires CUDA compilation so should be done in `.cu` files. - -## Base Fixture - -All libcuspatial unit tests should make use of a GTest -["Test Fixture"](https://github.com/google/googletest/blob/master/docs/primer.md#test-fixtures-using-the-same-data-configuration-for-multiple-tests-same-data-multiple-tests). -Even if the fixture is empty, it should inherit from the base fixture `cuspatial::test::BaseFixture` -found in `cpp/tests/base_fixture.hpp`. This ensures that RMM is properly initialized and -finalized. `cuspatial::test::BaseFixture` already inherits from `testing::Test` and therefore it is -not necessary for your test fixtures to inherit from it. - -Example: - - class MyTestFixture : public cuspatial::test::BaseFixture {...}; - -## Typed Tests - -In general, libcuspatial features must work across all supported types (for cuspatial this -typically just means `float` and `double`). In order to automate the process of running -the same tests across multiple types, we use GTest's -[Typed Tests](https://github.com/google/googletest/blob/master/docs/advanced.md#typed-tests). -Typed tests allow you to write a test once and run it across a list of types. - -For example: - -```c++ -// Fixture must be a template -template -class TypedTestFixture : cuspatial::test::BaseFixture {...}; -using TestTypes = ::test::types; // Notice custom cudf type list type -TYPED_TEST_SUITE(TypedTestFixture, TestTypes); -TYPED_TEST(TypedTestFixture, FirstTest){ - // Access the current type using `TypeParam` - using T = TypeParam; -} -``` - -In this example, all tests using the `TypedTestFixture` fixture will run once for each type in the -list defined in `TestTypes` (`float, double`). - -## Utilities - -libcuspatial test utilities include `cuspatial::test::expect_vector_equivalent()` in -`cpp/tests/utility/vector_equality()`. This function compares two containers using Google Test's -approximate matching for floating-point values. It can handle vectors of `cuspatial::vec_2d`, -where `T` is `float` or `double`. It automatically copies data in device containers to host -containers before comparing, so you can pass it one host and one device vector, for example. - -Example: - -```c++ - auto h_expected = std::vector>{...}; // expected values - - auto d_actual = rmm::device_vector>{...}; // actual computed values - - cuspatial::test::expect_vector_equivalent(h_expected, d_actual); -``` - -Before creating your own test utilities, look to see if one already exists that does -what you need. If not, consider adding a new utility to do what you need. However, make sure that -the utility is generic enough to be useful for other tests and is not overly tailored to your -specific testing need. diff --git a/cpp/doxygen/developer_guide/BENCHMARKING.md b/cpp/doxygen/developer_guide/BENCHMARKING.md index 06df8eac8..7df628973 100644 --- a/cpp/doxygen/developer_guide/BENCHMARKING.md +++ b/cpp/doxygen/developer_guide/BENCHMARKING.md @@ -19,22 +19,14 @@ being benchmarked. For example, the benchmarks for APIs in `point_in_polygon.hpp `cpp/benchmarks/point_in_polygon.cu`. Each feature (or set of related features) should have its own benchmark source file named `{.cu,cpp}`. -In the interest of improving compile time, whenever possible, test source files should be `.cpp` -files because `nvcc` is slower than `gcc` in compiling host code. Note that `thrust::device_vector` -includes device code, and so must only be used in `.cu` files. `rmm::device_uvector`, -`rmm::device_buffer` and the various `column_wrapper` types described in [Testing](TESTING.md) -can be used in `.cpp` files, and are therefore preferred in test code over `thrust::device_vector`. - -Testing header-only APIs requires CUDA compilation so should be done in `.cu` files. - ## CUDA Asynchrony and benchmark accuracy CUDA computations and operations like copies are typically asynchronous with respect to host code, so it is important to carefully synchronize in order to ensure the benchmark timing is not stopped before the feature you are benchmarking has completed. An RAII helper class `cuda_event_timer` is provided in `cpp/benchmarks/synchronization/synchronization.hpp` to help with this. This class -can also optionally clear the GPU L2 cache in order to ensure cache hits do not artificially inflate -performance in repeated iterations. +can also optionally clear the GPU L2 cache in order to ensure cache hits do not artificially +inflate performance in repeated iterations. ## Data generation @@ -53,3 +45,8 @@ reaches its saturation bottleneck, whether that bottleneck is bandwidth or compu sets larger than this point is generally not helpful, except in specific cases where doing so exercises different code and can therefore uncover regressions that smaller benchmarks will not (this should be rare). + + +Generally we should benchmark public APIs. Benchmarking detail functions and/or internal utilities +should only be done if detecting regressions in them would be sufficiently difficult to do from +public API benchmarks. diff --git a/cpp/doxygen/developer_guide/DOCUMENTATION.md b/cpp/doxygen/developer_guide/DOCUMENTATION.md index b7ffca170..d6dc973df 100644 --- a/cpp/doxygen/developer_guide/DOCUMENTATION.md +++ b/cpp/doxygen/developer_guide/DOCUMENTATION.md @@ -57,6 +57,10 @@ Here are some of the custom options in the Doxyfile for libcuspatial. | EXTENSION_MAPPING | cu=C++ cuh=C++ | Process `cu` and `cuh` as C++ | | INPUT | main_page.md ../include | Embedded markdown files and source code directories to process | | FILE_PATTERNS | *.cpp *.hpp *.h *.c *.cu *.cuh | File extensions to process | +| EXCLUDE_PATTERNS | */detail/* */nvtx/* | Wildcard pattern to exclude paths / filenames | +| TAGFILES | rmm.tag=https://docs.rapids.ai/api/librmm/22.10 "libcudf.tag=https://docs.rapids.ai/api/libcudf/22.10" | Links to external documentation tagfiles. Versions are updated automatically at each release | +| PREDEFINED | __device__= \ __host__= | Predefined macros. Helps with CUDA declaration specifiers. | + ## Block Comments @@ -516,13 +520,13 @@ We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux To build the libcuspatial HTML documentation simply run the `doxygen` command from the `cpp/doxygen` directory containing the `Doxyfile`. The libcuspatial documentation can also be built using -`make docs_cuspatial` from the cmake build directory (e.g. `cpp/build`). Doxygen reads and processes -all appropriate source files under the `cpp/include/` directory. The output is generated in the -`cpp/doxygen/html/` directory. You can load the local `index.html` file generated there into any web -browser to view the result. +`cmake --build . --target docs_cudf` from the cmake build directory (e.g. `cpp/build/release`). + +Doxygen reads and processes all appropriate source files under the `cpp/include/` directory. The output is generated in the `cpp/doxygen/html/` directory. You can load the local +`index.html` file generated there into any web browser to view the result. To view docs built on a remote server, you can run a simple HTTP server using Python: -`cd html && python -m http.server`. Then open `:8000` in your local web browser, +`cd html && python -m http.server`. Then open `http://:8000` in your local web browser, inserting the IP address of the machine on which you ran the HTTP server. The doxygen output is intended for building documentation only for the public APIs and classes. For diff --git a/cpp/doxygen/developer_guide/TESTING.md b/cpp/doxygen/developer_guide/TESTING.md index 4c88f63ac..0e75e9d83 100644 --- a/cpp/doxygen/developer_guide/TESTING.md +++ b/cpp/doxygen/developer_guide/TESTING.md @@ -9,6 +9,13 @@ In general we should test to make sure all code paths are covered. This is not a possible. But generally this means we test all supported combinations of algorithms and data types, and the main iterator and container types supported by algorithms. Here are some other guidelines. + * Test public APIs. Try to ensure that public API tests result in 100% coverage of libcuspatial + code (including internal details and utilities). + + * Test exceptional cases. For example, anything that causes the function to `throw`. + + * Test boundary cases. For example points that fall exactly on lines or boundaries. + * In general empty input is not an error in libcuspatial. Typically empty input results in empty output. Tests should verify this. @@ -17,10 +24,6 @@ and the main iterator and container types supported by algorithms. Here are som communicated between blocks (e.g. reductions). This is especially important for custom kernels but also applies to Thrust and CUB algorithm calls with lambdas / functors. - * Test exceptional cases. For example, anything that causes the function to `throw`. - - * Test boundary cases. For example points that fall exactly on lines or boundaries. - ## Header-only and Column-based API tests libcuspatial currently has two C++ APIs: the column-based API uses libcudf data structures as @@ -28,6 +31,10 @@ input and output. These tests live in `cpp/tests/` and can use libcudf features columns and tables. The header-only API does not depend on libcudf at all and so tests of these APIs should not include any libcudf headers. These tests currently live in `cpp/tests/experimental`. +Generally, we test algorithms and business logic in the header-only API's unit tests. +Column-based API tests should only cover specifics of the column-based API, such as type +handling, input validation, and exceptions that are only thrown by that API. + ## Directory and File Naming The naming of unit test directories and source files should be consistent with the feature being @@ -79,7 +86,7 @@ TYPED_TEST(TypedTestFixture, FirstTest){ ``` In this example, all tests using the `TypedTestFixture` fixture will run once for each type in the -list defined in `TestTypes` (`int, float, double`). +list defined in `TestTypes` (`float, double`). ## Utilities From cace8e9c1313ea9350e3ae97b86f448262b244f0 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Thu, 29 Sep 2022 15:24:12 +1000 Subject: [PATCH 19/20] cudf --> cuspatial --- cpp/doxygen/developer_guide/DOCUMENTATION.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/cpp/doxygen/developer_guide/DOCUMENTATION.md b/cpp/doxygen/developer_guide/DOCUMENTATION.md index d6dc973df..9271634d4 100644 --- a/cpp/doxygen/developer_guide/DOCUMENTATION.md +++ b/cpp/doxygen/developer_guide/DOCUMENTATION.md @@ -520,7 +520,8 @@ We recommend installing Doxygen using conda (`conda install doxygen`) or a Linux To build the libcuspatial HTML documentation simply run the `doxygen` command from the `cpp/doxygen` directory containing the `Doxyfile`. The libcuspatial documentation can also be built using -`cmake --build . --target docs_cudf` from the cmake build directory (e.g. `cpp/build/release`). +`cmake --build . --target docs_cuspatial` from the cmake build directory (e.g. +`cpp/build/release`). Doxygen reads and processes all appropriate source files under the `cpp/include/` directory. The output is generated in the `cpp/doxygen/html/` directory. You can load the local `index.html` file generated there into any web browser to view the result. From e8a891249608fc3bd03f2f1c9c6532f18a7d3283 Mon Sep 17 00:00:00 2001 From: Mark Harris Date: Thu, 29 Sep 2022 15:42:30 +1000 Subject: [PATCH 20/20] Improve main_page.md with description and links. --- cpp/doxygen/main_page.md | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/cpp/doxygen/main_page.md b/cpp/doxygen/main_page.md index c303e0199..605585668 100644 --- a/cpp/doxygen/main_page.md +++ b/cpp/doxygen/main_page.md @@ -1,4 +1,21 @@ -# libcuspatial - libcuspatial is a GPU-accelerated C++ library for spatial data analysis including distance and -trajectory computations, spatial data indexing and spatial join operations. +trajectory computations, spatial data indexing and spatial join operations. libcuspatial is +the high-performance backend for the cuSpatial Python library. + +libcuspatial has two interfaces. The generic header-only C++ API represents data as arrays +of structures (e.g. 2D points). The header-only API uses iterators for input and output, and is +similar in style to the C++ Standard Template Library (STL) and Thrust. All cuSpatial algorithms +are implemented in this API. + +The libcuspatial "column-based API" is a C++ API based on data types from libcudf, +[the CUDA Dataframe library C++ API](https://docs.rapids.ai/api/libcudf/nightly/index.html). The +column-based API represents spatial data as cuDF tables of type-erased columns, and layers on top +of the header-only API. + +## Useful Links + + - [cuSpatial Github Repository](https://github.com/rapidsai/cuspatial) + - [cuSpatial C++ Developer Guide](DEVELOPER_GUIDE.html) + - [cuSpatial Python API Documentation](https://docs.rapids.ai/api/cuspatial/stable/) + - [cuSpatial Python Developer Guide](https://docs.rapids.ai/api/cuspatial/stable/developer_guide/index.html)] + - [RAPIDS Home Page](https://rapids.ai)