[ENH] [3/5] Header structure: force explicit instantiation in tests and benchmarks #1439

Closed
Commits (27 total; diff shows changes from 21 commits)
d9801e8
MV: add -inl suffix to header paths
ahendriksen Apr 20, 2023
50b374d
MV: raft_runtime src files
ahendriksen Apr 13, 2023
8974ae3
FIX: add missing includes
ahendriksen Apr 20, 2023
95ef31b
FIX: getWorkspaceSize
ahendriksen Apr 20, 2023
7edcb6a
PREP: Separate rbf_fin_op
ahendriksen Apr 20, 2023
71de7bd
PREP: registers: Add _types header
ahendriksen Apr 20, 2023
541cabc
Change RAFT_COMPILED from INTERFACE to PUBLIC
ahendriksen Apr 20, 2023
d81b14e
Define RAFT_EXPLICIT and RAFT_EXPLICIT_INSTANTIATE_ONLY
ahendriksen Apr 20, 2023
48ea769
Update docs
ahendriksen Apr 20, 2023
e6bb5d5
Replace specializations by split headers
ahendriksen Apr 20, 2023
ff79abf
Deprecate specialization headers
ahendriksen Apr 20, 2023
c9e7413
Add interleaved scan instances
ahendriksen Apr 20, 2023
0c889dc
Separate fused_l2_nn_helpers
ahendriksen Apr 20, 2023
f97b2a8
Remove pairwise_matrix_instantiation_point
ahendriksen Apr 20, 2023
fb637f7
Rename specialization => instantiation
ahendriksen Apr 20, 2023
7b065af
test/neighbors/selection.cu: Expose kFaissMaxK
ahendriksen Apr 20, 2023
d5b5673
Remove includes of specialization headers
ahendriksen Apr 20, 2023
94d8117
test/distance/dist_adj.cu: Add instance
ahendriksen Apr 20, 2023
361570b
test/cluster/linkage.cu: Allow instance
ahendriksen Apr 20, 2023
5171de3
test/sparse/neighbors/connect_components.cu: Allow instance
ahendriksen Apr 20, 2023
4426c50
test/neighbors/ann_ivf_pq/test_float_uint32_t.cu: Allow instance
ahendriksen Apr 20, 2023
e527efc
test/matrix/select_k.cu: Change index type
ahendriksen Apr 20, 2023
17902e9
test/neighbors/fused_l2_knn.cu: Change index type
ahendriksen Apr 20, 2023
976189b
Force explicit instantiations in tests
ahendriksen Apr 20, 2023
bdae61d
Force explicit instantiations in benchmarks
ahendriksen Apr 20, 2023
b0b8fe5
Test that headers are free standing
ahendriksen Apr 20, 2023
4b9700e
Update cpp/test/matrix/select_k.cu
ahendriksen Apr 27, 2023
14 changes: 7 additions & 7 deletions README.md
@@ -198,7 +198,7 @@ RAFT itself can be installed through conda, [CMake Package Manager (CPM)](https:

The easiest way to install RAFT is through conda and several packages are provided.
- `libraft-headers` RAFT headers
- `libraft` (optional) shared library of pre-compiled template specializations and runtime APIs.
- `libraft` (optional) shared library of pre-compiled template instantiations and runtime APIs.
- `pylibraft` (optional) Python wrappers around RAFT algorithms and primitives.
- `raft-dask` (optional) enables deployment of multi-node multi-GPU algorithms that use RAFT `raft::comms` in Dask clusters.

@@ -231,11 +231,11 @@ You can find an [example RAFT](cpp/template/README.md) project template in the `

Additional CMake targets can be made available by adding components in the table below to the `RAFT_COMPONENTS` list above, separated by spaces. The `raft::raft` target will always be available. RAFT headers require, at a minimum, the CUDA toolkit libraries and RMM dependencies.

| Component | Target | Description | Base Dependencies |
|-------------|---------------------|-----------------------------------------------------------|---------------------------------------|
| n/a | `raft::raft` | Full RAFT header library | CUDA toolkit, RMM, NVTX, CCCL, CUTLASS |
| compiled | `raft::compiled` | Pre-compiled template specializations and runtime library | raft::raft |
| distributed | `raft::distributed` | Dependencies for `raft::comms` APIs | raft::raft, UCX, NCCL |
| Component | Target | Description | Base Dependencies |
|-------------|---------------------|----------------------------------------------------------|----------------------------------------|
| n/a | `raft::raft` | Full RAFT header library | CUDA toolkit, RMM, NVTX, CCCL, CUTLASS |
| compiled | `raft::compiled` | Pre-compiled template instantiations and runtime library | raft::raft |
| distributed | `raft::distributed` | Dependencies for `raft::comms` APIs | raft::raft, UCX, NCCL |

### Source

@@ -282,7 +282,7 @@ The folder structure mirrors other RAPIDS repos, with the following folders:
- `util`: Various reusable tools and utilities for accelerated algorithm development
- `internal`: A private header-only component that hosts the code shared between benchmarks and tests.
- `scripts`: Helpful scripts for development
- `src`: Compiled APIs and template specializations for the shared libraries
- `src`: Compiled APIs and template instantiations for the shared libraries
- `template`: A skeleton template containing the bare-bones file structure and cmake configuration for writing applications with RAFT.
- `test`: Googletests source code
- `docs`: Source code and scripts for building library documentation (Uses breath, doxygen, & pydocs)
297 changes: 124 additions & 173 deletions cpp/CMakeLists.txt

Large diffs are not rendered by default.

4 changes: 0 additions & 4 deletions cpp/bench/ann/src/raft/raft_benchmark.cu
@@ -22,10 +22,6 @@
#include <type_traits>
#include <utility>

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

#include "../common/ann_types.hpp"
#include "../common/benchmark_util.hpp"
#undef WARP_SIZE
6 changes: 1 addition & 5 deletions cpp/bench/ann/src/raft/raft_ivf_flat.cu
@@ -15,12 +15,8 @@
*/
#include "raft_ivf_flat_wrapper.h"

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

namespace raft::bench::ann {
template class RaftIvfFlatGpu<float, int64_t>;
template class RaftIvfFlatGpu<uint8_t, int64_t>;
template class RaftIvfFlatGpu<int8_t, int64_t>;
} // namespace raft::bench::ann
} // namespace raft::bench::ann
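
The `template class RaftIvfFlatGpu<...>;` lines in this file are explicit instantiation definitions. As a rough, self-contained illustration of that mechanism (this is not RAFT code; `Wrapper` and `count_nonzero` are hypothetical names invented for the sketch), a header can declare an instantiation `extern` so that including translation units do not instantiate it implicitly, and a single translation unit then provides the definition:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a wrapper class template (assumption, not a RAFT type).
template <typename T, typename IdxT>
struct Wrapper {
  // Counts the elements of data[0..n) that differ from T{}.
  IdxT count_nonzero(const T* data, IdxT n) const;
};

// Member definition; in a split-header layout this would live in an "-inl" style file.
template <typename T, typename IdxT>
IdxT Wrapper<T, IdxT>::count_nonzero(const T* data, IdxT n) const
{
  IdxT nonzero = 0;
  for (IdxT i = 0; i < n; ++i) {
    if (data[i] != T{}) { ++nonzero; }
  }
  return nonzero;
}

// Explicit instantiation declarations: what a header would expose so that
// users link against a pre-compiled instance instead of instantiating it again.
extern template struct Wrapper<float, std::int64_t>;
extern template struct Wrapper<std::uint8_t, std::int64_t>;

// Explicit instantiation definitions: what exactly one source file would contain.
// They are allowed in the same translation unit as long as they follow the declarations.
template struct Wrapper<float, std::int64_t>;
template struct Wrapper<std::uint8_t, std::int64_t>;
```

In a real project the declarations and definitions would sit in different files; they are combined here only so the sketch compiles on its own.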
1 change: 1 addition & 0 deletions cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h
@@ -29,6 +29,7 @@
#include <raft/neighbors/ivf_flat_types.hpp>
#include <raft/util/cudart_utils.hpp>
#include <rmm/device_uvector.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <stdexcept>
#include <string>
#include <type_traits>
4 changes: 0 additions & 4 deletions cpp/bench/ann/src/raft/raft_ivf_pq.cu
@@ -15,10 +15,6 @@
*/
#include "raft_ivf_pq_wrapper.h"

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

namespace raft::bench::ann {
template class RaftIvfPQ<float, int64_t>;
template class RaftIvfPQ<uint8_t, int64_t>;
4 changes: 0 additions & 4 deletions cpp/bench/prims/cluster/kmeans.cu
@@ -18,10 +18,6 @@
#include <raft/cluster/kmeans.cuh>
#include <raft/cluster/kmeans_types.hpp>

#if defined RAFT_COMPILED
#include <raft/cluster/specializations.cuh>
#endif

namespace raft::bench::cluster {

struct KMeansBenchParams {
4 changes: 0 additions & 4 deletions cpp/bench/prims/cluster/kmeans_balanced.cu
@@ -18,10 +18,6 @@
#include <raft/cluster/kmeans_balanced.cuh>
#include <raft/random/rng.cuh>

#if defined RAFT_COMPILED
#include <raft/cluster/specializations.cuh>
#endif

namespace raft::bench::cluster {

struct KMeansBalancedBenchParams {
3 changes: 0 additions & 3 deletions cpp/bench/prims/distance/distance_common.cuh
@@ -17,9 +17,6 @@
#include <common/benchmark.hpp>
#include <raft/distance/distance.cuh>
#include <raft/util/cudart_utils.hpp>
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif
#include <rmm/device_uvector.hpp>

namespace raft::bench::distance {
4 changes: 1 addition & 3 deletions cpp/bench/prims/distance/fused_l2_nn.cu
@@ -16,10 +16,8 @@

#include <common/benchmark.hpp>
#include <raft/distance/fused_l2_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/util/cudart_utils.hpp>
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif
#include <rmm/device_uvector.hpp>

namespace raft::bench::distance {
4 changes: 0 additions & 4 deletions cpp/bench/prims/distance/kernels.cu
@@ -13,10 +13,6 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif

#include <common/benchmark.hpp>
#include <memory>
#include <raft/core/device_resources.hpp>
4 changes: 0 additions & 4 deletions cpp/bench/prims/distance/masked_nn.cu
@@ -30,10 +30,6 @@
#include <raft/random/rng.cuh>
#include <raft/util/cudart_utils.hpp>

#ifdef RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif

namespace raft::bench::distance::masked_nn {

// Introduce various sparsity patterns
4 changes: 0 additions & 4 deletions cpp/bench/prims/matrix/select_k.cu
@@ -23,10 +23,6 @@
#include <raft/sparse/detail/utils.h>
#include <raft/util/cudart_utils.hpp>

#if defined RAFT_COMPILED
#include <raft/matrix/specializations.cuh>
#endif

#include <raft/matrix/detail/select_radix.cuh>
#include <raft/matrix/detail/select_warpsort.cuh>
#include <raft/matrix/select_k.cuh>
4 changes: 0 additions & 4 deletions cpp/bench/prims/neighbors/knn.cuh
@@ -24,10 +24,6 @@
#include <raft/neighbors/ivf_pq.cuh>
#include <raft/spatial/knn/knn.cuh>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

#include <rmm/mr/device/managed_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

5 changes: 0 additions & 5 deletions cpp/bench/prims/neighbors/refine_float_int64_t.cu
@@ -17,11 +17,6 @@
#include "refine.cuh"
#include <common/benchmark.hpp>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations/refine.cuh>
#include <raft/spatial/knn/specializations.cuh>
#endif

using namespace raft::neighbors;

namespace raft::bench::neighbors {
4 changes: 0 additions & 4 deletions cpp/bench/prims/neighbors/refine_uint8_t_int64_t.cu
@@ -17,10 +17,6 @@
#include "refine.cuh"
#include <common/benchmark.hpp>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

using namespace raft::neighbors;

namespace raft::bench::neighbors {
1 change: 1 addition & 0 deletions cpp/doxygen/Doxyfile
@@ -918,6 +918,7 @@ EXCLUDE_SYMLINKS = NO
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories for example use the pattern */test/*

# TODO: remove specializations from exclude patterns when headers have been removed.
EXCLUDE_PATTERNS = */detail/* \
*/specializations/* \
*/thirdparty/*
1 change: 1 addition & 0 deletions cpp/include/raft/cluster/detail/kmeans_common.cuh
@@ -38,6 +38,7 @@
#include <raft/distance/distance.cuh>
#include <raft/distance/distance_types.hpp>
#include <raft/distance/fused_l2_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/linalg/reduce_rows_by_key.cuh>
#include <raft/linalg/unary_op.cuh>
#include <raft/matrix/gather.cuh>
12 changes: 5 additions & 7 deletions cpp/include/raft/cluster/specializations.cuh
@@ -13,12 +13,10 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef __CLUSTER_SPECIALIZATIONS_H
#define __CLUSTER_SPECIALIZATIONS_H

#pragma once

#include <raft/distance/specializations.cuh>
#include <raft/neighbors/specializations.cuh>

#endif
#pragma message( \
__FILE__ \
" is deprecated and will be removed." \
" Including specializations is not necessary any more." \
" For more information, see: https://docs.rapids.ai/api/raft/nightly/using_libraft.html")
1 change: 1 addition & 0 deletions cpp/include/raft/core/mdarray.hpp
@@ -25,6 +25,7 @@
#include <stddef.h>

#include <raft/core/detail/macros.hpp>
#include <raft/core/device_resources.hpp>
#include <raft/core/host_device_accessor.hpp>
#include <raft/core/mdspan.hpp>
#include <raft/core/mdspan_types.hpp>
3 changes: 2 additions & 1 deletion cpp/include/raft/core/resource/device_memory_resource.hpp
@@ -18,6 +18,7 @@
#include <raft/core/resource/resource_types.hpp>
#include <raft/core/resources.hpp>
#include <rmm/mr/device/device_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

namespace raft::resource {
class device_memory_resource : public resource {
@@ -72,4 +73,4 @@ inline void set_workspace_resource(resources const& res, rmm::mr::device_memory_
{
res.add_resource_factory(std::make_shared<workspace_resource_factory>(mr));
};
} // namespace raft::resource
} // namespace raft::resource
3 changes: 2 additions & 1 deletion cpp/include/raft/core/resources.hpp
@@ -18,6 +18,7 @@
#include "resource/resource_types.hpp"
#include <algorithm>
#include <mutex>
#include <raft/core/error.hpp> // RAFT_EXPECTS
#include <raft/core/logger.hpp>
#include <string>
#include <vector>
@@ -128,4 +129,4 @@ class resources {
mutable std::vector<pair_res_factory> factories_;
mutable std::vector<pair_resource> resources_;
};
} // namespace raft
} // namespace raft
Original file line number Diff line number Diff line change
@@ -17,10 +17,11 @@
#pragma once

#include "gram_matrix.cuh"
#include <raft/util/cuda_utils.cuh>

#include <raft/distance/detail/kernels/rbf_fin_op.cuh>
#include <raft/distance/distance.cuh>
#include <raft/linalg/gemm.cuh>
#include <raft/util/cuda_utils.cuh>

namespace raft::distance::kernels::detail {

@@ -353,7 +354,7 @@ class RBFKernel : public GramMatrixBase<math_t> {
math_t gain = this->gain;
using index_t = int64_t;

auto fin_op = [gain] __device__(math_t d_val, index_t idx) { return exp(-gain * d_val); };
rbf_fin_op fin_op{gain};
raft::distance::distance<raft::distance::DistanceType::L2Unexpanded,
math_t,
math_t,
51 changes: 51 additions & 0 deletions cpp/include/raft/distance/detail/kernels/rbf_fin_op.cuh
@@ -0,0 +1,51 @@
/*
* Copyright (c) 2019-2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

/*
* This file defines rbf_fin_op, which is used in GramMatrixBase.
*
* This struct has been moved to a separate file, so that it is cheap to include
* in distance/distance-ext.cuh, where an instance of raft::distance::distance
* with the rbf_fin_op is instantiated.
*
*/

#include <raft/core/math.hpp> // raft::exp
#include <raft/util/cuda_dev_essentials.cuh> // HD

namespace raft::distance::kernels::detail {

/** @brief: Final op for Gram matrix with RBF kernel.
*
* Calculates output = e^(-gain * in)
*
*/
template <typename OutT>
struct rbf_fin_op {
OutT gain;

explicit HD rbf_fin_op(OutT gain_) noexcept : gain(gain_) {}

template <typename... Args>
HDI OutT operator()(OutT d_val, Args... unused_args)
{
return raft::exp(-gain * d_val);
}
}; // struct rbf_fin_op

} // namespace raft::distance::kernels::detail
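
The header comment above says the functor exists so that an instance of `raft::distance::distance` using it can be instantiated elsewhere. One plausible reading is that a named functor type can be spelled out in an explicit instantiation, whereas the closure type of the lambda it replaces cannot be named. The following host-only sketch illustrates that point under stated assumptions: `rbf_fin_op_sketch` mirrors the struct above but drops the `HD`/`HDI` macros and uses `std::exp` in place of `raft::exp`, and `apply_fin_op` is a hypothetical consumer, not a RAFT function.

```cpp
#include <cassert>
#include <cmath>

// Host-only mirror of rbf_fin_op (assumption: device annotations omitted,
// std::exp substituted for raft::exp).
template <typename OutT>
struct rbf_fin_op_sketch {
  OutT gain;

  explicit rbf_fin_op_sketch(OutT gain_) noexcept : gain(gain_) {}

  // Extra arguments (for example an index) are accepted and ignored.
  template <typename... Args>
  OutT operator()(OutT d_val, Args... /*unused_args*/) const
  {
    return std::exp(-gain * d_val);
  }
};

// Hypothetical consumer template that applies a final op to one value.
template <typename T, typename FinOp>
T apply_fin_op(T d_val, FinOp fin_op)
{
  return fin_op(d_val, 0);  // the second argument stands in for an index
}

// The functor type has a name, so this instantiation can be requested
// explicitly. A lambda's closure type has no name that could be written here.
template double apply_fin_op<double, rbf_fin_op_sketch<double>>(double,
                                                                 rbf_fin_op_sketch<double>);
```

Calling `apply_fin_op(1.0, rbf_fin_op_sketch<double>{1.0})` evaluates `exp(-1.0)`, matching the `output = e^(-gain * in)` formula in the doc comment.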