[RELEASE] v22.04 #612

Merged
merged 207 commits into from
Apr 6, 2022

raydouglass
Member

No description provided.

raydouglass and others added 30 commits July 15, 2021 17:02
[gpuCI] Forward-merge branch-21.08 to branch-21.10 [skip ci]
The original approach of naively using FetchContent has a subtle
bug when multiple projects that use rapids-cmake are combined as sibling projects. This bug causes any
`include(rapids-*)` commands to fail with CMake errors.

By using `RAPIDS.cmake` we can resolve this issue and remove
the new complex logic from each consumer.

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #298
[gpuCI] Forward-merge branch-21.08 to branch-21.10 [skip ci]
…303)

This PR will remove max version pinning for dask & distributed for development purposes.

ref: rapidsai/cudf#8881

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - https://github.com/jakirkham

URL: #303
This PR fixes current RAFT C++/CUDA compilation warnings and turns on -Wall to treat warnings as errors.

Fixes #225
Fixes #289

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #299
[gpuCI] Forward-merge branch-21.08 to branch-21.10 [skip gpuci]
`mamba` was recently added to the gpuCI build environment. This PR tests its usage and solvability, which should speed up build times.

Authors:
  - Dillon Cullinan (https://github.com/dillon-cullinan)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #295
Adds `-Werror=all-warnings` NVCC flag to ensure all CUDA device code warnings are treated as errors. Only enabled on CUDA 11.2+ because CUDA 11.0 has PTXAS warnings that go away in newer CUDA versions.

Missed this in #299.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #307
Warnings missed in #299...

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #311
Fix Forward-Merge Conflicts [skip ci]
Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Jordan Jacobelli (https://github.com/Ethyling)

URL: #319
…on distance metrics support (#306)

This PR introduces the following distances:
- Hamming
- Jensen-Shannon
- Russell-Rao
- KL-Divergence
- Correlation
with unit tests for each of them.
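
For orientation, two of these metrics can be sketched on the host with textbook definitions. This is only an illustration: the function names are hypothetical, and the normalization conventions (for example, Hamming as a fraction of mismatching positions) are assumptions that may differ from what RAFT implements.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of positions at which x and y differ
// (normalized Hamming distance; the normalization is an assumption).
double hamming_distance(const std::vector<double>& x, const std::vector<double>& y)
{
  assert(x.size() == y.size() && !x.empty());
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i] != y[i]) { ++mismatches; }
  }
  return static_cast<double>(mismatches) / static_cast<double>(x.size());
}

// Hypothetical helper: KL divergence sum_i p_i * log(p_i / q_i) for two
// discrete probability distributions. Terms with p_i == 0 contribute 0.
double kl_divergence(const std::vector<double>& p, const std::vector<double>& q)
{
  assert(p.size() == q.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i) {
    if (p[i] > 0.0) { sum += p[i] * std::log(p[i] / q[i]); }
  }
  return sum;
}
```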

Authors:
  - Mahesh Doijade (https://github.com/mdoijade)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Brad Rees (https://github.com/BradReesWork)

URL: #306
Miscellaneous updates to address technical debt in RAFT:
- [x] Removal of handle host and device allocators
- [x] Addition of a `get_thrust_policy` method to the handle
- [x] Usage of `get_thrust_policy` where handle is available
- [x] Removal of `rmm::device_vector`
- [x] Use of RMM device allocator in the `raft::allocate` function
- [x] Creation of an allocation + deallocation helper system
- [x] Usage of `rmm::exec_policy` instead of `thrust::cuda::par.on` when no handle is available

Authors:
  - Victor Lafargue (https://github.com/viclafargue)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #286
This combines some general CMake style cleanup and brings new rapids-cmake features to RAFT including:

- Usage of `rapids_cmake_install_lib_dir` to make sure we install raft correctly on non-Debian-based distros (lib64), while also handling conda installation requirements (always lib, no matter the distro)
- Usage of `rapids_cpm` pre-configured packages
- Removal of early termination before `rapids_cpm_find` since a better solution now exists ( rapidsai/rapids-cmake#49 )

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #320
The current `bcast` function takes a single `value_t*` pointer (MPI style) for both input (if root) and output (if non-root).

This does not compile if we have a `const value_t*` pointer for input. This PR adds a `bcast` function that takes separate `const value_t*` input and `value_t*` output pointers (NCCL style).
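
The shape of the two-pointer signature can be sketched as follows. This is a single-process toy, not RAFT's communicator API: `bcast_sketch` is a hypothetical name, and the "broadcast" is just a copy from the send buffer to the receive buffer so that the example runs without MPI or NCCL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical NCCL-style signature: the input is read-only, so callers
// holding a `const value_t*` can pass it without a const_cast.
// In this toy, broadcasting is modeled as a plain copy.
template <typename value_t>
void bcast_sketch(const value_t* sendbuff, value_t* recvbuff, std::size_t count)
{
  assert(count == 0 || (sendbuff != nullptr && recvbuff != nullptr));
  std::copy(sendbuff, sendbuff + count, recvbuff);
}
```

A caller with `const int src[3]` and `int dst[3]` can call `bcast_sketch(src, dst, 3)` directly, which is the case the single-pointer form cannot express.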

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #328
Fix wrong `lda` parameter in raft::linalg::gemv. `lda` should always be along the `n_rows` direction, independently of `trans_a`. I also took the liberty of adding a couple more overloads and documentation.
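
A host-side reference makes the point about `lda` concrete. The sketch below assumes column-major (cuBLAS-style) storage; `gemv_reference` is a hypothetical helper, not the RAFT function. Element (i, j) is read from `A[i + j * lda]` whether or not the transpose is requested, so `lda` is tied to `n_rows` in both cases.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference gemv for a column-major n_rows x n_cols matrix A.
// Computes y = A * x when trans_a is false and y = A^T * x when it is true.
// The indexing A[i + j * lda] does not depend on trans_a.
std::vector<double> gemv_reference(const std::vector<double>& A,
                                   std::size_t n_rows,
                                   std::size_t n_cols,
                                   std::size_t lda,
                                   const std::vector<double>& x,
                                   bool trans_a)
{
  assert(lda >= n_rows);
  assert(A.size() >= lda * n_cols);
  assert(x.size() == (trans_a ? n_rows : n_cols));
  std::vector<double> y(trans_a ? n_cols : n_rows, 0.0);
  for (std::size_t j = 0; j < n_cols; ++j) {
    for (std::size_t i = 0; i < n_rows; ++i) {
      const double a = A[i + j * lda];
      if (trans_a) {
        y[j] += a * x[i];
      } else {
        y[i] += a * x[j];
      }
    }
  }
  return y;
}
```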

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Brad Rees (https://github.com/BradReesWork)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #327
…y gmem to store intermediate distances (#324)

Benchmarked with the cuML Python interface kNN datasets (which claim to generate a Gaussian distribution), with up to 200k x 128 database/query vectors.
I found different behavior on my small GP107 GPU versus a GA102 (Tesla A40):
- On GP107, fused L2 kNN is slower on larger datasets.
- On GA102, fused L2 kNN is consistently faster, by approximately **1.15x-1.5x**, for all datasets I tried (except 200k x 128).

A separate PR will add an L2 expanded version of fused L2 kNN, so that for larger dimensions (> 128) the distance computation in fused L2 kNN won't become the bottleneck.

There is scope to optimize the distance computation in fused L2 kNN, as it does not yet use vectorized LDG/STS.

Overall, fused L2 kNN appears to be better on GPUs with decent compute power, but not on small, older GPUs like GP107.

Benchmarking with the cuML C++ kNN regression tests gives the following performance.
On A30 (GA100), for NN == 64 with a resultant distance matrix of 1M x 1M:
Fused L2 kNN = 11550 ms
FAISS kNN = 23933 ms
**Overall 2.07x faster**
For NN == 32, it is **1.43x faster**, with these runtimes:
Fused L2 kNN = 11198 ms
FAISS kNN = 16124 ms

Authors:
  - Mahesh Doijade (https://github.com/mdoijade)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #324
When we make a new raft version, we also need to bump the rapids-cmake version at the same time. Otherwise we will get the previous release's dependencies by mistake.

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #331
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
This PR is a proof of concept to use the triangle inequality to prune the tree of `O(n^2)` exhaustive distance computations into something smaller, such as on the order of `O(c^{3/2} * \sqrt{n})`, where c is called an expansion constant, based on the dimensionality.

This should benefit both sparse and dense k-nearest neighbors and all algorithms that use them, hopefully providing a significant speedup for our sparse semirings primitive when only the k-nearest neighbors are desired.


The goal here is to construct a tree out of the random ball cover algorithm such that we can utilize it in algorithms which would otherwise be able to make efficient use of a ball tree. However, there are additional challenges to this algorithm on the GPU, such as being able to batch the tree lookups.
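
The pruning idea can be illustrated with a single landmark on the host. This is a simplified sketch, not the PR's random ball cover implementation: all names are hypothetical, only one landmark is used, and only the single nearest neighbor is returned. The triangle inequality gives the lower bound d(q, p) >= |d(q, l) - d(p, l)|, so any point whose bound is no better than the current best can be skipped without computing its distance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

double euclidean(const Point& a, const Point& b)
{
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double d = a[i] - b[i];
    s += d * d;
  }
  return std::sqrt(s);
}

struct SearchResult {
  std::size_t index;   // index of the nearest point
  double distance;     // distance from the query to that point
  std::size_t pruned;  // number of distance computations skipped
};

// Exact 1-nearest-neighbor search with one landmark.
// dist_to_landmark[i] must hold euclidean(points[i], landmark), computed once up front.
SearchResult nearest_with_pruning(const std::vector<Point>& points,
                                  const Point& landmark,
                                  const std::vector<double>& dist_to_landmark,
                                  const Point& query)
{
  assert(!points.empty() && points.size() == dist_to_landmark.size());
  const double dq = euclidean(query, landmark);
  SearchResult best{0, std::numeric_limits<double>::infinity(), 0};
  for (std::size_t i = 0; i < points.size(); ++i) {
    // Triangle inequality lower bound on d(query, points[i]).
    const double lower_bound = std::fabs(dq - dist_to_landmark[i]);
    if (lower_bound >= best.distance) {
      ++best.pruned;
      continue;
    }
    const double d = euclidean(query, points[i]);
    if (d < best.distance) {
      best.index    = i;
      best.distance = d;
    }
  }
  return best;
}
```

With many landmarks, each point would typically be bounded through its closest landmark, which tightens the bound and prunes more candidates.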

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - William Hicks (https://github.com/wphicks)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #213
raydouglass and others added 22 commits March 22, 2022 00:15
Remove line referencing deleted file

Authors:
  - Ray Douglass (https://github.com/raydouglass)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #560
`clang-format -version` installed with apt on Ubuntu reports `Ubuntu clang-format version 11.0.0-2~ubuntu20.04.1`, so we need an unanchored search here.

(Equivalent to rapidsai/cuml@dafcd6f.)
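
The difference between an anchored match and an unanchored search can be sketched as below. The repository's actual check is not shown here, so this is an illustration only: the helper name and the exact pattern are assumptions.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: extract the first "major.minor.patch" triple that
// follows "clang-format version", wherever it appears in the output.
// std::regex_search is unanchored, so a vendor prefix such as "Ubuntu " is fine.
// Returns an empty string if no version is found.
std::string extract_clang_format_version(const std::string& output)
{
  static const std::regex version_re(R"(clang-format version (\d+\.\d+\.\d+))");
  std::smatch m;
  if (std::regex_search(output, m, version_re)) {
    assert(m.size() == 2);  // full match plus one capture group
    return m[1].str();
  }
  return "";
}
```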

Authors:
  - Zach Bjornson (https://github.com/zbjornson)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #573
…565)

Fixes errors configuring RAFT now that rapids-cmake [is enforcing](rapidsai/rapids-cmake#168) GTest v1.10.0.

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #565
Including `raft/linalg/transpose.cuh` appears to work today, but a few weeks ago it didn't because of these missing includes. Either way, these should be here because they're used.

I can't figure out how to get include-what-you-use to process .cuh files, but that would be a nice check for all of the RAPIDS repos.

Authors:
  - Zach Bjornson (https://github.com/zbjornson)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #575
Pin changes to be in line with rapidsai/cudf#10481

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #581
This PR includes a few fixes to support source-only builds:
1. Defines linkage to `cuco::cuco` if the `RAFT_ENABLE_cuco_DEPENDENCY` variable is set, not if `cuco_ADDED` is true
2. Adds a flag to control the `EXCLUDE_FROM_ALL` for the faiss dependency. This flag can be off for conda builds, but on for C++-only source builds
3. Writes `version_config.hpp` header and fixes a potential GoogleBench issue

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #583
I recently committed a config file to be used by the [rapidsai/ops-bot](https://github.com/rapidsai/ops-bot/) and in hindsight, I should've had the new `external_contributors` functionality set to `false` until we're ready to roll it out everywhere. This PR fixes that.

Authors:
   - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
Finalizing some more bits of the docs. This has also included cleaning up several header files to make the docs a little more clean.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #566
This PR updates the commit hash for cuCollections to include the changes in NVIDIA/cuCollections#138. cudf depends on those changes in 22.04, and some of our CI builds of cudf are finding the version of cuco installed by raft and then failing, so I'm making this change to 22.04 even though we're in code freeze. Happy to work with ops on an alternate solution if there are concerns about the update, though.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #592
Integrate two new implementations for knn's `select_k` function.
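
As a reminder of what a `select_k` operation computes, here is a host-side reference sketch. It is not one of the two implementations this PR integrates; the name `select_k_reference` and the choice to return (value, index) pairs in ascending order are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reference: return the k smallest values together with their
// original indices, sorted ascending by value (ties broken by index).
std::vector<std::pair<float, std::size_t>> select_k_reference(const std::vector<float>& values,
                                                              std::size_t k)
{
  assert(k <= values.size());
  std::vector<std::pair<float, std::size_t>> keyed(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    keyed[i] = {values[i], i};
  }
  // Only the first k positions need to end up sorted.
  std::partial_sort(keyed.begin(), keyed.begin() + static_cast<std::ptrdiff_t>(k), keyed.end());
  keyed.resize(k);
  return keyed;
}
```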

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #551
CMake 3.23 has a bug that breaks our conda-build based builds in CI; this PR avoids that issue.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #600
We don't have arm64 packages for `rapids-doc-env`, so only run the doc builds for x86_64

Authors:
   - Ray Douglass (https://github.com/raydouglass)

Approvers:
   - AJ Schmidt (https://github.com/ajschmidt8)