Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RELEASE] raft v23.06 #1549

Merged
merged 88 commits into from
Jun 7, 2023
Merged

[RELEASE] raft v23.06 #1549

merged 88 commits into from
Jun 7, 2023

Conversation

raydouglass
Copy link
Member

❄️ Code freeze for branch-23.06 and v23.06 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-23.06 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-23.06 into main for the release

jolorunyomi and others added 30 commits March 19, 2023 07:03
…23.06

Forward merge branch 23.04 into 23.06
Forward-merge branch-23.04 to branch-23.06
Forward-merge branch-23.04 to branch-23.06
Forward-merge branch-23.04 to branch-23.06
Forward-merge branch-23.04 to branch-23.06
Update the ucx-py pinning for raft-dask 23.06

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Mark Sadang (https://github.com/msadang)

URL: #1396
Resolve conflicts in auto-merger of `branch-23.06` and `branch-23.04`
This will remove 1h from our conda CI builds since we can now re-use the cached object files between `libraft` and `libraft-tests`

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Ben Frederickson (https://github.com/benfred)
  - Divye Gala (https://github.com/divyegala)

URL: #1401
This setting now matches the default behavior of the shared-action-workflows repo

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1406
This PR mainly intends to replace `device_matrix_view` for `ivf_pq::extend` to `device_vector_view`.
There are also a few updates to the documentation to reflect the current API.
The order of the arguments in the API is not touched.

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1384
cc @MatthiasKohl  @bdice 

Making sure CI agrees w/ this change. @MatthiasKohl, if CI succeeds here let's try to plug the resulting conda packages into a cugraph-ops PR to make sure cugraph-ops CI is happy as well.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Divye Gala (https://github.com/divyegala)

URL: #1386
This PR is updating the runner labels to use ARC V2 self-hosted runners for GPU jobs. This is needed to resolve the auto-scalling issues.

Authors:
  - Jordan Jacobelli (https://github.com/jjacobelli)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1410
Correlation and Cosine distance both return (1 - similarity) in the pairwise distances apis, meaning that is_min_close is returning the wrong sort order for them. Fix.

Authors:
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1419
Add public functions for reading and writing into individual ivf-pq lists (clusters), in the input space (reconstructed data) and in flat PQ codes.

Partially solves (IVF-PQ) #1205

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1298
The `index_` is not yet initialized. To construct the dataset view, we need to use the `dim_` variable which was set in the constructor.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1427
This PR removes the verbose flag from wheel pytest commands

Authors:
  - Sevag H (https://github.com/sevagh)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1424
This PR updates raft to use `#include <nvtx3/nvToolsExt.h>` instead of `#include <nvToolsExt.h>`. This ensures we fetch the header-only NVTX v3. See NVTX docs for more information: https://nvidia.github.io/NVTX/#c-and-c

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Ben Frederickson (https://github.com/benfred)

URL: #1431
Stop using the MetricProcessor code to preprocess the inputs to the bfknn calls. Since the pairwise distance API supports both cosine and correlation distance, this wasn't required anymore - and it introduced NaN values to the input when passed a dataset with one of the rows being all zero.

Authors:
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1426
mfoerste4 and others added 17 commits May 17, 2023 20:20
A assertion could fail if the result kernel matrix is column major but has only a single row. In this case the old implementation identified the matrix as row major instead and the following stride-check failed.

Now the check is extended to identify both layouts. The only case where it cannot be exactly determined is a 1x1 matrix in which the following code just picks column major as default.
CC @cjnolet @tfeher

Authors:
  - Malte Förster (https://github.com/mfoerste4)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1521
This uses the select_k dataset from #1497 to learn a heuristic of the fastest select_k variant based off the rows/ cols/k of the input. This heuristic is modelled as a DecisionTree, which is automatically exported in C++ code that is compiled into RAFT. This lets us learn a function to pick the fastest select_k method - which requires only a few if statements in C++ code, making it very cheap to evaluate.

Authors:
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1523
…es (#1470)

This PR adds bunch of new device reduction functions such as: 

- Generic device reductions that takes reduction operator as argument.
- Ranked reductions to return the index/rank of the reduced value.
- Weighted random reduction to have probabilistic reduction using conditional probability.
- Binary reduction to reduce binary values more efficiently.

There are tests implemented for all device reduction operations.

This PR also separates warp primitives to the `warp_primitives.cuh`.
All reduction functions are moved to `reduction.cuh`

Authors:
  - Akif ÇÖRDÜK (https://github.com/akifcorduk)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #1470
This PR configures `raft` docs builds to also run nightly (not just on PR merges only)

Authors:
  - Jake Awe (https://github.com/AyodeAwe)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1520
The TiledKNNTest test was faiiling - and it seems to be because the matrix::select_k code isn't guaranteed to return elements in sorted order. The test was expecting outputs to be sorted, and was failing because of it. This change fixes the test to sort the outputs before comparing.

Closes #1526

Authors:
  - Ben Frederickson (https://github.com/benfred)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1533
Fix ivf-flat's `recompute_internal_state` incorrectly using the amortized list sizes to compute the index size.

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1525
This PR adds Python API for IVF-Flat serialization.

closes #752

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1516
-- as the kernel arch given by the cudaFuncAttribute ptxVersion depends on what archs the kernel was compiled for
we should renam kernel_runtime_arch() as kernel_virtual_arch().
-- accordingly update comments to reflect this.

Authors:
  - Mahesh Doijade (https://github.com/mdoijade)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1536
The input validation code wasn't being triggered for the python bfknn api, causing invalid output when passed col-major inputs. Fix.

Authors:
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1537
This PR updates CAGRA::search to support `uint64_t` in the index data type. This update is required to implement the RAFT ANN benchmark for CAGRA.

Authors:
  - tsuki (https://github.com/enp1s0)
  - Ben Frederickson (https://github.com/benfred)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #1514
A prototype that introduces a per-sample filtering for IVFPQ search. Please feel free to use it as a foundation for the future change, if appropriate, because the code is functional, but is not super clean-and-neat.

The diff introduces a template parameter called `SampleFilterT`. An instance is expected 
* to be `SampleFilterT()` constructible (which was mostly needed to define a default behavior in the form of `SampleFilterT sample_filter=SampleFilterT()`, see below)
* to provide a `inline __device__ bool operator(...)`  that returns `true` is a given sample is valid for being used against a given query in IVFPQ search

The default filter (that I set as a default one in certain facilities in the form of `typename SampleFilterT = NoneSampleFilter` in order not to modify way to many files) allows all samples to be used:
```
struct NoneSampleFilter {
  inline __device__ __host__ bool operator()(
    // query index
    const uint32_t query_ix,
    // the current inverted list index
    const uint32_t cluster_ix,
    // the index of the current sample inside the current inverted list
    const uint32_t sample_ix
  ) const {
    return true;
  }
};
```
Here `__host__` is needed for a CPU-based testing only.
Also, I've provided an implementation of `BitMaskSampleFilter` that allows to filter samples based on a bit mask, as an example. The implementation was tested in the semi-production environment.

All the filter-related files were added to `cpp/include/raft/neighbors/detail/sample_filter.cuh`.

I did not change the default `ivf_pq_search()` method remains unchanged, but one more `ivf_pq_search_with_filtering()` method with an additional template argument `SampleFilterT` and one more input parameter was introduced.
```
template <typename T, typename IdxT, typename SampleFilterT>
void search_with_filtering(raft::device_resources const& handle,
            const raft::neighbors::ivf_pq::search_params& params,
            const index<IdxT>& idx,
            const T* queries,
            uint32_t n_queries,
            uint32_t k,
            IdxT* neighbors,
            float* distances,
            rmm::mr::device_memory_resource* mr = nullptr,
            SampleFilterT sample_filter = SampleFilterT());
```

All the current instantiations use `NoneSampleFilter` only.

I've used `SampleFilterT sample_filter` parameters passing instead of `const SampleFilterT sample_filter` in the function calls in order to be able to add some debugging facilities to a filter and with the hope that the compiler is smart enough to understand the de-facto constness if needed.

The filter does not take a computed distance score into account by design, thus the current implementation cannot have a distance threshold. This can be easily changed, if appropriate.

It is still questionable to me whether this filtering needs to be injected right inside the search kernel instead of doing post-processing, please let me know if you have any thoughts on the topic.

I'm happy to address the comments.

Thanks.

Authors:
  - Alexander Guzhva (https://github.com/alexanderguzhva)
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1513
…uments (#1539)

closes #1357

breaking change: the type of argument mdspans has slightly changed (second template parameter fixed to `uint32_t`)

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1539
Remove the device_resources.cuh include from linalg::map.

For the implicit integration, I don't have libraries like cusolver in the CI environment, and the build is currently failing with errors like

```
/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/raft-src/cpp/include/raft/core/device_resources.hpp:31:10:
       fatal error: cusolverDn.h: No such file or directory
```

(from https://github.com/benfred/implicit/actions/runs/5033022104/jobs/9026999533?pr=656)

Fix by not including device_resources here

Authors:
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1540
Align with the rest of RAPIDS on these requirements. Also needed for CUDA 12 support.

Authors:
   - https://github.com/jakirkham

Approvers:
   - Corey J. Nolet (https://github.com/cjnolet)
   - Ray Douglass (https://github.com/raydouglass)
   - Bradley Dice (https://github.com/bdice)
mdoijade and others added 2 commits June 6, 2023 12:49
…me buffer but with different shape and add unit test for such case. (#1571)

-- This is how tiled_brute_force_knn may use pairwise distance API hence assuming when X == Y the buffer has same shape is incorrect.

Authors:
   - Mahesh Doijade (https://github.com/mdoijade)

Approvers:
   - Tamas Bela Feher (https://github.com/tfeher)
   - Corey J. Nolet (https://github.com/cjnolet)
@raydouglass raydouglass merged commit f626bf1 into main Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.