Approximate Nearest Neighbors #2780

viclafargue · 2020-09-01T17:25:08Z

This PR enables multiple KNN strategies as alternatives to the current default brute-force method. See #574
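To make the distinction concrete, here is a minimal pure-Python sketch of exact brute-force search next to an inverted-file (IVF) style approximate search. It is not the cuML or FAISS implementation. The function names (`brute_force_knn`, `build_ivf`, `ivf_knn`) are hypothetical, and the centroids are sampled at random as a crude stand-in for the k-means training a real index would perform.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(data, query, k):
    """Exact search: compare the query against every point."""
    order = sorted(range(len(data)), key=lambda i: sq_dist(data[i], query))
    return order[:k]


def build_ivf(data, centroids):
    """Assign each point to its nearest centroid (one inverted list per centroid)."""
    lists = [[] for _ in centroids]
    for i, point in enumerate(data):
        nearest = min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))
        lists[nearest].append(i)
    return lists


def ivf_knn(data, centroids, lists, query, k, nprobe):
    """Approximate search: scan only the nprobe inverted lists closest to the query."""
    probed = sorted(range(len(centroids)), key=lambda j: sq_dist(query, centroids[j]))[:nprobe]
    candidates = [i for j in probed for i in lists[j]]
    candidates.sort(key=lambda i: sq_dist(data[i], query))
    return candidates[:k]


rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(200)]
centroids = rng.sample(data, 8)  # assumption: random picks instead of k-means
lists = build_ivf(data, centroids)
query = (0.5, 0.5)

exact = brute_force_knn(data, query, k=5)
approx = ivf_knn(data, centroids, lists, query, k=5, nprobe=2)
# Probing every list makes every point a candidate, so nothing can be missed.
full = ivf_knn(data, centroids, lists, query, k=5, nprobe=len(centroids))
```

The `nprobe` argument is the speed/accuracy knob: fewer probed lists means fewer distance computations, at the risk of missing true neighbors that fell into an unprobed list.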

GPUtester · 2020-09-01T17:25:34Z

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

cjnolet

I'm super excited to start supporting approximate algorithms in cuml. Just took a quick look and added some initial feedback

cpp/include/cuml/neighbors/knn.hpp

cpp/src_prims/selection/knn.cuh

cpp/include/cuml/neighbors/knn.hpp

python/cuml/neighbors/nearest_neighbors.pyx

cjnolet · 2020-09-14T15:56:25Z

@WXBN have you collected any timings on these indices vs the brute force indices?

cjnolet

This is looking really good so far. I did a first pass through the code.

python/cuml/neighbors/nearest_neighbors.pyx

python/cuml/neighbors/__init__.py

python/cuml/neighbors/nearest_neighbors.pyx

python/cuml/test/test_nearest_neighbors.py

cjnolet · 2020-09-16T22:09:04Z

cpp/include/cuml/neighbors/knn.hpp

@@ -36,36 +38,87 @@ enum MetricType {
  METRIC_Correlation
 };

+struct knnIndex {
+  faiss::gpu::StandardGpuResources *gpu_res;


Either before or shortly after this PR is merged, we need to update FAISS in cuML and use their new pluggable memory manager feature (facebookresearch/faiss#1203). While the brute-force computation uses only a very small workspace, the approximate index variants put FAISS in complete control of the index's memory space (through StandardGpuResources).

viclafargue · 2020-11-30T16:12:57Z

Sorry, I forgot about this one. Sure, it's probably preferable to defer it to 0.18. Also, I'll have to recheck the automatic choice of parameters and maybe move it to the Python code.

viclafargue · 2020-12-08T15:47:16Z

The code was updated to the current branch-0.18. Automated parameter determination is now done in the Python layer of the code, which makes it easier to maintain and test. I also modified the automatic determination of ivfpq parameters to prioritize speed over accuracy. It still passes the tests and is much faster.
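As an illustration of what parameter determination in the Python layer could look like, here is a hedged sketch. It is not the heuristic this PR implements: the function name `guess_ivfpq_params`, the square-root rule for `nlist`, and the divisors used for `nprobe` and `M` are all assumptions made for the example. The one hard constraint it encodes is that product quantization needs the number of sub-quantizers `M` to divide the feature dimension evenly.

```python
import math


def guess_ivfpq_params(n_rows, n_cols, prefer_speed=True):
    """Hypothetical heuristic for IVFPQ parameters (not the PR's actual rules)."""
    # Assumption: roughly sqrt(n_rows) inverted lists.
    nlist = max(1, round(math.sqrt(n_rows)))
    # Probing fewer lists is faster but can miss true neighbors.
    nprobe = max(1, nlist // (20 if prefer_speed else 5))
    # Fewer sub-quantizers give shorter codes: faster to scan, coarser distances.
    target_m = max(1, n_cols // (8 if prefer_speed else 2))
    # M must divide n_cols; pick the largest allowed power of two up to the target.
    m = max(d for d in (1, 2, 4, 8, 16, 32, 64) if n_cols % d == 0 and d <= target_m)
    return {"nlist": nlist, "nprobe": nprobe, "M": m, "n_bits": 8}


fast = guess_ivfpq_params(10000, 32, prefer_speed=True)
accurate = guess_ivfpq_params(10000, 32, prefer_speed=False)
```

Keeping such logic in plain Python means it can be unit-tested without building an index or touching the GPU, which fits the maintainability argument above.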

cjnolet · 2020-12-08T19:50:03Z

@viclafargue, it looks like CI ran out of space in the tests. The tests also look pretty exhaustive in terms of parameters. Any opportunities to prune the parameter grid?

14:19:05 cuml/test/test_nearest_neighbors.py::test_ivfpq_pred[8-512-10000-False-4-32-8] Faiss assertion 'err == cudaSuccess' failed in void faiss::gpu::freeMemorySpace(faiss::gpu::MemorySpace, void*) at gpu/utils/MemorySpace.cpp:72; details: Failed to cudaFree pointer 0x7f41829fe200 (error 700 an illegal memory access was encountered)
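One common way to prune a test grid is to vary a single parameter at a time around a baseline instead of taking the full Cartesian product. The sketch below shows the difference in case count. The parameter names and values are hypothetical and are not the grid used in `test_nearest_neighbors.py`, and `pruned_grid` is an illustrative helper rather than anything from the PR.

```python
import itertools

# Hypothetical grid; names and values are for illustration only.
grid = {
    "nlist": [4, 8],
    "nprobe": [2, 4],
    "M": [16, 32],
    "n_bits": [4, 8],
    "nrows": [1000, 10000],
    "ncols": [64, 512],
}

# Full Cartesian product: the case count multiplies with every parameter.
full_grid = list(itertools.product(*grid.values()))


def pruned_grid(grid):
    """One baseline case plus one case per non-baseline value of each parameter."""
    baseline = {name: values[0] for name, values in grid.items()}
    cases = [tuple(baseline.values())]
    for name, values in grid.items():
        for value in values[1:]:
            case = dict(baseline, **{name: value})  # key order is preserved
            cases.append(tuple(case.values()))
    return cases


small_grid = pruned_grid(grid)
print(len(full_grid), len(small_grid))
```

Either list of tuples could be handed to `pytest.mark.parametrize`. The pruned version gives up coverage of parameter interactions in exchange for far fewer index builds, so it trades thoroughness for CI time and memory.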

cjnolet

LGTM

cjnolet · 2020-12-11T20:28:04Z

@viclafargue, I think the changelog is automatic now. Can you remove the changelog update?

cjnolet · 2020-12-11T21:40:09Z

Talked to Victor and offered to make the changelog update so we can get this in.

ajschmidt8 · 2020-12-11T22:01:56Z

rerun tests

Multiple KNN strategies (implementing PQ)

4124e5a

viclafargue requested review from a team as code owners September 1, 2020 17:25

cjnolet reviewed Sep 1, 2020

Multiple improvements

132acab

viclafargue force-pushed the fea-multiple-knn-strategies branch from 22d9ffa to 132acab Compare September 3, 2020 15:38

viclafargue added 11 commits September 3, 2020 17:30

Adding nprobe parameter

fb82831

Adding support for GpuIndexIVFFlat and GpuIndexIVFScalarQuantizer

729b3a4

Completing documentation

2fa57fa

Small fixes

5c44a18

Merge branch 'branch-0.16' into fea-multiple-knn-strategies

8a46e42

Adding test

9c05d87

Improving tests

fa7d004

Corrections & improvements

2c66690

Check style

51362a7

Update changelog

5837cc8

Adding include

49bb435

viclafargue mentioned this pull request Sep 10, 2020

[FEA] ANN for KNN Classifier & Regressor #2809

Open

JohnZed changed the title ~~[WIP] Multiple KNN strategies~~ [REVIEW] Multiple KNN strategies Sep 14, 2020

JohnZed added the 3 - Ready for Review Ready for review by team label Sep 14, 2020

cjnolet requested changes Sep 16, 2020

dantegd added 4 - Waiting on Author Waiting for author to respond to review and removed 3 - Ready for Review Ready for review by team labels Sep 16, 2020

Merge branch-0.16

0d3cfe1

viclafargue mentioned this pull request Sep 24, 2020

[FEA] Enable more metrics for Approximate Nearest Neighbors methods #2868

Closed

viclafargue added 3 commits September 24, 2020 16:03

First part of requested changes

f39cc21

ANN parameters creation in separate file

1a3af42

Updating ANN methods documentation

636ce58

viclafargue changed the base branch from branch-0.17 to branch-0.18 December 8, 2020 10:28

viclafargue added breaking Breaking change feature request New feature or request labels Dec 8, 2020

viclafargue added 3 commits December 8, 2020 13:26

Merge branch-0.18

5eb0060

Automated parameter determination to Python code

046a127

Update changelog according to PR name

c60d54c

viclafargue changed the title ~~[REVIEW] Multiple KNN strategies~~ [REVIEW] Approximate Nearest Neighbors Dec 8, 2020

viclafargue added 5 commits December 9, 2020 11:31

Lower values for ivfpq testing + testing trim down

96bddef

Merge branch 'branch-0.18' into fea-multiple-knn-strategies

8775a42

Update ivfpq test

6a97101

Force index memory release in tests

7e8ee31

Merge branch 'branch-0.18' into fea-multiple-knn-strategies

937e799

cjnolet approved these changes Dec 11, 2020

cjnolet added 4 - Waiting on Author Waiting for author to respond to review and removed 4 - Waiting on Author Waiting for author to respond to review labels Dec 11, 2020

cjnolet changed the title ~~[REVIEW] Approximate Nearest Neighbors~~ Approximate Nearest Neighbors Dec 11, 2020

Removing changelog update

902849d

cjnolet added 5 - Ready to Merge Testing and reviews complete, ready to merge 6 - Okay to Auto-Merge and removed 4 - Waiting on Author Waiting for author to respond to review 5 - Ready to Merge Testing and reviews complete, ready to merge labels Dec 11, 2020

rapids-bot bot merged commit bd43f32 into rapidsai:branch-0.18 Dec 14, 2020
