IVF-PQ tutorial notebook #1544

achirkin · 2023-05-23T07:47:55Z

Add a new folder, notebooks, with a notebook that shows an example of tweaking IVF-PQ using a dataset from ann-benchmarks.com.

tfeher · 2023-07-18T07:26:52Z

Let's dust this off for the coming release. I think the only thing missing is the last section about the parameters for index building. Since we will have an IVF-Flat notebook that discusses clustering parameters (n-clusters, trainset ratio), we can focus here on additional parameters for the product quantization.
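As a rough illustration of why the product quantization parameters deserve their own section, the sketch below estimates the size of one encoded vector. It assumes the two parameters in question are pq_dim and pq_bits (the names used by pylibraft's IVF-PQ index parameters, not stated in this thread) and ignores any padding or per-list metadata the real index may add; the helper names are hypothetical.

```python
def pq_code_bytes(pq_dim, pq_bits):
    """Bytes for one PQ-encoded vector: pq_dim codes of pq_bits bits each.

    Simplified estimate; a real index may pad or interleave the codes.
    """
    return (pq_dim * pq_bits + 7) // 8


def raw_bytes(dim, bytes_per_element=4):
    """Bytes for one uncompressed vector (float32 by default)."""
    return dim * bytes_per_element


dim = 128  # illustrative dimensionality, not taken from the notebook
for pq_dim, pq_bits in [(128, 8), (64, 8), (64, 4)]:
    code = pq_code_bytes(pq_dim, pq_bits)
    ratio = raw_bytes(dim) / code
    print(f"pq_dim={pq_dim} pq_bits={pq_bits}: {code} bytes/vector, {ratio:.0f}x smaller")
```

Halving either parameter halves the stored code size, at the cost of a coarser approximation of each vector.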

tfeher

Thanks Artem for the notebook! This looks great, a really detailed discussion of IVF-PQ search and training. While this content is great for the deep dive blog that we are planning, I am wondering whether this is the right level of detail for the example notebook. I feel that there would be value in making it more concise. Let's wait and see what the team says.

tfeher · 2023-07-20T13:14:54Z

notebooks/tutorial_ivf_pq.ipynb

+    }
+   ],
+   "source": [
+    "bench_k = np.exp2(np.arange(10)).astype(np.int32)\n",


I am not sure we need a specific benchmark section on k here. It is good to point out in the text that the number of neighbors is controlled by k, but actually benchmarking that could be postponed to the blog.
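For reference, the bench_k line quoted in the hunk above evaluates to the powers of two from 1 to 512. A minimal pure-Python sketch of the same sweep, assuming nothing beyond that one quoted line (the notebook itself uses NumPy):

```python
# Pure-Python equivalent of np.exp2(np.arange(10)).astype(np.int32):
# ten values of k, doubling each time.
bench_k = [2 ** i for i in range(10)]
print(bench_k)  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
```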

review-notebook-app · 2023-07-20T21:40:03Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

cjnolet · 2023-07-21T23:15:24Z

notebooks/tutorial_ivf_pq.ipynb

@@ -0,0 +1,1226 @@
+{


Most notebooks that we publish for RAPIDS have longer descriptions before the first cell. I think this is a great lead-in sentence but can you please provide more details about what the notebook is going to do? Can you move the links and descriptions for the datasets up here as well so it's more immediately obvious where they came from and that the notebook is going to use them?

cjnolet · 2023-07-21T23:15:24Z

notebooks/tutorial_ivf_pq.ipynb

@@ -0,0 +1,1226 @@
+{


Line #2. def show_properties(obj):
This is really nice. I almost wonder if this should eventually be added to pylibraft itself (with the corresponding __str__() methods overloaded to print it).
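The suggestion above could look roughly like the following. This is a hypothetical sketch, not the notebook's actual show_properties and not pylibraft code: DemoParams is a made-up stand-in class, and the helper only detects plain Python properties (compiled extension classes may expose their attributes through a different descriptor type).

```python
def show_properties(obj):
    """Collect every @property defined on obj's class into a dict."""
    cls = type(obj)
    return {
        name: getattr(obj, name)
        for name in dir(cls)
        if isinstance(getattr(cls, name), property)
    }


class DemoParams:
    """Hypothetical parameter object; not a pylibraft class."""

    def __init__(self, n_lists, pq_bits):
        self._n_lists = n_lists
        self._pq_bits = pq_bits

    @property
    def n_lists(self):
        return self._n_lists

    @property
    def pq_bits(self):
        return self._pq_bits

    def __str__(self):
        # Reuse the helper so print(obj) lists all properties.
        props = ", ".join(f"{k}={v}" for k, v in show_properties(self).items())
        return f"{type(self).__name__}({props})"


params = DemoParams(n_lists=1024, pq_bits=8)
print(params)  # DemoParams(n_lists=1024, pq_bits=8)
```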

cjnolet · 2023-07-21T23:15:26Z

notebooks/tutorial_ivf_pq.ipynb

@@ -0,0 +1,1226 @@
+{


"This is useful when in conjugation with add_data_on_build = True"
I think this should read "This can also be useful when used in conjunction with...", or I would suggest just removing it altogether. The add_data_on_build is surely a convenience, but just because it's set to false doesn't mean you can assume anything more about how many vectors are ultimately going to be added to the index.

This one is a funny typo :)
Regarding the usefulness: I meant to say that if add_data_on_build = False, then kmeans_trainset_fraction doesn't make much sense: a user can just pass a smaller dataset to the same effect.
I've added a small note to clarify this.
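The equivalence described in this reply can be sketched in plain Python. The subsample helper and its strided selection are illustrative assumptions, not how pylibraft actually picks the training set; the point is only that, when no data is added at build time, training on a fraction of a dataset gives the same training set as being handed the smaller dataset directly.

```python
def subsample(dataset, fraction):
    """Hypothetical helper: keep roughly `fraction` of the rows via a fixed stride."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    stride = max(1, round(1.0 / fraction))
    return dataset[::stride]


# Toy dataset: 1000 two-dimensional vectors.
dataset = [[float(i), float(i) + 0.5] for i in range(1000)]

# Option A: full dataset with a trainset fraction of 0.1 and nothing added on build.
trainset_a = subsample(dataset, 0.1)

# Option B: the user shrinks the dataset first and trains on all of it.
smaller_dataset = dataset[::10]
trainset_b = subsample(smaller_dataset, 1.0)

print(len(trainset_a), trainset_a == trainset_b)
```

With add_data_on_build = True the two options differ, because option A would still add all 1000 vectors to the index after training.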

cjnolet · 2023-07-21T23:16:49Z

@achirkin ReviewNB didn't allow me to submit a message w/ the review. It's really a delight reading through the tutorial. It's very in-depth and I think it's going to be a great resource for users. I was able to make my way through most of it, but not all of it. I'm going to try and read through the rest next week.

Co-authored-by: Tamas Bela Feher <tfeher@nvidia.com>

achirkin · 2023-07-25T17:04:32Z

@cjnolet, @tfeher many thanks for the reviews! While I still plan to make some minor changes tomorrow, I think it's ready for another round.

cjnolet · 2023-07-27T19:56:35Z

@achirkin minor feedback from me at this point. End-to-end, this really is a great tutorial and I'm actually thinking we might want to consider keeping this as it is even after the blog is published (and maybe providing an additional notebook with more concise and simple usage examples for the blog?)

I think developers and users alike can benefit from this notebook as it gives a pretty thorough but readable overview of IVF-PQ. Thanks again for creating this!

cjnolet

Thanks Artem! I think this notebook is good to go for 23.08 and we can revisit some portions for 23.10. Overall I think it's a great deep dive into ivf pq.

achirkin added 4 commits May 22, 2023 15:03

IVF-PQ: an intro tutorial notebook

c4cea44

Add refinement section

29b88b0

Add n_probes benchmarks

5c28c04

search params tweaking continued

f35a6a6

achirkin added the feature request (New feature or request), non-breaking (Non-breaking change), and 2 - In Progress (Currently a work in progress) labels May 23, 2023

github-actions bot added ci CMake cpp python labels May 23, 2023

achirkin changed the base branch from branch-23.06 to branch-23.08 May 23, 2023 07:48

achirkin removed cpp CMake ci labels May 23, 2023

achirkin added 2 commits May 23, 2023 09:52

fix codespell

15f6e7a

Small text updates

b356894

achirkin self-assigned this May 30, 2023

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

a61a764

github-actions bot removed the python label May 30, 2023

cjnolet and others added 2 commits June 6, 2023 16:42

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

3b949ed

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

5f7f2ba

tfeher added the Vector Search label Jul 18, 2023

achirkin and others added 5 commits July 19, 2023 11:44

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

4c028a4

Updated the search params section and added indexing params section (mostly n_lists)

c312ff6

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

8b9ae8b

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

17c71eb

Added PQ section and other minor corrections

f16cf44

achirkin marked this pull request as ready for review July 20, 2023 12:07

achirkin added the 3 - Ready for Review label and removed the 2 - In Progress (Currently a work in progress) label Jul 20, 2023

achirkin requested a review from tfeher July 20, 2023 12:08

tfeher reviewed Jul 20, 2023

cjnolet reviewed Jul 21, 2023

achirkin and others added 4 commits July 24, 2023 16:43

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

ebd4635

Update notebooks/tutorial_ivf_pq.ipynb

922a162

Co-authored-by: Tamas Bela Feher <tfeher@nvidia.com>

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

9fffbf3

Address review comments

307263c

achirkin added 4 commits July 26, 2023 10:14

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

688c963

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

da179fe

Run the notebook in production env

f9773c1

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

95fa780

cjnolet reviewed Jul 27, 2023

achirkin added 4 commits July 28, 2023 09:43

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

6d943a3

Address a few more review comments

6a91f44

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

b3a2501

Merge branch 'branch-23.08' into fea-ivf-pq-notebook

b22a476

cjnolet approved these changes Jul 31, 2023

raydouglass merged commit c957037 into rapidsai:branch-23.08 Jul 31, 2023
