Is it possible to retrieve from only a subset of the corpus? I would like to index the whole corpus but then retrieve from only a subset of it. For example, I have a corpus of 1M sentences and I index all of it to get good word frequencies for the BM25 scores. Then I would like to retrieve with k=1, but only from the first 1k sentences. Is this possible?
Replies: 1 comment
You can use the weight_mask parameter in retrieve: pass an array with 1 for the docs you want to retrieve and 0 for the others.

Here's an example: bm25s/tests/core/test_retrieve.py, lines 71 to 105 in 8b5ff10