Speed BM25 when increase Top K retrieval result #1228

hieudx149 · 2022-06-29T02:12:35Z

Hi,
I ran some experiments: when I increase the number of results returned by BM25, search slows down. I find this confusing because, as far as I know, BM25 already searches the entire corpus, so returning 10, 20, or 10000 results should have no effect on speed.
Can you explain why the number of results returned affects search speed? Or did I miss something?

Answered by lintool

lintool · 2022-06-29T11:05:49Z

lintool
Jun 29, 2022
Maintainer

Yes, it is well known that query latency increases as k increases in top-k retrieval. Here's a relatively recent survey that provides lots of details: https://www.nowpublishers.com/article/Details/INR-057

The insight is that the top-k docs are kept in a heap during query evaluation: the larger the k, the more "work" the algorithm needs to do. Modern algorithms are fast because they are able to (heuristically) answer this question: can this document possibly be in the top k? If not, it can be skipped.
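
To make the heap argument concrete, here is a minimal toy sketch in Python. It is not how any particular search library implements BM25: the function names `top_k` and `make_docs`, the synthetic scores, and the per-document upper bound are all assumptions made for illustration. Each document carries a cheap upper bound on its score, and the exact score is only "computed" when the bound beats the current k-th best score at the top of the min-heap. A larger k means a lower threshold, so fewer documents can be skipped, and each heap update also costs O(log k).

```python
import heapq
import random


def top_k(docs, k):
    """Return (results sorted best-first, number of documents fully scored).

    docs: iterable of (doc_id, upper_bound, exact_score) tuples with
    exact_score <= upper_bound. This is hypothetical toy data, not a real index.
    """
    heap = []    # min-heap of (score, doc_id); heap[0] is the current k-th best
    scored = 0   # counts how often the "expensive" exact scoring step runs
    for doc_id, upper_bound, exact_score in docs:
        # Pruning test: once the heap is full, a document whose upper bound
        # does not exceed the k-th best score cannot enter the top k.
        if len(heap) == k and upper_bound <= heap[0][0]:
            continue
        scored += 1
        if len(heap) < k:
            heapq.heappush(heap, (exact_score, doc_id))
        elif exact_score > heap[0][0]:
            heapq.heapreplace(heap, (exact_score, doc_id))
    return sorted(heap, reverse=True), scored


def make_docs(n, seed=0):
    """Generate n synthetic documents with random bounds and scores."""
    rng = random.Random(seed)
    docs = []
    for doc_id in range(n):
        upper_bound = rng.random()
        exact_score = upper_bound * rng.random()  # never above the bound
        docs.append((doc_id, upper_bound, exact_score))
    return docs


if __name__ == "__main__":
    docs = make_docs(100_000)
    for k in (10, 100, 1000, 10000):
        _, scored = top_k(docs, k)
        print(f"k={k:>5}: fully scored {scored} of {len(docs)} documents")
```

Running the script prints how many documents had to be fully scored for each k. In this toy setup the count cannot decrease as k grows, because any document skipped for a larger k would also be skipped for a smaller one.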

hieudx149 · 2022-06-29T14:31:51Z

hieudx149
Jun 29, 2022
Author

Hi @lintool,
Thanks for your quick and detailed answer!
