Support range queries (neighbors within some distance) #380
Replies: 7 comments
-
Yes, and range queries would also make it possible to implement external algorithms such as DBSCAN (and similar ones), which require them: the query distance is essentially DBSCAN's epsilon parameter.
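For concreteness, the semantics being asked for could be sketched as a brute-force scan. This is only an illustration in plain Python, not Elastiknn code: the function name `range_query`, the Euclidean metric, and the toy points are all assumptions made for the example, and `eps` plays the role of DBSCAN's epsilon.

```python
import math

def range_query(points, query, eps):
    """Return the indices of all points within distance eps of query.

    Brute force: every point is compared against the query, so the cost
    is linear in the number of points.
    """
    return [i for i, p in enumerate(points) if math.dist(p, query) <= eps]

# Hypothetical toy data, only to show the call shape.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
neighbors = range_query(points, (0.0, 0.0), eps=2.0)
```

Unlike a k-nearest-neighbors query, the number of results is not known in advance; it depends entirely on how many points fall inside the radius.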
-
Just thinking out loud: I believe that for the dense vector case we could build a structure such as a kd-tree at insertion time.
-
Hi @priamai, thanks for your input.
Where exactly would the kd-tree reside? Everything we do in Elastiknn must be stored in Lucene indexes. That's what makes this problem pretty tough.
-
Yes, what about storing the kd-tree as a nested document in a hidden index? The tree would reference the points by their document IDs, perhaps?
-
I'm not really sure how that would work. I would need the idea spelled out a bit more. Like, what exactly happens when a new vector is indexed? What exactly happens when we do a range query? Another idea I've had is this: given a vector and a distance, just internally re-run the standard nearest-neighbors query several times with incrementally larger values of k. It's pretty wasteful, but would at least satisfy the API.
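A minimal sketch of that second idea, in plain Python rather than against Elastiknn itself. Everything here is an assumption made for illustration: `knn` is a brute-force stand-in for the standard nearest-neighbors query, the distance is Euclidean, and the stopping rule (stop once the farthest hit lies outside the radius, or once the index is exhausted) and the doubling schedule for k are one possible choice, not something specified in this thread.

```python
import math

def knn(points, query, k):
    """Stand-in for the standard k-nearest-neighbors query (brute force).

    Returns up to k (distance, index) pairs sorted by ascending distance.
    """
    scored = sorted((math.dist(p, query), i) for i, p in enumerate(points))
    return scored[:k]

def range_query_via_knn(points, query, radius, k_start=2, growth=2):
    """Emulate a range query by re-running knn with increasing k.

    Assumes k_start >= 1 and growth >= 2 so that k keeps growing.
    """
    k = k_start
    while True:
        hits = knn(points, query, k)
        # Stop when fewer than k hits came back (nothing left to fetch)
        # or when the farthest hit is already outside the radius. In both
        # cases every in-range point is among the hits.
        if len(hits) < k or hits[-1][0] > radius:
            return [i for d, i in hits if d <= radius]
        k *= growth
```

Each round repeats the work of the previous one, which is the wastefulness mentioned above. The stopping rule is exact only when the underlying nearest-neighbors query is exact; with an approximate query the result would be approximate as well.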
-
Yes, I will mock up some code in Python to get an idea. And yes, I like the second approach: it will not be efficient, but at least we will have an API whose implementation we can eventually replace with a faster approach.
-
I don't plan to implement this. Will happily review if someone else takes a pass at it.
-
At least one of the datasets in the big-ann-benchmarks challenge will require support for range queries. This will take some rethinking of the existing query model and should be an interesting problem to solve.