Replies: 1 comment
In HNSW, each node is linked to a limited number of nearby nodes in a layered graph, and a search follows those links greedily from an entry point. Even though index 4 may be connected to index 10, the traversal can miss that connection because the search only explores the part of the graph it actually reaches. This happens especially in larger graphs with high dimensionality (2048 is considered high-dimensional), where shortcuts between clusters or dense regions can be missed. Increasing k helps, but it doesn’t guarantee that all expected results will be found. I suggest increasing ... I hope this helps!
Summary
Platform
OS: Linux
Faiss version:
Installed from:
Faiss compilation options:
Running on:
Interface:
Reproduction instructions
I'm conducting an experiment using FAISS to compare the performance of pre-filtering and post-filtering with the HNSW index. However, I'm encountering unexpected results during post-filtering.
Experiment Details:
Input Size: 105,100
Dimension: 2048
Filtering Condition: 22,069 items satisfy this condition.
Expected Output: 200 specific indexes should be returned after searching.
Search Process: I start with k=200 and double it if I don't retrieve 200 results, continuing until k reaches half the input size (52,550).
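The search process described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used in the experiment: `post_filter_search` and `search_fn` are hypothetical names, and `search_fn(k)` stands in for whatever call returns the top-k ids from the index, ordered by increasing distance.

```python
from typing import Callable, List, Sequence, Set

def post_filter_search(
    search_fn: Callable[[int], Sequence[int]],
    allowed: Set[int],
    wanted: int,
    k_start: int,
    k_max: int,
) -> List[int]:
    """Double k until `wanted` ids pass the filter or k reaches k_max."""
    k = k_start
    while True:
        candidates = search_fn(k)  # ids ordered by increasing distance
        # Post-filter: keep only ids that satisfy the condition, in rank order
        kept = [i for i in candidates if i in allowed][:wanted]
        if len(kept) >= wanted or k >= k_max:
            return kept
        k = min(2 * k, k_max)

# Toy stand-in for an index: ids 0..999 ranked by id; even ids pass the filter
ranked = list(range(1000))
even_ids = {i for i in ranked if i % 2 == 0}
result = post_filter_search(lambda k: ranked[:k], even_ids, 5, 4, 500)
print(result)  # [0, 2, 4, 6, 8]
```

With the numbers from the experiment, the call would use `wanted=200`, `k_start=200`, and `k_max=52550`. Placeholder ids such as `-1`, which Faiss uses for unfilled result slots, are dropped by the membership test as long as they are not in `allowed`.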
Issue:
The accuracy, defined as the size of the intersection of the expected and actual results divided by the number of expected results, is only 0.1. This low accuracy persists across different tests. The search function does not consistently return the desired indexes, even though they are connected in the graph structure. For instance, if index 4 is an answer and is connected to index 10, the search sometimes fails to return index 4, regardless of how much I increase k.
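For reference, the accuracy metric as defined here can be written as a small function. The name `accuracy` and the sample ids are illustrative assumptions.

```python
def accuracy(expected, actual):
    """Size of the intersection of expected and actual ids, over the number expected."""
    expected, actual = set(expected), set(actual)
    if not expected:
        raise ValueError("expected must be non-empty")
    return len(expected & actual) / len(expected)

# 2 of the 4 expected ids (10 and 23) appear in the actual results
print(accuracy([4, 10, 17, 23], [10, 99, 23, 5]))  # 0.5
```

An accuracy of 0.1 with 200 expected ids therefore means that only 20 of them were returned.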
Question:
Why might the search function be failing to return the expected results, even when the nodes are connected in the graph? Any advice on how to improve the accuracy of my experiment would be appreciated.