Clarification Needed: Differences Between Faiss IndexFlatIP Search Results and Cosine Similarity #3583
Replies: 3 comments
-
Faiss does not L2-normalize either the query or the database vectors. Have you done this normalization on the vectors yourself before adding them to the index and querying?
-
Yes, I have indeed applied L2 normalization to the vectors before adding them to the index and querying. Any insights into this would be greatly appreciated.
-
Hi @YoungjaeDev, as @algoriddle mentioned, you need to normalise all vectors before constructing the index and use the inner product metric. Could you provide a toy example where you reproduce the issue so this is actionable?
-
Summary
Platform
OS: Ubuntu 20
Faiss version: latest
Installed from: source build
Faiss compilation options:
Running on:
Interface:
Reproduction instructions
I am reaching out with a query regarding some inconsistencies I've encountered while using Faiss for indexing and search operations. My primary concern is the discrepancy between results obtained from a search on an IndexFlatIP index and cosine similarities (from both scipy and scikit-learn) computed on the same dataset. The values from the Faiss search show a higher matching rate. Specifically, there are occasional mismatches between the distance values (D) from cosine_similarity(np.expand_dims(face_embedding, axis=0), index_np) and those obtained from index.search. These discrepancies are not constant, but they are noticeable in certain instances.
I am curious about how Faiss handles distance calculations and whether there is any additional preprocessing applied to feature vectors post L2-normalization within Faiss. Any clarification or additional information on this matter would be immensely helpful.
Your insights and experiences could greatly assist in enhancing my understanding and in finding a resolution to these inconsistent results. Thank you in advance for your time and assistance.
Best regards,