Question about clustering binary data. #3573
Replies: 4 comments
-
Binary data is converted to float before clustering.
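To make the conversion concrete, here is a minimal sketch in plain Python. It assumes each bit is mapped to -1.0 or +1.0, least significant bit first; the exact mapping Faiss uses should be checked in its source. The helper names (`unpack_bits`, `hamming`, `squared_l2`) are hypothetical and are not Faiss APIs. Under this encoding, the squared Euclidean distance between two converted vectors is exactly 4 times their Hamming distance, so both metrics rank binary vectors identically.

```python
def unpack_bits(code):
    """Expand each byte into 8 floats, one per bit (LSB first).

    Assumption: bit 0 -> -1.0 and bit 1 -> +1.0.
    """
    return [2.0 * ((byte >> j) & 1) - 1.0 for byte in code for j in range(8)]


def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def squared_l2(u, v):
    """Squared Euclidean distance between two float vectors."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


# Two 16-bit binary vectors stored as 2 bytes each.
a = bytes([0b10110010, 0b00001111])
b = bytes([0b10010011, 0b11110000])

h = hamming(a, b)
d2 = squared_l2(unpack_bits(a), unpack_bits(b))

# Each differing bit contributes (1 - (-1))**2 = 4 to the squared distance.
print(h, d2, d2 == 4 * h)
```

The identity only holds between two converted binary vectors. Once float centroids (averages of several vectors) are involved, their coordinates are no longer restricted to -1 and +1, so the distance to a centroid is a genuine Euclidean distance.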
-
Thanks for the reply.
-
Hi, sorry to bump, but I would also like to know more about this question. In IndexBinaryIVF I see that there is a binary-to-float conversion and then an IndexFlatL2 is used: faiss/faiss/IndexBinaryIVF.cpp, lines 251 to 269 in 33c0ba5. Does this mean that the IndexBinaryIVF index has its clusters created based on the Euclidean distance between the binary vectors, and not the Hamming distance between them?
-
During the index training stage, the binary vectors are converted to float vectors and the centroids are float vectors, so Euclidean distance is used. When the training stage is finished, the centroids are converted back to binary vectors and the assignment stage begins. Assignment is based on the Hamming distance between binary vectors.
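The two stages can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Faiss implementation: a basic Lloyd's k-means stands in for Faiss's clustering, bits are assumed to map to -1.0/+1.0, and centroids are assumed to be binarized by thresholding at 0. All function names here are hypothetical.

```python
import random


def unpack_bits(code):
    # Assumption: bit 0 -> -1.0, bit 1 -> +1.0, least significant bit first.
    return [2.0 * ((byte >> j) & 1) - 1.0 for byte in code for j in range(8)]


def pack_bits(vec):
    # Assumption: a coordinate becomes bit 1 when it is strictly positive.
    out = bytearray()
    for i in range(0, len(vec), 8):
        byte = 0
        for j, value in enumerate(vec[i:i + 8]):
            if value > 0:
                byte |= 1 << j
        out.append(byte)
    return bytes(out)


def hamming(a, b):
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def squared_l2(u, v):
    return sum((p - q) ** 2 for p, q in zip(u, v))


def kmeans(points, k, n_iter=10, seed=0):
    """Basic Lloyd's k-means on float vectors with Euclidean distance."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_l2(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def train_and_assign(codes, k):
    # Training stage: float vectors, float centroids, Euclidean distance.
    float_centroids = kmeans([unpack_bits(c) for c in codes], k)
    # After training: centroids are converted back to binary.
    binary_centroids = [pack_bits(c) for c in float_centroids]
    # Assignment stage: Hamming distance to the binary centroids.
    assignments = [
        min(range(k), key=lambda c: hamming(code, binary_centroids[c]))
        for code in codes
    ]
    return binary_centroids, assignments


codes = [bytes([0x00, 0x00]), bytes([0x01, 0x00]), bytes([0x00, 0x80]),
         bytes([0xFF, 0xFF]), bytes([0xFE, 0xFF]), bytes([0xFF, 0x7F])]
binary_centroids, assignments = train_and_assign(codes, k=2)
print(binary_centroids, assignments)
```

Because the centroids are averaged in float space and only binarized at the end, the final binary centroids are not necessarily the ones a Hamming-based clustering would produce.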
-
From my understanding, training IndexBinaryIVF already needs to cluster the data to obtain some centroids.
Then why doesn't faiss.Kmeans support binary data?
Thanks