Support dot product similarity measure and change scoring function (Backward Compatible) #161
@jmazanec15 @vamshin The current state of the PR treats indices for all supported space types as non-optimized indices. I realized that this works fine for new data, but it has a significant backward compatibility issue with old documents saved under optimized indices for l2 and cosinesimil, e.g.
To resolve that, I think one option is to treat space types that support optimized indices separately from those that do not. But before going further, I would like to hear your comments and suggestions.
Finished the separation between optimized (l2, cosinesimil) and non-optimized (negdotprod, etc.) space types for the KNN index.
…asure * master: FIX: Pass -march=x86-64 to build JNI library (opendistro-for-elasticsearch#164) FIX: Added resetState for uTs so state does not spill over (opendistro-for-elasticsearch#159)
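A failed test case scenario:
Root cause: Lucene does not allow negative scores in the score collector:
We need to change the following scoring formula for negdotprod: result -> 1/(1 + result.getScore()). It returns a negative score whenever result.getScore() is less than -1, which can happen when the distance is a negative dot product.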
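To make the failure concrete, here is a minimal sketch in plain Java (no Lucene or plugin classes). The class name ScoreSketch and the method names naiveScore and monotoneScore are hypothetical and exist only for illustration. The piecewise mapping is one possible way to keep scores positive while preserving the ranking; it is an assumption, not necessarily the function agreed on in #171.

```java
// Hypothetical illustration; none of these names come from the plugin.
class ScoreSketch {

    // The formula quoted above: 1 / (1 + distance).
    // For a negative dot product, distance can be below -1, so the result
    // can be negative (and is undefined at distance == -1).
    static float naiveScore(float distance) {
        return 1.0f / (1.0f + distance);
    }

    // Assumed alternative: a piecewise mapping that is strictly decreasing
    // in distance, continuous at 0, and always positive.
    //   distance >= 0  ->  1 / (1 + distance)   (values in (0, 1])
    //   distance <  0  ->  1 - distance         (values in (1, +inf))
    static float monotoneScore(float distance) {
        return distance >= 0 ? 1.0f / (1.0f + distance) : 1.0f - distance;
    }

    public static void main(String[] args) {
        float[] distances = {-3.0f, -0.5f, 0.0f, 2.0f};
        for (float d : distances) {
            System.out.printf("distance=%5.1f naive=%7.3f monotone=%7.3f%n",
                    d, naiveScore(d), monotoneScore(d));
        }
    }
}
```

A smaller (more negative) distance means a larger dot product, so it should rank higher; the piecewise form keeps that ordering, whereas the naive form assigns a negative score to a distance of -3.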
…dot-product-similarity-measure * enh/171-better-scoring-function: change function and test case Add release notes automation (opendistro-for-elasticsearch#168) FIX: fix versioning for lib artifacts (opendistro-for-elasticsearch#166)
Modified the scoring function according to the discussion in #171.
Hi @chenqi0805, thanks for the awesome work. We will get back to this PR soon.
Hey @chenqi0805, did you get a chance to investigate why they don't support an optimized index for negative dot product in NMSLIB?
@jmazanec15 Not yet. I will first check the issues in nmslib and then post a question in the nmslib lobby.
Cool, saw your comment @chenqi0805. We could look to contribute if possible.
nmslib now supports an optimized version of negative dot product, starting with version 2.0.8. At this point we will close this PR and come up with a new PR that reads the optimized version of negative dot product. Thanks for the hard work and contribution, @chenqi0805.
I would like to ask if there is any progress on the new PR?
Issue #, if available:
#114 #171
Description of changes:
This PR supports the new space type negdotprod. The main issue in supporting the new space type is that nmslib only supports optimized indices for l2 and cosinesimil; for all other space types, the object vector data has to be saved to and loaded from a .dat file explicitly. See https://github.com/nmslib/nmslib/blob/32e6e69a574b678da0d37bece8dbe6b1b250b660/python_bindings/README.md#saving-indexes-and-data
This PR deals with those new .dat files. Changes include
Note: This PR should be reviewed after #160 since it is based on the latest nmslib version.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.