NGT performance #29
Comments
It seems ONNG can be enabled in ngtpy, but it is currently not documented. However, there is an example here: yahoojapan/NGT#30
New NGT release 1.7.10 should fix this: https://github.com/yahoojapan/NGT/releases/tag/v1.7.10
NGT 1.8.0 brought docs for ONNG. It is already activated here, but index building is extremely slow because the parameters are difficult to tune. Need to check.
Hi, I am using BERT to find semantic similarity with cosine distance, but this may lead to high-dimensionality problems. Thank you!
Thanks for your interest. That's something I've been thinking about, but never found time to actually check. BERT embeddings are typically high-dimensional, so hubness might play a role.
Thank you so much for the reply. If so, could we use this? I mean, whenever we generate embeddings, we could check their intrinsic dimension: if it is low, there are fewer constraints, so further fine-tuning should be easier, right? I would love to know your thoughts!
18 isn't particularly high, but we've seen datasets where this came with high hubness (see e.g. p. 2885/6 of this previous paper).
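For anyone who wants to check whether hubness matters for their own embeddings, here is a minimal sketch of one common measure: the skewness of the k-occurrence distribution (how often each point shows up in the k-nearest-neighbor lists of the others) under cosine distance. It is pure Python and brute force, so it only suits small samples. The function names (`cosine_distance`, `k_occurrence`, `skewness`) are made up for this illustration and are not part of this project's API, and the random Gaussian vectors are only a stand-in for real BERT embeddings.

```python
import math
import random


def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b) between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def k_occurrence(vectors, k):
    """Count how often each point appears in the k-NN lists of all other points."""
    n = len(vectors)
    counts = [0] * n
    for i in range(n):
        # Brute-force distances from point i to every other point.
        dists = [(cosine_distance(vectors[i], vectors[j]), j)
                 for j in range(n) if j != i]
        dists.sort()
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Population skewness (third standardized moment); 0.0 if variance is zero."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


random.seed(0)
# Assumption: toy data in place of real embeddings (200 points, 64 dimensions).
vectors = [[random.gauss(0.0, 1.0) for _ in range(64)] for _ in range(200)]
counts = k_occurrence(vectors, k=5)
hubness = skewness(counts)
print(f"k-occurrence skewness (k=5): {hubness:.3f}")
```

A clearly positive skewness means a few points act as hubs that appear in many neighbor lists. To try this on real data, replace `vectors` with a sample of your embeddings.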
Approx. neighbor search with ngtpy can be accelerated: