Feature suggestion: relative cosine similarity for word2vec #2175
Comments
Thanks for the request @viplexke. I quickly looked at the article: IMO it doesn't look very useful for inclusion in gensim. Also, the formula for CC: @gojomo @piskvorky |
It's interesting that it seems to help highlight synonyms, as opposed to other kinds of related words. If it could be done as a single short method in KeyedVectors, I think it'd be a good contribution, even though it's not hard for others to implement; sometimes people only discover new techniques by browsing APIs. (Any implementation should cite the original paper.) |
@gojomo @menshikh-iv How exactly would I go about implementing this "as a single short method in KeyedVectors" (as you say @gojomo ). Excuse the question - I'm very new to Gensim. Thanks! |
Section 3.5 of the referenced paper introduced the "relative cosine similarity" measure, which is essentially a measure of how-much-more-similar to word-A that word-B is, compared to the top-N-other-most-similar-to-word-A words. Essentially, it seems they observed that when one word was much "more similar" to a target word than the next N, it was especially likely to be a true synonym. (This "better than the others" was more reliable than any absolute cutoff of cosine-similarity; see the paper for the details and full reasoning.) So given two words and a top-n value, and a set of word-vectors, this new measure can be calculated. That naturally suggests a single new method on
A pull-request that implements this method, matching the definition in the paper, with an explanatory doc-comment (with link to the paper) and some tests (which manage to somewhat confirm expected behavior along the lines of that described in the paper) would be a useful contribution. |
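As a rough illustration of the measure described above, here is a minimal sketch in plain NumPy. It is not gensim's API: the `vectors` dict, the helper names, and the toy data are all assumptions made for this example. It assumes the definition is the cosine similarity of word-A and word-B divided by the sum of the cosine similarities between word-A and its top-n most similar words; check that against Section 3.5 of the paper before relying on it.

```python
import numpy as np


def cosine(u, v):
    """Plain cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def relative_cosine_similarity(vectors, word_a, word_b, topn=10):
    """Hypothetical helper: cos(a, b) relative to a's top-n neighbours.

    `vectors` is assumed to be a dict mapping each word to a 1-D array.
    """
    target = vectors[word_a]
    # Similarities from word_a to every other word, highest first.
    sims = sorted(
        (cosine(target, vec) for word, vec in vectors.items() if word != word_a),
        reverse=True,
    )
    # Sum over the top-n neighbours; this sketch does not guard against
    # a zero or negative sum, which a real implementation should consider.
    denominator = sum(sims[:topn])
    return cosine(target, vectors[word_b]) / denominator


# Toy, made-up vectors purely for illustration.
toy_vectors = {
    "car": np.array([1.0, 0.0]),
    "automobile": np.array([0.9, 0.1]),
    "road": np.array([0.5, 0.5]),
    "banana": np.array([0.0, 1.0]),
}

rcs_synonym = relative_cosine_similarity(toy_vectors, "car", "automobile", topn=3)
rcs_related = relative_cosine_similarity(toy_vectors, "car", "road", topn=3)
print(rcs_synonym > rcs_related)
```

With real word vectors, the same computation could presumably be built on a top-n neighbour query such as `most_similar` rather than a full scan of the vocabulary.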
@gojomo Thanks so much! I got it working now. |
@rawannasser Sure! How should I send it? |
@ailsamm |
@ailsamm Can you submit your implementation as a pull-request for potential integration to the project? |
Hi, |
@jenishah go ahead please :) |
If no one is working on this, can I contribute?
feel free to contribute @rsdel2007 |
@rsdel2007 yes, please |
Hi all,
Based on this paper, do you think it is worth the effort to implement the relative cosine similarity measure?
https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf
Note that I'm not suggesting this as a potential contributor but as a grateful user.
Thank you,
Viktor