Feature suggestion: relative cosine similarity for word2vec #2175

viplexke · 2018-09-07T15:28:01Z

Hi all,
Based on this paper, do you think it is worth the effort to implement a relative cosine similarity measure?
https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf

Note that I'm not suggesting this as a potential contributor but as a grateful user.
Thank you,
Viktor

menshikh-iv · 2018-09-10T03:47:20Z

Thanks for the request, @viplexke. I took a quick look at the article: IMO it doesn't look useful enough to include in gensim. Also, the formula for relative cosine similarity looks pretty simple (i.e. anyone who needs it can implement it themselves).

CC: @gojomo @piskvorky

gojomo · 2018-09-17T19:37:53Z

It's interesting that it seems to help highlight synonyms, as opposed to other kinds of related words. If it could be done as a single short method in KeyedVectors, I think it'd be a good contribution. Even though it's not hard for others to implement, sometimes people only discover new techniques by browsing APIs. (Any implementation should cite the original paper.)

ailsamm · 2018-10-31T13:33:54Z

@gojomo @menshikh-iv How exactly would I go about implementing this "as a single short method in KeyedVectors" (as you say, @gojomo)? Excuse the question, I'm very new to Gensim. Thanks!

gojomo · 2018-10-31T21:26:07Z

Section 3.5 of the referenced paper introduced the "relative cosine similarity" measure, which is essentially a measure of how-much-more-similar to word-A that word-B is, compared to the top-N-other-most-similar-to-word-A words. Essentially, it seems they observed that when one word was much "more similar" to a target word than the next N, it was especially likely to be a true synonym. (This "better than the others" was more reliable than any absolute cutoff of cosine-similarity; see the paper for the details and full reasoning.)
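
For reference, the description above can be written as a formula. This is a sketch based on my reading of the paper's definition rather than a quotation of it; the symbols $w_a$, $w_b$, and $\mathrm{TOP}_n(w_a)$ (the $n$ words most cosine-similar to $w_a$) are notation chosen here, so check them against Section 3.5 before relying on them.

```latex
\mathrm{rcs}_n(w_a, w_b) =
  \frac{\cos(w_a, w_b)}
       {\sum_{w_c \in \mathrm{TOP}_n(w_a)} \cos(w_a, w_c)}
```

Under this reading, the values for the $n$ nearest neighbours of $w_a$ sum to 1, so a word scoring well above $1/n$ stands out from the rest of the neighbourhood.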

So given two words and a top-n value, and a set of word-vectors, this new measure can be calculated. That naturally suggests a single new method on KeyedVectors with a signature like:

def relative_cosine_similarity(word_a, word_b, topn=10):
    ...

A pull-request that implements this method, matching the definition in the paper, with an explanatory doc-comment (with link to the paper) and some tests (which manage to somewhat confirm expected behavior along the lines of that described in the paper) would be a useful contribution.
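
To make the idea concrete, here is a minimal sketch of the measure in plain Python. It is not the gensim implementation: it is a standalone function over a hypothetical `vectors` dict (word to list of floats), and the toy data is made up for illustration. It assumes the measure is the cosine similarity of the two words divided by the sum of cosine similarities between `word_a` and its `topn` nearest neighbours. A real KeyedVectors method would presumably build on the existing `most_similar` and `similarity` methods instead of the brute-force scan used here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relative_cosine_similarity(vectors, word_a, word_b, topn=10):
    """Similarity of word_b to word_a, relative to word_a's topn neighbours.

    `vectors` is a hypothetical stand-in for a set of word-vectors:
    a dict mapping each word to a list of floats.
    """
    # Similarities from word_a to every other word, highest first.
    neighbour_sims = sorted(
        (cosine(vectors[word_a], vec)
         for word, vec in vectors.items() if word != word_a),
        reverse=True,
    )
    # Denominator: total similarity of the topn nearest neighbours.
    # (Assumed to be non-zero; a robust version should guard against that.)
    denominator = sum(neighbour_sims[:topn])
    return cosine(vectors[word_a], vectors[word_b]) / denominator


# Made-up 2-d vectors, purely for illustration.
toy_vectors = {
    "good": [1.0, 0.0],
    "great": [0.9, 0.1],
    "fine": [0.6, 0.4],
    "bad": [0.1, 0.9],
}

rcs_great = relative_cosine_similarity(toy_vectors, "good", "great", topn=3)
rcs_fine = relative_cosine_similarity(toy_vectors, "good", "fine", topn=3)
rcs_bad = relative_cosine_similarity(toy_vectors, "good", "bad", topn=3)
```

With `topn=3` the three neighbours of "good" are exactly the other three words, so their scores sum to 1, and "great" takes a much larger share than "bad". That "share of the neighbourhood" is the signal the tests in a pull-request could check for.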

ailsamm · 2018-11-01T20:04:58Z

@gojomo Thanks so much! I got it working now.

rawannasser · 2018-11-12T15:13:12Z

Hi @ailsamm, I need to implement the same measure. Could you please share your code, since you have it working?
I really need it :(

ailsamm · 2018-11-13T09:19:07Z

@rawannasser Sure! How should I send it?

rawannasser · 2018-11-13T17:02:54Z

@ailsamm
Thanks so much!
I don't know if I can write my email here, but maybe you can upload it to your GitHub?

gojomo · 2018-11-15T22:07:13Z

@ailsamm Can you submit your implementation as a pull-request for potential integration to the project?

jenishah · 2018-12-05T04:42:51Z

Hi,
Is anyone working on this?
If not, I would like to take this up.

piskvorky · 2018-12-05T08:59:25Z

@jenishah go ahead please :)

rsdel2007 · 2018-12-18T11:30:09Z

If no one is working on this, can I contribute?

menshikh-iv · 2018-12-19T04:30:06Z

feel free to contribute @rsdel2007

rawannasser · 2018-12-19T04:32:38Z

@rsdel2007 yes, please

menshikh-iv added the "feature" label (Issue described a new feature) Sep 10, 2018

menshikh-iv closed this as completed Sep 10, 2018

menshikh-iv reopened this Sep 10, 2018

menshikh-iv added the "difficulty easy" label (Easy issue: required small fix) Sep 18, 2018

rsdel2007 mentioned this issue Dec 23, 2018

Added Function relative_cosine_similarity in keyedvectors.py #2307

Merged

menshikh-iv closed this as completed in #2307 Jan 15, 2019
