Hi! I am using Gensim to compute the NPMI coherence of each of my topics. I called get_coherence_per_topic() and also get_coherence() (in the latter case passing a list containing a single topic), and I noticed that the per-topic coherences do not match the values that get_coherence() returns for the corresponding single topics. As I understand it, the NPMI of a topic should not depend on the number of topics or on which other topics are passed in.
This also happens with the other c_* coherence measures, but not with the UMass version.
Thank you!
Steps/code/corpus to reproduce
from gensim.test.utils import common_texts, common_dictionary
from gensim.models.ldamodel import LdaModel
from gensim.models.coherencemodel import CoherenceModel

topics = [
    ['human', 'computer', 'system', 'interface'],
    ['graph', 'minors', 'trees', 'eps']
]

# Coherence of both topics, computed together
cm = CoherenceModel(topics=topics, texts=common_texts, coherence='c_npmi',
                    dictionary=common_dictionary)
coherence = cm.get_coherence_per_topic()
print(coherence)  # got [0.23583958321789514, -0.24456941091456053]

# Coherence of the first topic on its own
cm_topic0 = CoherenceModel(topics=[topics[0]], texts=common_texts,
                           coherence='c_npmi', dictionary=common_dictionary)
coherence_topic0 = cm_topic0.get_coherence()
print(coherence_topic0)  # expect this to be == coherence[0] but got -0.14624062517782566

# Coherence of the second topic on its own
cm_topic1 = CoherenceModel(topics=[topics[1]], texts=common_texts,
                           coherence='c_npmi', dictionary=common_dictionary)
coherence_topic1 = cm_topic1.get_coherence()
print(coherence_topic1)  # expect this to be == coherence[1] but got -0.31633310918174923
Versions
Linux-5.4.104+-x86_64-with-Ubuntu-18.04-bionic
Python 3.7.10 (default, May 3 2021, 02:48:31)
[GCC 7.5.0]
Bits 64
NumPy 1.19.5
SciPy 1.4.1
gensim 3.8.3
FAST_VERSION 1