gensim models show_topic/print_topic parameter num_words changed to topn to match other topic models #1200

prakhar2b · 2017-03-09T15:37:38Z

show_topic parameter num_words changed to topn in order to make it consistent with LdaModel. Fix #1198

prakhar2b · 2017-03-09T18:23:45Z

@tmylk To standardize the API and make the topic models consistent with LdaModel, the following parameters should be used, as I understand it:

show_topics / print_topics - num_words
show_topic / print_topic - topn

Given the above, there are still more inconsistencies in other topic models: hdpmodel, dtmmodel, ldavowpalwabbit.

Please confirm it and I'll make changes accordingly.
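
A minimal sketch of the naming convention listed above, using a hypothetical `ToyTopicModel` class rather than any real gensim model: the single-topic method takes `topn`, while the multi-topic method takes `num_words`. The class name, data layout, and return shapes are assumptions made for illustration only.

```python
class ToyTopicModel:
    """Hypothetical stand-in for a topic model; not part of gensim."""

    def __init__(self, topics):
        # topics: one dict per topic, mapping word -> probability
        self.topics = topics

    def show_topic(self, topicid, topn=10):
        """Return the `topn` most probable (word, probability) pairs of one topic."""
        ranked = sorted(self.topics[topicid].items(), key=lambda pair: pair[1], reverse=True)
        return ranked[:topn]

    def show_topics(self, num_topics=10, num_words=10):
        """Return (topic id, top words) for up to `num_topics` topics."""
        count = min(num_topics, len(self.topics))
        return [(i, self.show_topic(i, topn=num_words)) for i in range(count)]


model = ToyTopicModel([
    {"cat": 0.5, "dog": 0.3, "fish": 0.2},
    {"stock": 0.6, "bond": 0.4},
])
```

Here `show_topics` forwards its `num_words` argument to `show_topic` as `topn`, so the two names describe the same quantity at different levels of the API.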

piskvorky · 2017-03-10T11:53:24Z

I believe we should support the old param too, perhaps with some deprecation warning.

Once we remove the existing params, we'll have to up the major version (gensim 2.0), because we switched to semantic versioning.

Without a clearly defined "public API" (and the Python philosophy doesn't care much for that), we'll probably be bumping the major version a lot.

prakhar2b · 2017-03-10T13:56:27Z

@piskvorky OK. With reference to the current API, I'll add support for the consistent param, with a deprecation warning for the old one, without removing it.

For example, in hdp.show_topic, the current signature is:

def show_topic(self, topic_id, num_words=20, log=False, formatted=False)

After the change, it will be something like this:

def show_topic(self, topic_id, num_words=20, topn=20, log=False, formatted=False):
    # deprecation warning
    if topn == 20 and num_words != 20:  # old param num_words is used
        logger.warning("num_words is deprecated in the updated version. Please use topn.")

I will update this PR for all models accordingly as soon as possible.

tmylk · 2017-03-13T13:46:49Z

What is the reason for closing this PR? You can just keep working in this branch.

prakhar2b · 2017-03-16T13:10:30Z

@tmylk Unrelated checks were failing.

tmylk · 2017-03-17T00:39:41Z

Travis tests re-ran after smart_open update

show_topic parameter num_words changed to topn in order to make it consistent with LdaModel
show_topic parameter num_words changed to topn
both old and new param with deprecation warning: ldamallet now supports both num_words and topn parameters for show_topic, with deprecation warning for num_words.
hdpmodel show_topic supports old and new param: show_topic in hdpmodel now supports both num_words and topn parameters to make it consistent across all models, with deprecation warning for num_words (checks should pass this time).
dtmmodel topn/num_words with deprecation warning: inconsistency between API and code removed for topn/num_words by adding support for both params with proper deprecation warning. dtmmodel is now compatible with both topn/num_words parameters for show_topic and others, with proper deprecation warnings.
ldamallet show_topic param fixed: ldamallet now supports both num_words and topn parameters for show_topic, with deprecation warning for num_words.
hdpmodel, dtmmodel, ldamallet num_words changed to topn with deprecation warning: to make the code consistent with the API, parameter num_words changed to topn (for print_topic/show_topic methods), with deprecation warning for num_words.

prakhar2b · 2017-03-19T23:37:38Z

@tmylk Squashed all commits into one.

Note: With reference to the API, the following parameters have been standardized across models:

show_topics/ print_topics - num_words
show_topic/ print_topic - topn

As suggested in the comment above, the old parameter is still supported for now, and a deprecation warning has been added for num_words where appropriate, to keep the API relevant.

tmylk

Please change the logic of the warnings.

tmylk · 2017-04-10T20:54:38Z

gensim/models/hdpmodel.py

@@ -445,11 +444,17 @@ def show_topic(self, topic_id, num_words=20, log=False, formatted=False):
        `False` as lists of (weight, word) pairs.

        """
+        if topn is None: #deprecated num_words is used
+            logger.warn("The parameter num_words for show_topic() method would be deprecated in the updated version.\
+            Please use topn instead. Ignore if you didn't use parameter num_words or topn for show_topic() ")


What is the purpose of adding "ignore if"? Would it be better to make topn=20 the default and num_words=None? Show a warning if num_words is not None, and add a comment to make it an exception in the next release. The same applies everywhere.
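
A rough sketch of the pattern suggested here, written as a standalone function rather than the actual hdpmodel method: `topn` carries the real default, `num_words` defaults to `None`, and the warning fires only when the caller explicitly passes the old name. The toy word list and the exact message wording are assumptions for illustration.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical topic contents: 30 (weight, word) pairs in descending weight.
_TOY_TOPIC = [(1.0 / (rank + 1), "word%d" % rank) for rank in range(30)]


def show_topic(topic_id, topn=20, log=False, formatted=False, num_words=None):
    """Toy stand-in: return the `topn` heaviest (weight, word) pairs."""
    if num_words is not None:  # the deprecated name was passed explicitly
        # TODO: raise an exception here instead in a later release
        logger.warning(
            "The parameter num_words for show_topic() is deprecated. "
            "Please use topn instead."
        )
        topn = num_words
    return _TOY_TOPIC[:topn]
```

With a `None` sentinel, a call such as `show_topic(0, num_words=20)` is still detected as deprecated usage, which a comparison against the default value 20 could not distinguish from a call that omits the argument.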

tmylk · 2017-04-10T20:55:14Z

gensim/models/hdpmodel.py

-        return self.show_topic(topic_id, num_words, formatted=True)
-
-    def show_topic(self, topic_id, num_words, log=False, formatted=False):
+    def print_topic(self, topic_id,topn= None, num_words=20):


Why add a default value here?

tmylk · 2017-05-02T19:37:18Z

Ping @prakhar2b

prakhar2b · 2017-05-02T21:32:11Z

Yes, on this now. Thanks.

prakhar2b · 2017-05-02T22:52:38Z

@tmylk updated the PR. Thanks for the review comments.

piskvorky · 2017-05-14T13:07:50Z

gensim/models/wrappers/dtmmodel.py

        """
        Return `num_words` most probable words for the given `topicid`, as a list of
        `(word_probability, word)` 2-tuples.

        """
+        if num_words is not None:  # deprecated num_words is used
+            logger.warn("The parameter num_words for show_topic() method would be deprecated in the updated version.\


This would include the whitespace after \ in the message.

It's better to split multi-line strings using "abc" "dce" (two strings next to each other, on different lines).
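
A small illustration of the difference, with message text shortened for the example: a backslash continuation inside a string literal keeps the next line's indentation as part of the string, whereas adjacent string literals are concatenated by the parser with nothing added.

```python
# Backslash continuation: the 12 spaces of indentation on the second line
# become part of the string.
backslash = "The parameter num_words is deprecated.\
            Please use topn instead."

# Adjacent string literals: joined at compile time, no stray whitespace.
adjacent = ("The parameter num_words is deprecated. "
            "Please use topn instead.")

print(repr(backslash))
print(repr(adjacent))
```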

piskvorky · 2017-05-14T13:07:58Z

gensim/models/wrappers/ldamallet.py

-    def show_topic(self, topicid, num_words=10):
+    def show_topic(self, topicid, topn=10, num_words=None):
+        if num_words is not None:  # deprecated num_words is used
+            logger.warn("The parameter num_words for show_topic() method would be deprecated in the updated version.\


prakhar2b · 2017-05-20T02:25:12Z

cc @piskvorky updated the PR according to review

tmylk · 2017-05-23T22:15:21Z

Note: this is backwards compatible.

piskvorky · 2017-05-27T17:18:42Z

gensim/models/hdpmodel.py

-        return self.show_topic(topic_id, num_words, formatted=True)
+    def print_topic(self, topic_id, topn= None, num_words=None):
+        if num_words is not None:  # deprecated num_words is used
+            logger.warning("The parameter num_words for print_topic() would be deprecated in the updated version.")


Should be warnings.warn, not a logging message (will spam logs).
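
A hedged sketch of why this matters, using two toy functions rather than the real hdpmodel code: a `logger.warning` call emits a record on every invocation, while `warnings.warn` under Python's "default" filter action reports a given call site only once. The logger name, the `ListHandler` helper, and the message text are assumptions for illustration.

```python
import logging
import warnings


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be counted."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("toy_topic_model")  # hypothetical logger name
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

MESSAGE = ("The parameter num_words for print_topic() is deprecated. "
           "Please use topn instead.")


def print_topic_logging(topic_id, topn=None, num_words=None):
    if num_words is not None:
        logger.warning(MESSAGE)  # emitted on every call
        topn = num_words
    return topn


def print_topic_warnings(topic_id, topn=None, num_words=None):
    if num_words is not None:
        # stacklevel=2 attributes the warning to the caller's line
        warnings.warn(MESSAGE, DeprecationWarning, stacklevel=2)
        topn = num_words
    return topn


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default")  # report each call site once
    for _ in range(3):
        print_topic_logging(0, num_words=5)
        print_topic_warnings(0, num_words=5)

log_count = len(handler.records)   # one record per call
warning_count = len(caught)        # one warning for the repeated call site
```

The single `MESSAGE` string also combines the deprecation notice and the "use topn" hint, so each deprecated call site produces one warning rather than two.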

piskvorky · 2017-05-27T17:19:20Z

gensim/models/hdpmodel.py

+    def print_topic(self, topic_id, topn= None, num_words=None):
+        if num_words is not None:  # deprecated num_words is used
+            logger.warning("The parameter num_words for print_topic() would be deprecated in the updated version.")
+            logger.warning("Please use topn instead.")


No need for two messages, one warning is enough (concatenate the messages).

piskvorky · 2017-05-27T17:19:26Z

gensim/models/hdpmodel.py

-    def show_topic(self, topic_id, num_words, log=False, formatted=False):
+    def show_topic(self, topic_id, topn=20, log=False, formatted=False, num_words= None,):
+        if num_words is not None:  # deprecated num_words is used
+            logger.warning("The parameter num_words for show_topic() would be deprecated in the updated version.")


piskvorky · 2017-05-27T17:19:49Z

gensim/models/wrappers/dtmmodel.py

        """Return the given topic, formatted as a string."""
-        return ' + '.join(['%.3f*%s' % v for v in self.show_topic(topicid, time, num_words)])
+        if num_words is not None:  # deprecated num_words is used
+            logger.warning("The parameter num_words for print_topic(() would be deprecated in the updated version.")


Ditto.

Also, too many opening brackets (().

tmylk added 27 commits November 5, 2015 19:07

Merge branch 'release-0.12.3rc1'

1c63c9a

Merge branch 'release-0.12.3'

280a488

Merge branch 'release-0.12.3'

ddeb002

Update CHANGELOG.txt

f2ac3a9

Update CHANGELOG.txt

cf09e8c

resolve merge conflict in Changelog

b61287a

Merge branch 'release-0.12.4' with piskvorky#596

3ade404

Merge branch 'release-0.13.0'

9e6522e

Merge branch 'release-0.13.0'

87c4e9c

Release version typo fix

9c74b40

Merge branch 'release-0.13.0rc1'

7b30025

Merge branch 'release-0.13.0'

de79c8e

Merge branch 'release-0.13.1'

d4f9cc5

Merge branch 'release-0.13.2'

d8e9c0f

Merge branch 'release-0.13.2'

7c118fc

Merge branch 'release-0.13.3'

432f840

Merge branch 'release-0.13.3'

b42e181

Win and OSX build fix

3067cb0

Merge branch 'release-0.13.4'

e838391

Merge branch 'release-0.13.4.1'

5d47ec4

Merge branch 'release-1.0.0rc1'

a18de8d

Typo in version

67b1a17

Fix merge conflict

df13670

Merge branch 'release-1.0.0'

78da89a

Merge branch 'release-1.0.1'

fb3f303

Merge branch 'release-1.0.1'

adc447d

Merge branch 'release-1.0.1'

333fd4d

prakhar2b closed this Mar 13, 2017

prakhar2b reopened this Mar 16, 2017

prakhar2b changed the title ~~LdaMallet show_topic parameter num_words changed to topn to match other topic models~~ LdaMallet show_topic/print_topic parameter num_words changed to topn to match other topic models Mar 16, 2017

prakhar2b changed the title ~~LdaMallet show_topic/print_topic parameter num_words changed to topn to match other topic models~~ gensim models show_topic/print_topic parameter num_words changed to topn to match other topic models Mar 16, 2017

prakhar2b force-pushed the prakhar_2017 branch from af22736 to 61dc832 Compare March 19, 2017 23:24

tmylk suggested changes Apr 10, 2017

prakhar2b added 5 commits May 3, 2017 03:19

hdpmodel topn/num_words conflict resolved

5f71f66

dtmmodel topn/show_topic conflict resolved

b3d210c

ldamallet topn/num_words conflict resolved

c7f9824

whitespace error resolved

42ac76d

whitespace error resolved

113a0af

piskvorky requested changes May 14, 2017

prakhar2b added 3 commits May 20, 2017 07:50

split multi-line comments in hdpmodel

51683f1

splitting multi-line comments in dtmmodel

29ab15f

splitting multi-line comments for ldamallet

f949ce6

tmylk added and removed the "breaks backward-compatibility" label May 23, 2017

tmylk merged commit 834e130 into piskvorky:develop May 23, 2017

piskvorky reviewed May 27, 2017

prakhar2b mentioned this pull request Jun 22, 2017

Loading fastText models using only bin file #1341

Merged
