Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make docs clearer on alpha parameter in LDA model #2896

Merged
merged 5 commits into from
Jul 26, 2020

Conversation

xh2
Copy link
Contributor

@xh2 xh2 commented Jul 24, 2020

Summary

Modified docstring on alpha in LDA model around 'symmetric' and 'asymmetric' options

Motivation

  • When I first read the doc, I saw by default alpha='symmetric', but in the docs, only 'asymmetric' and 'auto' are listed as acceptable strings, so my first instinct was it was a typo. To be clearer, especially to new users, I think we should spell out all possible options, including the default 'symmetric' in the doc
  • The current 'asymmetric' description doesn't seem to fit the actual formula in the code

@piskvorky
Copy link
Owner

You're right, thanks for the fix!

The current 'asymmetric' description doesn't seem to fit the actual formula in the code

Which is correct – the code or the documentation? I don't remember the motivation or original paper for the asymmetric prior any more, unfortunately.

@xh2
Copy link
Contributor Author

xh2 commented Jul 24, 2020

Which is correct – the code or the documentation? I don't remember the motivation or original paper for the asymmetric prior any more, unfortunately.

The paper by Hoffman only used symmetric priors, while citing Wallach, Hanna & Mimno, David & Mccallum, Andrew. (2009). Rethinking LDA: Why priors matter. NIPS. 23. 1973-1981. for asymmetric priors. But I couldn't find either formula in that paper either, so not really sure what it is supposed to be.

The existing doc on asymmetric alpha, however, is actually suggesting a symmetric distribution instead? No? That was also what confused me and made me think it was a typo.

@piskvorky
Copy link
Owner

I don't think so – with asymmetric, different topics get a different alpha. With symmatric, all topics get the same alpha.

@xh2
Copy link
Contributor Author

xh2 commented Jul 24, 2020

yes agree! but isn't 1.0 / topicno (current docstring description for asymmetric) uniform? (the actual code is indeed asymmetric). unless here topicno means index not number of topics?

@piskvorky
Copy link
Owner

Yes. For sure that's meant as "index of the topic", not "total number of topics". Another piece of documentation that could use a fix!

gensim/models/ldamodel.py Outdated Show resolved Hide resolved
gensim/models/ldamodel.py Outdated Show resolved Hide resolved
@piskvorky piskvorky changed the title Make docs clearer on alpha parameter in LDA cmodel Make docs clearer on alpha parameter in LDA model Jul 26, 2020
@piskvorky piskvorky merged commit a662e8d into piskvorky:develop Jul 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants