Make docs clearer on `alpha` parameter in LDA model #2896

xh2 · 2020-07-24T13:19:20Z

Summary

Modified the `alpha` docstring in the LDA model to cover the 'symmetric' and 'asymmetric' options.

Motivation

When I first read the docs, I saw that the default is alpha='symmetric', but only 'asymmetric' and 'auto' are listed as acceptable strings, so my first instinct was that it was a typo. To be clearer, especially for new users, I think we should spell out all possible options in the docs, including the default 'symmetric'.
The current 'asymmetric' description doesn't seem to fit the actual formula in the code.

piskvorky · 2020-07-24T13:25:31Z

You're right, thanks for the fix!

> The current 'asymmetric' description doesn't seem to fit the actual formula in the code

Which is correct – the code or the documentation? I don't remember the motivation or original paper for the asymmetric prior any more, unfortunately.

xh2 · 2020-07-24T14:15:10Z

> Which is correct – the code or the documentation? I don't remember the motivation or original paper for the asymmetric prior any more, unfortunately.

The paper by Hoffman only used symmetric priors, and cites Wallach, H., Mimno, D., & McCallum, A. (2009). Rethinking LDA: Why priors matter. NIPS, 23, 1973-1981. for asymmetric priors. But I couldn't find either formula in that paper, so I'm not really sure what it is supposed to be.

The existing doc on asymmetric alpha, however, actually seems to describe a symmetric distribution, no? That was also what confused me and made me think it was a typo.

piskvorky · 2020-07-24T14:51:55Z

I don't think so – with asymmetric, different topics get a different alpha. With symmetric, all topics get the same alpha.

xh2 · 2020-07-24T15:03:53Z

Yes, agreed! But isn't `1.0 / topicno` (the current docstring description for asymmetric) uniform? (The actual code is indeed asymmetric.) Unless `topicno` here means the topic index, not the number of topics?

piskvorky · 2020-07-24T15:18:43Z

Yes. For sure that's meant as "index of the topic", not "total number of topics". Another piece of documentation that could use a fix!
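
To make the two readings of `topicno` concrete, here is a minimal pure-Python sketch. The function names are hypothetical and not part of gensim. The asymmetric formula, `1.0 / (topic_index + sqrt(num_topics))` followed by normalization, is an assumption based on my understanding of the gensim code around the time of this PR; please check `gensim/models/ldamodel.py` before relying on it.

```python
import math


def symmetric_alpha(num_topics):
    """Every topic gets the same prior, 1.0 / num_topics."""
    return [1.0 / num_topics] * num_topics


def docstring_read_as_count(num_topics):
    """Old docstring, reading `topicno` as the NUMBER of topics.

    This is identical to the symmetric prior, which is the source
    of the confusion discussed in this thread.
    """
    return [1.0 / num_topics for _ in range(num_topics)]


def asymmetric_alpha(num_topics):
    """Assumed asymmetric prior: depends on the topic INDEX.

    Assumption (not confirmed in this thread): raw weights are
    1.0 / (topic_index + sqrt(num_topics)), then normalized to sum to 1.
    """
    raw = [1.0 / (i + math.sqrt(num_topics)) for i in range(num_topics)]
    total = sum(raw)
    return [value / total for value in raw]


if __name__ == "__main__":
    k = 4
    print("symmetric:        ", symmetric_alpha(k))
    print("docstring (count):", docstring_read_as_count(k))
    print("asymmetric:       ", asymmetric_alpha(k))
```

Under this assumption, earlier topics receive a larger prior than later ones, whereas reading `topicno` as a count reproduces the symmetric case exactly.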

gensim/models/ldamodel.py

xh2 added 2 commits July 24, 2020 14:09

Make docs clearer on alpha parameter in LDA model

03c8bb9

Merge pull request #1 from xh2/patch-1

7791b74

Make docs clearer on `alpha` parameter in LDA model

rm whitespace

25005c5

piskvorky requested changes Jul 26, 2020

piskvorky added 2 commits July 26, 2020 11:01

Update gensim/models/ldamodel.py

f34956c

Update gensim/models/ldamodel.py

7d0ef9e

piskvorky approved these changes Jul 26, 2020

piskvorky changed the title ~~Make docs clearer on alpha parameter in LDA cmodel~~ Make docs clearer on alpha parameter in LDA model Jul 26, 2020

piskvorky merged commit a662e8d into piskvorky:develop Jul 26, 2020
