This PR fixes some issues in the LDA tutorial and adds some enhancements to make the results better.
## Fixes
- Change "log normal" to "logistic normal", because log-normal cannot be used to approximate Dirichlet. The reference also talked about logistic-normal, not log-normal.
- Add `bias=False` to `Decoder.beta` to match the discussion: $w_n \mid \beta, \theta \sim \mathrm{Categorical}(\sigma(\beta\theta))$. Otherwise, we should change the discussion text to $w_n \mid \beta, \theta \sim \mathrm{Categorical}(\sigma(\beta\theta + \mathrm{bias}))$. `bias=False` also matches the behavior of the original implementation of ProdLDA.
- Add the `total_count` argument and remove `to_event(1)` at the `Multinomial` likelihood. Using `to_event(1)` here will give us a wrong model (`Multinomial` already has `event_shape=1`). Empirically, in the tutorial `epoch_loss=1.12e+07`, while after the fix `epoch_loss=3.72e+05`.
- Use the name `logtheta`, rather than `theta`, for the sampled latent variable (see the sketch below).
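
This is not the exact notebook code, just a minimal sketch of how the model-side fixes fit together; the names `Decoder`, `model`, `docs`, and `num_topics` are illustrative:

```python
import torch.nn as nn
import torch.nn.functional as F
import pyro
import pyro.distributions as dist


class Decoder(nn.Module):
    def __init__(self, vocab_size, num_topics):
        super().__init__()
        # bias=False so the likelihood is Categorical(softmax(beta @ theta)),
        # matching the discussion text and the original ProdLDA implementation.
        self.beta = nn.Linear(num_topics, vocab_size, bias=False)

    def forward(self, theta):
        return F.softmax(self.beta(theta), dim=-1)


def model(docs, decoder, num_topics):
    # docs: a (num_docs, vocab_size) matrix of word counts.
    logtheta_loc = docs.new_zeros((docs.shape[0], num_topics))
    logtheta_scale = docs.new_ones((docs.shape[0], num_topics))
    with pyro.plate("documents", docs.shape[0]):
        # The latent drawn from the normal prior is log(theta), hence `logtheta`;
        # theta is recovered with a softmax (the logistic-normal construction).
        logtheta = pyro.sample(
            "logtheta", dist.Normal(logtheta_loc, logtheta_scale).to_event(1)
        )
        theta = F.softmax(logtheta, dim=-1)
        word_probs = decoder(theta)
        # Multinomial already treats the vocab dimension as an event dimension,
        # so no .to_event(1) here; total_count is the largest document length.
        total_count = int(docs.sum(-1).max())
        pyro.sample(
            "obs",
            dist.Multinomial(total_count=total_count, probs=word_probs),
            obs=docs,
        )
```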
## Enhancements
- Set `affine=False` in `BatchNorm1d`: I got no luck with `affine=True`. The inference seems to overfit with those extra parameters of `affine=True`, and the resulting topics do not make much sense.
- Adding `stop_words='english'` at cell 7 seems to help. The number of unique words is reduced from `12999` to `12722`, and the words his/he/was/... are removed, which is a nice preprocessing improvement IMO (see the sketch below).
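
Again only a sketch of the two settings, assuming the 20 newsgroups data used in the tutorial; the `max_df`/`min_df` values and variable names are placeholders, the relevant parts are `stop_words='english'` and `affine=False`:

```python
import torch.nn as nn
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer

# Drop English stop words (his/he/was/...) while building the document-term
# matrix; this shrinks the vocabulary and removes uninformative words.
news = fetch_20newsgroups(subset="all")
vectorizer = CountVectorizer(max_df=0.5, min_df=20, stop_words="english")
docs = vectorizer.fit_transform(news.data)

# BatchNorm without the learnable affine parameters, which otherwise seem to
# overfit and hurt the quality of the recovered topics.
vocab_size = docs.shape[1]
batch_norm = nn.BatchNorm1d(vocab_size, affine=False)
```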
## Result

According to the notebook, the word cloud topics are more coherent than the current ones. IMO, the result is pretty good now. :) cc @ucals