Improvements to Dynamic Topic Models #840
Comments
I have been going through the code of ldaseqmodel.py and had the following questions:
#903 answers my question.
Yes, @mjawa, you're absolutely right on both points 1 and 2. I think the steps ahead, in order, would be to first implement DIM, and then work on making things faster by integrating
@bhargavvader: I see. I can start implementing DIM. I am new to gensim; can you please tell me some of the reasons why lda_post was used instead of ldamodel to begin with?
Mainly because I wanted to be sure to replicate the C code as much as possible to allow for easier testing and the option of including DIM. DIM uses methods in the It'll need some investigation to see to what extent
Dynamic Topic Models (DTM) is a variation of LDA by Blei et al. that takes time-stamped documents and models how topics evolve across time periods. I have described it in more detail in a series of blog posts here, and the recently merged PR #739 implements it.
While the code is functionally correct, it could use some more work to make it even better.
Some things that would be very useful:
- Implementing DIM: in particular, a lot of DIM depends on LdaPost being in place.
- Speeding up the code: in particular, `update_obs` and the optimization take a lot of time.

It would also be very useful to suggest how the code can be made more user friendly, or alternate ways to take data as input (for example, a dict or tuple such as `{data/document : time-stamp}`), and to post examples and results from training DTM on datasets.

PRs to implement any of the suggestions, or issues on improving performance, would be particularly useful.
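
To make the alternate-input suggestion concrete, here is a minimal sketch of how time-stamped documents could be converted into what the current interface takes: documents in chronological order plus a `time_slice` list of per-period document counts. The helper name `to_corpus_and_time_slice` is hypothetical and not part of gensim, and the input is a list of `(document, time_stamp)` pairs rather than a dict, since tokenized documents (lists) cannot be dict keys.

```python
from collections import Counter


def to_corpus_and_time_slice(stamped_docs):
    """Hypothetical helper (not part of gensim).

    Takes (document, time_stamp) pairs, returns the documents sorted by
    time stamp and a list with the number of documents per time stamp.
    """
    # sorted() is stable, so documents sharing a stamp keep their input order.
    ordered = sorted(stamped_docs, key=lambda pair: pair[1])
    documents = [doc for doc, _ in ordered]

    # Count documents per stamp, then list the counts in chronological order.
    counts = Counter(stamp for _, stamp in ordered)
    time_slice = [counts[stamp] for stamp in sorted(counts)]
    return documents, time_slice


stamped = [
    (["economy", "bank"], 2011),
    (["election", "vote"], 2010),
    (["bank", "loan"], 2011),
    (["vote", "poll"], 2012),
]
documents, time_slice = to_corpus_and_time_slice(stamped)

# The documents would still need converting to bag-of-words vectors
# (for example with gensim's Dictionary.doc2bow) before training.
print(time_slice)
```

Each distinct time stamp becomes one slice here; grouping stamps into coarser periods (months, years) would be a separate design choice to settle before the counts are computed.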
@piskvorky, @tmylk, could you add the `feature` label and whatever else would be appropriate so this is easier to find when someone wishes to help?