Support gensim4 LdaModel #73

Merged
merged 15 commits into from
Feb 5, 2024

Conversation

larsgrobe
Contributor

Resolves #70

The name of the module providing LDA model support has changed between versions 3 and 4 of gensim. Besides that, the call to LdaModel() had to be modified.

`LdaModel`, which was in `gensim.models.lda` (gensim3), is now implemented by `gensim.models.ldamodel` (gensim4). The proposed solution tries both and exits with an error if neither module can be imported.
Check whether version 3 or 4 of gensim is loaded and train the LDA model accordingly.
Just added a missing space.
This error handling may not comply with the general litstudy convention.
Convert the major version number to int, as expected in the conditional.
Pass the num_topics parameter to LdaModel with gensim4.
Added the missing '.T'.
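
For readers who have not opened the diff, the approach described in these commits can be sketched roughly as follows. This is a minimal illustration, not the code that was merged: the helper names `import_first_available`, `major_version`, and `train_lda` are hypothetical, the two module paths are taken from the description above, and any version-specific differences in how `LdaModel` is called are not reproduced here.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that can be imported.

    Raises ImportError if none of the candidates is importable.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


def major_version(version_string):
    """Return the major version as an int, so it can be compared with 3 or 4."""
    return int(version_string.split(".")[0])


def train_lda(corpus, id2word, num_topics):
    """Hypothetical wrapper: train an LDA model with whichever module exists."""
    import gensim  # imported lazily so the helpers above work without gensim

    # Module paths as named in the PR description (gensim4 first, then gensim3).
    lda_module = import_first_available(
        ["gensim.models.ldamodel", "gensim.models.lda"]
    )
    # Assumption: only major versions 3 and 4 are handled.
    if major_version(gensim.__version__) not in (3, 4):
        raise RuntimeError("unsupported gensim version: " + gensim.__version__)
    return lda_module.LdaModel(
        corpus=corpus, id2word=id2word, num_topics=num_topics
    )
```

Converting the major version to `int` matters because `gensim.__version__` is a string, so comparing it directly with the number `4` would never match.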
@isazi
Member

isazi commented Jan 23, 2024

If you could run black and commit, that would fix the only failing test in the CI.

@isazi
Member

isazi commented Feb 5, 2024

Thanks @larsgrobe, is the pull request ready to be merged at this point, or are you still working on it?

@larsgrobe
Contributor Author

It should be ready; at least, I am using the code as it is. Note that I also added the "ensemble LDA model", which aims to get more stable topics from the LDA algorithm. This would need documentation; the concept is described here:

BRIGL, Tobias, 2019, Extracting Reliable Topics using Ensemble Latent Dirichlet Allocation [Bachelor Thesis]. Technische Hochschule Ingolstadt. Munich: Data Reply GmbH. Supervised by Alex Loosley. Available from: https://www.sezanzeb.de/machine_learning/ensemble_LDA/

Best, Lars.

@larsgrobe larsgrobe closed this Feb 5, 2024
@stijnh stijnh reopened this Feb 5, 2024
@stijnh
Member

stijnh commented Feb 5, 2024

Great! Thanks for your contributions. I am very happy to see support for the new ensemble LDA. I would like to have more topic modeling algorithms in litstudy that have been developed in recent years.

If you have more ideas on topic modeling or are experimenting with other NLP techniques, do not hesitate to open a pull request, discussion, or issue to share your ideas or findings!

@stijnh stijnh merged commit ca7fc5d into NLeSC:master Feb 5, 2024
10 checks passed
Development

Successfully merging this pull request may close these issues.

Incompability with gensim 4
3 participants