Flsamodel #3398

Merged
merged 6 commits into from
Dec 12, 2022

Conversation

ERijck
Contributor

@ERijck ERijck commented Nov 1, 2022

I have added flsamodel, which includes the topic modeling algorithms FLSA, FLSA-W and FLSA-E. In experiments on various open datasets, FLSA-W has outperformed other state-of-the-art algorithms (e.g. LDA, LSI, NMF, ProdLDA).

Motivation:
Since Gensim features various state-of-the-art topic modeling algorithms, and my group's algorithms outperform them in terms of coherence, diversity and interpretability scores, we believe our algorithms should be featured in Gensim too. Previously, I created a wrapper function that depended on FuzzyTM. In this PR, the algorithms are trained within Gensim.

This code can be used in much the same way as LdaModel. The following data types are allowed for the 'corpus':

  • list of list of str.
  • list of list of tuples (int, int) (bow).
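
To illustrate the two accepted corpus shapes, here is a minimal pure-Python sketch that builds a tokenized corpus and converts it to a bag-of-words corpus of (token id, count) tuples. The helper names `build_vocabulary` and `to_bow` are hypothetical and not part of this PR; within Gensim, `gensim.corpora.Dictionary.doc2bow` performs the equivalent conversion.

```python
from collections import Counter


def build_vocabulary(tokenized_docs):
    """Map each distinct token to an integer id, in order of first appearance.

    Hypothetical helper for illustration only (not part of flsamodel).
    """
    token_to_id = {}
    for doc in tokenized_docs:
        for token in doc:
            if token not in token_to_id:
                token_to_id[token] = len(token_to_id)
    return token_to_id


def to_bow(tokenized_docs, token_to_id):
    """Convert each document into a sorted list of (token_id, count) tuples."""
    bow_corpus = []
    for doc in tokenized_docs:
        counts = Counter(token_to_id[token] for token in doc)
        bow_corpus.append(sorted(counts.items()))
    return bow_corpus


# Format 1: list of list of str (one token list per document).
tokenized_corpus = [
    ["fuzzy", "topic", "model", "topic"],
    ["health", "record", "topic"],
]

# Format 2: list of list of (int, int) tuples (bag of words).
vocabulary = build_vocabulary(tokenized_corpus)
bow_corpus = to_bow(tokenized_corpus, vocabulary)
```

Either `tokenized_corpus` or `bow_corpus` has the shape described in the list above; which one is more convenient likely depends on whether a dictionary has already been built for other Gensim models.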

The algorithms have been featured in various scientific publications. See the links below:

FLSA-W:
Rijcken, E., Scheepers, F., Mosteiro, P., Zervanou, K., Spruit, M., & Kaymak, U. (2021, December). A comparative study of fuzzy topic models and LDA in terms of interpretability. In 2021 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1-8). IEEE.

FLSA-E:
Rijcken, E., Zervanou, K., Spruit, M., Mosteiro, P., Scheepers, F., & Kaymak, U. (2022). Exploring Embedding Spaces for more Coherent Topic Modeling in Electronic Health Records. In IEEE International Conference on Systems, Man, and Cybernetics.

FLSA:
Karami, Amir, et al. "Fuzzy approach topic discovery in health and medical corpora." International Journal of Fuzzy Systems 20.4 (2018): 1334-1345.

These algorithms are featured in the FuzzyTM package:
Rijcken, E., Mosteiro, P., Zervanou, K., Spruit, M., Scheepers, F., & Kaymak, U. (2022, July). FuzzyTM: a software package for fuzzy topic modeling. In 2022 IEEE international conference on fuzzy systems (FUZZ-IEEE) (pp. 1-8). IEEE.

Experimental results:

@piskvorky piskvorky added this to the Next release milestone Nov 19, 2022
@mpenkov mpenkov merged commit 45d35ee into piskvorky:develop Dec 12, 2022
@mpenkov
Collaborator

mpenkov commented Dec 12, 2022

Thank you for your contribution, @ERijck!
