hierarchical_topics: unused variable cluster_df significantly slowing down performance #1698

Closed
shadiakiki1986 opened this issue Dec 15, 2023 · 1 comment · Fixed by #1701

Comments

@shadiakiki1986
Contributor

The variable cluster_df seems not to be used anywhere in the function: https://github.com/MaartenGr/BERTopic/blob/master/bertopic/_bertopic.py#L1008

The groupby call that builds it is also the slowest line in the function.

It would probably be better to remove it.
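
To illustrate the point, here is a minimal sketch, not the actual BERTopic code. The function names, column names, and the `" ".join` aggregation are assumptions made up for this example. It shows a grouped DataFrame that is built and never read, and checks that dropping it leaves the return value unchanged while saving the groupby cost.

```python
import timeit

import pandas as pd


def topic_sizes_with_dead_code(docs, topics):
    documents = pd.DataFrame({"Document": docs, "Topic": topics})
    # Hypothetical stand-in for the unused variable: built here, never read again.
    cluster_df = documents.groupby("Topic").agg({"Document": " ".join}).reset_index()
    return documents["Topic"].value_counts().sort_index().to_dict()


def topic_sizes(docs, topics):
    # Same logic with the unused groupby removed.
    documents = pd.DataFrame({"Document": docs, "Topic": topics})
    return documents["Topic"].value_counts().sort_index().to_dict()


if __name__ == "__main__":
    docs = [f"document number {i}" for i in range(20_000)]
    topics = [i % 50 for i in range(20_000)]

    # Removing the dead code must not change the result.
    assert topic_sizes_with_dead_code(docs, topics) == topic_sizes(docs, topics)

    slow = timeit.timeit(lambda: topic_sizes_with_dead_code(docs, topics), number=5)
    fast = timeit.timeit(lambda: topic_sizes(docs, topics), number=5)
    print(f"with unused groupby: {slow:.3f}s, without: {fast:.3f}s")
```

The timings depend on the data and the machine, so the sketch prints them rather than asserting a particular speedup.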

@MaartenGr
Owner

Thanks! It should indeed be removed. If you want, a PR is appreciated. Otherwise, I'll put it on the list for next year.
