I'm on an M3 MacBook Pro
Python 3.12.4
scikit-learn 1.5.1
bertopic 0.16.3
numpy 1.26.4
scipy 1.14.0
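For anyone trying to compare their setup against the versions listed above, here is a minimal sketch that gathers the same information with the standard library. The helper name `collect_versions` is made up for illustration and is not part of BERTopic or scikit-learn.

```python
from importlib import metadata
import platform


def collect_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to the string "not installed".
    `collect_versions` is a hypothetical helper, not a library function.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print the interpreter version followed by the packages from this report.
    print("Python", platform.python_version())
    for name, version in collect_versions(
        ["scikit-learn", "bertopic", "numpy", "scipy"]
    ).items():
        print(name, version)
```

Pasting the output into a report makes it easier to tell whether a failure is tied to one particular release.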
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.cluster import MiniBatchKMeans
    from sklearn.decomposition import IncrementalPCA
    from bertopic.vectorizers import OnlineCountVectorizer
    from bertopic import BERTopic

    # Prepare documents
    docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))["data"]

    # Prepare sub-models that support online learning
    umap_model = IncrementalPCA(n_components=5)
    cluster_model = MiniBatchKMeans(n_clusters=50, random_state=0)
    vectorizer_model = OnlineCountVectorizer(stop_words="english", decay=.01)

    topic_model = BERTopic(umap_model=umap_model,
                           hdbscan_model=cluster_model,
                           vectorizer_model=vectorizer_model)

    # Incrementally fit the topic model by training on 1000 documents at a time
    for index in range(0, len(docs), 1000):
        topic_model.partial_fit(docs[index: index+1000])
BERTopic Version
0.16.3
Hmmm, I have seen this issue with a recent scikit-learn update, but it seems there isn't a fix yet. You could try the solution suggested here, perhaps applying it after each partial fit, to see whether that helps.
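To illustrate the "after each partial fit" part of that suggestion, here is a rough sketch of a batching loop with a hook that runs once per batch. The workaround itself is not reproduced in this thread, so `after_batch` is only a placeholder callback, and `partial_fit_in_batches` is a hypothetical helper rather than anything BERTopic provides. It only assumes the model exposes a `partial_fit` method.

```python
def partial_fit_in_batches(model, docs, batch_size=1000, after_batch=None):
    """Call model.partial_fit on consecutive slices of docs.

    after_batch is an optional callable taking the model; it runs once
    after every partial_fit call. Both this helper and the hook are
    hypothetical and stand in for whatever workaround is being tried.
    """
    for start in range(0, len(docs), batch_size):
        model.partial_fit(docs[start:start + batch_size])
        if after_batch is not None:
            # Placeholder: apply the workaround to the model here.
            after_batch(model)
    return model
```

With the reproduction code, this would replace the final loop, for example `partial_fit_in_batches(topic_model, docs, 1000, after_batch=my_fix)`, where `my_fix` is whatever function applies the linked workaround.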
Have you searched existing issues? 🔎
Describe the bug
Running the example code from the partial_fit example in the docs throws an error.
thanks for the help!
Stack trace:
Reproduction
Copied and pasted from the example code here