Hello @MaartenGr, I ran into an issue while trying to get multiple topics for each document.
I ran this script:
topic_distr, topic_token_distr = topic_model.approximate_distribution(news_content[:3], calculate_tokens=True)
and got this error:
/usr/local/lib/python3.8/dist-packages/bertopic/_bertopic.py in approximate_distribution(self, documents, window, stride, min_similarity, batch_size, padding, use_embedding_model, calculate_tokens, separator)
   1096     sentences = [separator.join(token) for token in token_sets]
   1097     all_sentences.extend(sentences)
-> 1098     all_token_sets_ids.extend(token_sets_ids)
   1099     all_indices.append(all_indices[-1] + len(sentences))
   1100
UnboundLocalError: local variable 'token_sets_ids' referenced before assignment
news_content contains a list of long documents.
@alfandindarahmawan Great, thanks for finding this bug! I believe it should be fixed now; it seems there was an issue handling documents that have fewer tokens than the window size.
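For context, this class of `UnboundLocalError` typically comes from a variable that is only assigned inside a sliding-window loop: when a document has fewer tokens than the window size, the loop body never runs and the later reference fails. A minimal sketch of the pattern and a guarded fix (not BERTopic's actual implementation; function names and the fallback strategy are illustrative):

```python
def buggy_windows(tokens, window=4):
    # token_sets_ids is only bound inside the loop body, so a document
    # with fewer tokens than `window` skips the loop entirely and the
    # return statement references an unassigned local variable.
    for i in range(len(tokens) - window + 1):
        token_sets_ids = tokens[i:i + window]
    return token_sets_ids  # raises UnboundLocalError for short documents


def fixed_windows(tokens, window=4):
    # Guard the short-document case up front: fall back to a single
    # window covering the whole document, then initialize the list
    # before the loop so it is always bound.
    if len(tokens) < window:
        return [tokens[:]]
    token_sets_ids = []
    for i in range(len(tokens) - window + 1):
        token_sets_ids.append(tokens[i:i + window])
    return token_sets_ids
```

With a two-token document and the default window of 4, `buggy_windows` raises the error from the traceback above, while `fixed_windows` returns one window spanning the whole document.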