Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1797 and #1798 #1804

Merged
merged 2 commits into from
Feb 14, 2024
Merged

Fix #1797 and #1798 #1804

merged 2 commits into from
Feb 14, 2024

Conversation

MaartenGr
Copy link
Owner

A fix for #1797 and #1798. As mentioned in these links, there were 2 issues:

  • Embeddings not ordered correctly when using .merge_models
  • Outlier topic not in the 0th position when using zero-shot topic modeling causing prediction issues (amongst others)

The embeddings should now be ordered correctly and the outlier topic is not at the 0th position when using zero-shot topic modeling. This should make it possible to use zero-shot topic modeling with outlier reduction and other techniques.

Copy link

@timdhu timdhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested the two examples in #1797 and #1798, both output the correct results following these changes.

@@ -3663,6 +3663,32 @@ def _combine_zeroshot_topics(self,
self.__dict__.clear()
self.__dict__.update(merged_model.__dict__)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spoke too soon! merged_model.__dict__ replaces self.outliers with 0, so the outliers are never repositioned a couple of lines down.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just best to set outliers = self._outliers just before merging on line 3627, then using this on line 3668 instead?

Will also have to set self._outliers = 1 inside the if statement

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this out and it looks good to me

outliers

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for checking this out! I just pushed a commit which should finally resolve this issue. If you could do a final check, I can merge this PR.

Copy link

@timdhu timdhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Re-tested, the outlier is in the right position and the transform method is giving the right result

@MaartenGr MaartenGr merged commit 22df7bd into master Feb 14, 2024
2 checks passed
@MaartenGr
Copy link
Owner Author

@timdhu Thanks for the help fixing this issue. Your dedication to finding the underlying cause as well as testing/suggesting the needed changes are greatly appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants