Simplify zero-shot topic modeling #2060

Merged 16 commits on Jul 1, 2024

Commits on May 28, 2024

  1. Zeroshot fixes (#2)

    - zero-shot topic modeling is now only the equivalent of a clustering step
      - removed implementation where this functionality is done through merging two models
      - all documents are used at once when calculating representations
      - probability comes from cosine similarity when zeroshot topics are used
    - validate `nr_topics` with respect to how many zero-shot topics matched
    - track `self._outliers` and `self.topic_labels_` using `@property`, as they are derivatives of other attributes
    - validate existence of outliers before outlier reduction
    ianrandman committed May 28, 2024
    b95de5d
  2. 96b1d6f
  3. 69fa29c
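
The first commit above says that zero-shot assignment is now a single clustering-like step and that probabilities come from cosine similarity. A minimal sketch of that idea follows. It is not BERTopic's implementation: the function names `cosine` and `assign_zeroshot`, the `min_similarity` parameter, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_zeroshot(doc_embeddings, topic_embeddings, min_similarity=0.7):
    """Hypothetical helper: match each document to its closest zero-shot topic.

    Returns (topics, probs). A document whose best similarity is below
    `min_similarity` gets topic -1, meaning it is left for regular clustering.
    The reported probability is the best cosine similarity itself.
    """
    topics, probs = [], []
    for doc in doc_embeddings:
        sims = [cosine(doc, topic) for topic in topic_embeddings]
        best = max(range(len(sims)), key=sims.__getitem__)
        topics.append(best if sims[best] >= min_similarity else -1)
        probs.append(sims[best])
    return topics, probs


# Toy data: two zero-shot topics and three documents (assumed values).
topic_embeddings = [[1.0, 0.0], [0.0, 1.0]]
doc_embeddings = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
topics, probs = assign_zeroshot(doc_embeddings, topic_embeddings, min_similarity=0.9)
```

With these toy values, the third document sits equally close to both topics and falls below the threshold, so it stays unassigned while the first two match a zero-shot topic directly.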

Commits on Jun 14, 2024

  1. fc12e03
  2. 02134d6

Commits on Jun 18, 2024

  1. cfd75a4
  2. ecd0224
  3. When merging zero-shot, keep single zero-shot label if meets threshold with new topic embedding (#2)

    ianrandman committed Jun 18, 2024
    19af331
  4. 9155a87
  5. Fix typo (#2)

    ianrandman committed Jun 18, 2024
    2a7b194
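
The 19af331 commit title above describes keeping a single zero-shot label when topics are merged, provided it meets the threshold against the new topic embedding. The sketch below is one possible reading of that title, not the code from the pull request: among the zero-shot labels that took part in a merge, keep the best-fitting one only if its cosine similarity with the merged topic's embedding reaches the threshold. The helper name `label_after_merge`, its parameters, and the example labels are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def label_after_merge(merged_embedding, candidate_labels, candidate_embeddings,
                      min_similarity=0.7):
    """Hypothetical helper: pick at most one zero-shot label for a merged topic.

    `candidate_labels` and `candidate_embeddings` cover only the zero-shot
    topics that were merged. Returns the kept label, or None if no candidate
    is similar enough to the merged topic's new embedding.
    """
    if not candidate_labels:
        return None
    sims = [cosine(merged_embedding, emb) for emb in candidate_embeddings]
    best = max(range(len(sims)), key=sims.__getitem__)
    return candidate_labels[best] if sims[best] >= min_similarity else None


# Assumed toy values: the merged topic still points mostly at "sports".
kept = label_after_merge([1.0, 0.1], ["sports", "politics"], [[1.0, 0.0], [0.0, 1.0]])
# A merged topic halfway between both labels fails a strict threshold.
dropped = label_after_merge([1.0, 1.0], ["sports", "politics"],
                            [[1.0, 0.0], [0.0, 1.0]], min_similarity=0.9)
```

Returning `None` in this sketch stands for falling back to a regular, non-zero-shot label for the merged topic.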

Commits on Jun 21, 2024

  1. 0f16c3d

Commits on Jun 23, 2024

  1. Merge branch 'master' into issue2-simplify-zero-shot

    # Conflicts:
    #	bertopic/_bertopic.py
    ianrandman committed Jun 23, 2024
    cecb683
  2. Format using ruff (#2)

    ianrandman committed Jun 23, 2024
    7766277
  3. Make self._topic_id_to_zeroshot_topic_idx private, add comments/docstrings, lower threshold zeroshot test, fix outliers for probabilities during zeroshot (#2)

    ianrandman committed Jun 23, 2024
    fbc574b
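
Several commits in this pull request touch derived state: the first one tracks `self._outliers` and `self.topic_labels_` using `@property`, and the commit above fixes outlier handling for zero-shot probabilities. The toy class below sketches why computing such values on access avoids stale state. The class `TopicState`, its `topic_words_` attribute, and the label format are assumptions for illustration and may not match the library's internals.

```python
class TopicState:
    """Toy stand-in for a topic model that stores only primary state."""

    def __init__(self, topic_sizes, topic_words):
        self.topic_sizes_ = topic_sizes  # topic id -> number of documents
        self.topic_words_ = topic_words  # topic id -> top words (hypothetical)

    @property
    def _outliers(self):
        # 1 if the outlier topic (-1) exists, else 0; handy as an index offset.
        return 1 if -1 in self.topic_sizes_ else 0

    @property
    def topic_labels_(self):
        # Labels are rebuilt from the current words, so they cannot go stale.
        return {
            topic: f"{topic}_" + "_".join(words)
            for topic, words in self.topic_words_.items()
        }

    def reduce_outliers(self):
        # Validate that outliers exist before trying to reduce them.
        if not self._outliers:
            raise ValueError("No outliers to reduce.")
        # Toy reduction: move all outlier documents into topic 0.
        self.topic_sizes_[0] += self.topic_sizes_.pop(-1)
        self.topic_words_.pop(-1, None)


state = TopicState({-1: 4, 0: 10}, {-1: ["the", "and"], 0: ["goal", "team"]})
had_outliers = state._outliers
state.reduce_outliers()
```

After the reduction, `state._outliers` and `state.topic_labels_` reflect the new state without any manual bookkeeping, and a second call to `reduce_outliers()` raises because no outliers remain.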

Commits on Jun 26, 2024

  1. Ruff lint fixes (#2)

    ianrandman committed Jun 26, 2024
    e812366

Commits on Jun 27, 2024

  1. e093b5b