Avoid Auto-Rebuild of LSI Indexes #10

Open
tra38 opened this issue Jul 1, 2018 · 0 comments

tra38 commented Jul 1, 2018

ClassifierReborn::Summarizer.perform_lsi uses a neat trick to avoid LSI's default behavior of auto-rebuilding the index every time new content is added: it creates the index with auto_rebuild: false, adds all the chunks, and then calls build_index once.

def perform_lsi(chunks, count, separator)
  lsi = ClassifierReborn::LSI.new auto_rebuild: false
  chunks.each { |chunk| lsi << chunk unless chunk.strip.empty? || chunk.strip.split.size == 1 }
  lsi.build_index
  summaries = lsi.highest_relative_content count
  summaries.reject { |chunk| !summaries.include? chunk }.map(&:strip).join(separator)
end

Building the index once after all content has been added, rather than after each addition, could be useful behavior to have.
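
To illustrate why deferring the rebuild matters, here is a minimal sketch using a hypothetical `ToyIndex` class. It is a stand-in written for this example, not the ClassifierReborn API; it only counts how often the (normally expensive) rebuild step runs under each strategy.

```ruby
# Hypothetical stand-in for an LSI-style index; NOT the ClassifierReborn API.
class ToyIndex
  attr_reader :rebuild_count

  def initialize(auto_rebuild: true)
    @auto_rebuild = auto_rebuild
    @items = []
    @built_size = 0
    @rebuild_count = 0
  end

  # Add an item; rebuild immediately only when auto_rebuild is enabled.
  def <<(item)
    @items << item
    build_index if @auto_rebuild
    self
  end

  # Stand-in for the expensive rebuild step (assumed costly in a real index).
  def build_index
    @built_size = @items.size
    @rebuild_count += 1
  end

  # True when items were added since the last rebuild.
  def needs_rebuild?
    @built_size != @items.size
  end
end

docs = ["first document text", "second document text", "third document text"]

# Default behavior: one rebuild per added item.
eager = ToyIndex.new
docs.each { |doc| eager << doc }

# Deferred behavior: add everything, then rebuild once.
deferred = ToyIndex.new(auto_rebuild: false)
docs.each { |doc| deferred << doc }
deferred.build_index

puts "eager rebuilds: #{eager.rebuild_count}"
puts "deferred rebuilds: #{deferred.rebuild_count}"
```

With three documents, the eager index rebuilds three times and the deferred one rebuilds once, so the saving grows with the number of items added in a batch.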
