chore: Add summary index builder script #1552

Merged: 2 commits into main from dustin/docs-index-script, Oct 4, 2023

anticorrelator (Contributor) commented on Oct 3, 2023

Adds the summary-embedding index builder to scripts.

This script persists a LlamaIndex VectorStoreIndex over our documentation. It chunks the docs loosely by markdown section, and uses the markdown section hierarchy and QuestionsAnsweredExtractor to generate metadata for each chunk. Embeddings are built on summaries of each chunk.
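
For readers who want a concrete picture of the section-based chunking step, here is a minimal sketch. It is not the script added in this PR and makes no LlamaIndex calls: it uses only the standard library, and the names `SectionChunk`, `chunk_markdown`, and the `section_path` metadata key are hypothetical. It assumes ATX-style headings (`#` through `######`) and records each chunk's position in the heading hierarchy as metadata.

```python
import re
from dataclasses import dataclass, field

# Assumption: docs use ATX headings ("# Title"); Setext headings are not handled.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Fenced code blocks are tracked so "# comment" lines inside them are not
# mistaken for headings.
FENCE_RE = re.compile(r"^\s*(`{3,}|~{3,})")


@dataclass
class SectionChunk:
    """Hypothetical container: one chunk of text plus its metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_markdown(markdown: str, source: str = "docs.md") -> list:
    """Split markdown into one chunk per section, tagged with its heading path."""
    chunks = []
    path = []      # stack of (heading level, heading title)
    buffer = []    # lines of the section currently being collected
    in_fence = False

    def flush():
        # Emit the buffered section body, if it is non-empty.
        body = "\n".join(buffer).strip()
        if body:
            chunks.append(SectionChunk(
                text=body,
                metadata={
                    "source": source,
                    "section_path": " > ".join(title for _, title in path),
                },
            ))
        buffer.clear()

    for line in markdown.splitlines():
        if FENCE_RE.match(line):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Pop headings at the same or a deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each resulting chunk carries a `section_path` such as `Guide > Install`, which could then be attached as metadata alongside extracted questions before the chunk summaries are embedded.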

closes: #1503

TODO: automate uploading the index to GCS

anticorrelator changed the title from "Add summary index builder script" to "chore: Add summary index builder script" on Oct 3, 2023
axiomofjoy (Contributor) commented:

Nice, might also be worth taking a look at some of the loaders on LlamaHub, e.g., https://llamahub.ai/l/file-markdown

anticorrelator merged commit 5b4f6ee into main on Oct 4, 2023 (5 checks passed)
anticorrelator deleted the dustin/docs-index-script branch on October 4, 2023 13:31
github-actions bot locked and limited conversation to collaborators on Oct 4, 2023
Successfully merging this pull request may close these issues.

[onboarding] Create an index of the phoenix docs using a document loader