This repository has been archived by the owner on Oct 30, 2024. It is now read-only.
feat: ingestion - include metadata from .knowledge.json on dir level #124
Ref #118
When ingesting a file or directory (recursively or not), we now check whether a `.knowledge.json` file is present in the directory. Its `metadata` entry holds key/value pairs, which are added as metadata to the documents in the vector store. `.knowledge.json` files in nested directories are merged (with override) with the metadata files of parent directories.

Notes
- The file is named `.knowledge.json` instead of `.metadata.json` because I felt the latter could be too "common" and we'd run into conflicts. By default, we include hidden files in the ingestion process, so `.knowledge.json` is not explicitly being ignored.
- The key/value pairs live under a `metadata` entry so we can add additional fields for new features in the future, e.g. directory content descriptions, which can be merged with dataset metadata for routing retrieval.
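
To illustrate the merge behaviour described above, here is a minimal Python sketch. It is not the code from this PR: the function names `load_dir_metadata` and `merged_metadata`, the example keys, and the choice that a nested file overrides its parents key by key are all assumptions made for illustration.

```python
import json
from pathlib import Path

METADATA_FILENAME = ".knowledge.json"

# Assumed file shape (keys are hypothetical):
#   {"metadata": {"team": "platform", "visibility": "internal"}}


def load_dir_metadata(directory: Path) -> dict:
    """Return the `metadata` entry of a directory's .knowledge.json, or {} if absent."""
    path = directory / METADATA_FILENAME
    if not path.is_file():
        return {}
    with path.open(encoding="utf-8") as f:
        return json.load(f).get("metadata", {})


def merged_metadata(root: Path, directory: Path) -> dict:
    """Merge metadata from `root` down to `directory`.

    Assumption: files deeper in the tree override keys set by their parents.
    Raises ValueError if `directory` is not located under `root`.
    """
    relative = directory.relative_to(root)
    merged = dict(load_dir_metadata(root))
    current = root
    for part in relative.parts:
        current = current / part
        merged.update(load_dir_metadata(current))
    return merged
```

With a root file that sets `team` and `visibility` and a nested file that sets only `visibility`, documents in the nested directory would keep the parent's `team` and take the nested `visibility`. A flat `dict.update` is the simplest override rule; a deep merge would be needed if metadata values were themselves nested objects.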