Feature request: Vector Storage does not allow specifying document id indexing logic #19

Open
HQarroum opened this issue Feb 9, 2024 · 0 comments
Assignees
Labels
enhancement New feature or request existing-middleware A label associated with an existing middleware. triage

Comments

Contributor

HQarroum commented Feb 9, 2024

Use case

Today, the vector storage connector uses the document URL, or the chunk identifier if the document is a chunk, as the document identifier it provides to OpenSearch when indexing the document. This is a problem for documents that change often, as it can lead to duplicated copies of modified chunks in the OpenSearch storage.

Solution/User Experience

Provide a way for end-users to define how the vector storage connector indexes documents (e.g. append-only, or removing a document's previous chunks before insertion).
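
To make the two behaviours concrete, here is a minimal sketch of what such an option could look like. It is not the connector's API: `IndexingStrategy`, `Chunk`, `InMemoryIndex` and `countAfterUpdate` are hypothetical names, the index is a plain in-memory object standing in for OpenSearch, and it assumes that a chunk's identifier changes when its content changes.

```typescript
// Hypothetical names throughout; nothing here comes from Lakechain or OpenSearch.
type IndexingStrategy = "append-only" | "replace-previous";

interface Chunk {
  sourceUrl: string; // URL of the document the chunk was cut from
  chunkId: string;   // chunk identifier (assumed to change with the content)
  text: string;
}

// In-memory stand-in for a vector index, keyed by document identifier.
class InMemoryIndex {
  private docs: Record<string, Chunk> = {};

  index(chunks: Chunk[], strategy: IndexingStrategy): void {
    if (strategy === "replace-previous") {
      // Drop every chunk previously stored for the same source documents.
      const urls = chunks.map((c) => c.sourceUrl);
      for (const id of Object.keys(this.docs)) {
        if (urls.indexOf(this.docs[id].sourceUrl) !== -1) {
          delete this.docs[id];
        }
      }
    }
    // Insert (or overwrite) each chunk under its own identifier.
    for (const chunk of chunks) {
      this.docs[chunk.chunkId] = chunk;
    }
  }

  chunksFor(sourceUrl: string): Chunk[] {
    return Object.keys(this.docs)
      .map((id) => this.docs[id])
      .filter((c) => c.sourceUrl === sourceUrl);
  }
}

// Index a document, then index a revision in which the second chunk changed.
function countAfterUpdate(strategy: IndexingStrategy): number {
  const url = "s3://bucket/report.md"; // illustrative URL
  const index = new InMemoryIndex();
  index.index(
    [
      { sourceUrl: url, chunkId: "a1", text: "intro" },
      { sourceUrl: url, chunkId: "a2", text: "old body" },
    ],
    strategy
  );
  index.index(
    [
      { sourceUrl: url, chunkId: "a1", text: "intro" },
      { sourceUrl: url, chunkId: "a3", text: "new body" },
    ],
    strategy
  );
  return index.chunksFor(url).length;
}
```

Under these assumptions, `"append-only"` leaves the stale `a2` chunk in the index next to `a3`, which is the duplication described in the use case, while `"replace-previous"` keeps only the chunks of the latest revision. Against a real OpenSearch index, the removal step would presumably be a delete-by-query on a source URL field, which in turn assumes that field is stored with each chunk.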

Alternative solutions

No response

@HQarroum HQarroum added enhancement New feature or request triage existing-middleware A label associated with an existing middleware. labels Feb 9, 2024
@HQarroum HQarroum self-assigned this Feb 9, 2024
@HQarroum HQarroum moved this to In review in Project Lakechain Feb 9, 2024
Projects
Status: In review
Development

No branches or pull requests

1 participant