Use query understanding for RAG retrieval #8

Open
3 of 5 tasks
bdb-dd opened this issue Oct 16, 2023 · 2 comments
Labels
kind/feature-request New feature or request

Comments

@bdb-dd
Collaborator

bdb-dd commented Oct 16, 2023

Description

Updated 26.02.24:
Even after having successfully addressed the retrieval ranking issues we had earlier, there are still many opportunities for improving retrieval for specific kinds of queries. As an example, a user may wish to qualify their search by defining specific filters, such as "updated recently", "sorted by version number" or "only open issues".

The "query understanding" strategy calls for using LLMs to generate the retrieval query itself, based on knowledge of the underlying search engine and the data schemas involved, and potentially some function-calling extensions.
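A minimal sketch of what this could look like, not the actual implementation: the LLM is shown the index schema and asked to reply with a structured search request, which is validated before it reaches the search engine. The schema fields, the `RetrievalQuery` type, the `q`/`filter_by`/`sort_by` keys and the filter syntax are all assumptions made for illustration. The LLM call itself is left out, so the example only covers prompt construction and reply validation.

```python
import json
from dataclasses import dataclass

# Hypothetical index schema shown to the LLM; field names are illustrative only.
SCHEMA = {
    "title": "string",
    "body": "string",
    "state": "string (open|closed)",
    "updated_at": "int64 (unix timestamp)",
    "version": "string",
}

ALLOWED_KEYS = {"q", "filter_by", "sort_by"}


@dataclass
class RetrievalQuery:
    """Structured search request produced by the LLM (assumed shape)."""
    q: str
    filter_by: str = ""
    sort_by: str = ""


def build_prompt(user_query: str) -> str:
    """Build the instruction that asks the LLM to write the search request."""
    return (
        "Translate the user question into a search request.\n"
        f"Index schema: {json.dumps(SCHEMA)}\n"
        'Reply with JSON only, using the keys "q", "filter_by" and "sort_by".\n'
        f"User question: {user_query}"
    )


def parse_llm_reply(reply: str) -> RetrievalQuery:
    """Validate the LLM reply before passing it on to the search engine."""
    data = json.loads(reply)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    unknown = set(data) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    q = data.get("q")
    if not isinstance(q, str) or not q.strip():
        raise ValueError("missing search terms in 'q'")
    return RetrievalQuery(
        q=q,
        filter_by=str(data.get("filter_by", "")),
        sort_by=str(data.get("sort_by", "")),
    )


if __name__ == "__main__":
    print(build_prompt("only open issues about login, updated recently"))
    # Hand-written stand-in for an LLM reply; the filter syntax is made up.
    fake_reply = '{"q": "login", "filter_by": "state:=open", "sort_by": "updated_at:desc"}'
    print(parse_llm_reply(fake_reply))
```

Rejecting unknown keys and empty search terms keeps a malformed reply from silently turning into an unfiltered search.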

Evaluate


Additional Information

No response

bdb-dd added the kind/feature-request (New feature or request) label on Oct 16, 2023
bdb-dd changed the title from "Improve context stuffing for RAG" to "Use query understanding for RAG retrieval" on Oct 17, 2023
bdb-dd self-assigned this on Oct 23, 2023
@bdb-dd
Collaborator Author

bdb-dd commented Oct 23, 2023

First end to end test completed. Initial results look very promising.

Will likely require additional testing and content improvements to deal with issues related to certain topics.

Certain documents should probably be included in context regardless of search terms.

@bdb-dd
Collaborator Author

bdb-dd commented Oct 27, 2023

Sent an invitation to a broader group of people who can contribute a varied set of user queries. We are quickly finding examples where the first stage, extracting search terms, is not as selective as it could be. A large number of search terms currently results in a smaller result set, sometimes including documents that are highly ranked for no apparent reason.

Have also tested asking GPT-3.5 for feedback on which of the supplied context documents were relevant, with good results. One option would be to "pin" certain source documents so that they are always included in the RAG context. The context length has varied significantly from query to query, sometimes exceeding 16K, which is our current upper limit.
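A rough sketch of how pinning could be combined with the context limit, assuming the 16K limit is measured in tokens. The `Doc` type, `estimate_tokens` and `build_context` are hypothetical names, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer. Pinned documents are placed first, then ranked results are added by descending score until the budget is used up.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    score: float = 0.0  # retrieval score; unused for pinned documents


def estimate_tokens(text: str) -> int:
    """Crude estimate (~4 characters per token); replace with a real tokenizer."""
    return max(1, len(text) // 4)


def build_context(pinned, ranked, max_tokens=16_000):
    """Return (selected_docs, tokens_used) without exceeding max_tokens.

    Pinned documents are considered first, then ranked documents by
    descending score. A document that does not fit is skipped, so a
    smaller document further down the list can still be included.
    """
    candidates = list(pinned) + sorted(ranked, key=lambda d: d.score, reverse=True)
    selected, seen, used = [], set(), 0
    for doc in candidates:
        if doc.doc_id in seen:
            continue  # a pinned document may also show up in the search results
        cost = estimate_tokens(doc.text)
        if used + cost > max_tokens:
            continue
        selected.append(doc)
        seen.add(doc.doc_id)
        used += cost
    return selected, used


if __name__ == "__main__":
    pinned = [Doc("glossary", "x" * 40)]
    ranked = [Doc("long-guide", "y" * 400, score=0.9), Doc("faq", "z" * 20, score=0.5)]
    docs, used = build_context(pinned, ranked, max_tokens=20)
    print([d.doc_id for d in docs], used)
```

Skipping oversized documents instead of stopping at the first one that does not fit is a design choice; truncating them would be another option.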

@bdb-dd bdb-dd transferred this issue from another repository Feb 25, 2024