-
Notifications
You must be signed in to change notification settings - Fork 7.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added RAG settings to settings.py, vector_store and chat_service to a… #1771
Conversation
…dd similarity_top_k and similarity_score
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added two alternatives to an issue I found. I'd really advice for the second option, even if it implies changing a little more of your PR. Otherwise, looking great!
@@ -135,6 +135,7 @@ def get_retriever( | |||
similarity_top_k: int = 2, | |||
) -> VectorIndexRetriever: | |||
# This way we support qdrant (using doc_ids) and the rest (using filters) | |||
similarity_top_k = self.settings.rag.similarity_top_k |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are effectively making the similarity_top_k function param useless, because you are overriding it. I'd suggest to respect it if set. Something like this should work:
def get_retriever(
self,
index: VectorStoreIndex,
context_filter: ContextFilter | None = None,
similarity_top_k: int | None = None,
) -> VectorIndexRetriever:
# This way we support qdrant (using doc_ids) and the rest (using filters)
similarity_top_k = similarity_top_k or self.settings.rag.similarity_top_k
There is an alternative which I like more because is more scalable as the project evolves than the current solution of impacting the "get_retriever" for the whole application:
- change
settings.rag.similarity_top_k
tosettings.context_chat_rag.similarity_top_k
- in chat_service.py, when
get_retriever
is called, pass thesimilarity_top_k
param getting it from that setting
That way the setting only applies to the contextual chat use case, and whenever we add more use cases that require a different amount of top_k retrieved, we can fine tune it in different settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the code, except i left the settings value as "rag" instead of context_chat_rag. I basically ran out of time, and need to switch gear. But I can change that settings name to context_chat_rag if you prefer that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is perfectly fine to call it RagSettings given the only "RAG" that is in PrivateGPT at the moment is the contextual chat
Since i was out of town and there were some rather large changes to the code, I am doing a new PR for cleanliness.
#1715 was the original.
In the settings.yaml file I added two new settings. Both of these are important to use together IMO.
similarity_top_k controls how many of the "top" results are returned by the RAG Pipeline. Since i am ingesting lots of information for our organization from many different areas, sometimes I might have 5, 6 or 7 relevant sources.
However the risk of increasing this value, is that you get a lot of junk that does not relate other than having a key word or something. So similarity_value comes into play here and is used in the chat_service.py file.
This scare requires the RAG to match a certain level of the document, or it is thrown out. In my test cases 0.40 works well to weed out some of the lower end junk that comes out when you increase similarity_top_k to a higher value.