
Added RAG settings to settings.py, vector_store and chat_service to a… #1771

Merged: 3 commits into zylon-ai:main on Mar 20, 2024

Conversation

icsy7867 (Contributor):

Since I was out of town and there were some rather large changes to the code, I am opening a new PR for cleanliness.

#1715 was the original.

In the settings.yaml file I added two new settings. Both of these are important to use together, IMO.

rag:
  similarity_top_k: 2
  # This value controls how many "top" documents the RAG returns to use in the context.
  #similarity_value: 0.45
  # This value is disabled by default. If you enable this setting, the RAG will only use documents that meet a minimum similarity score.

similarity_top_k controls how many of the "top" results are returned by the RAG pipeline. Since I am ingesting lots of information for our organization from many different areas, I might sometimes have 5, 6, or 7 relevant sources.

However, the risk of increasing this value is that you get a lot of junk that is unrelated apart from sharing a keyword or something similar. That is where similarity_value comes into play; it is used in the chat_service.py file.

This score requires the RAG to match a document above a certain level, or the document is thrown out. In my test cases, 0.40 works well to weed out some of the lower-end junk that shows up when you increase similarity_top_k to a higher value.
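
For reference, here is a minimal sketch of how the similarity_value cutoff could be applied, assuming LlamaIndex's SimilarityPostprocessor is used to drop low-scoring nodes (the import path and the settings attribute names are assumptions based on the settings.yaml keys above, not the merged code):

# Hypothetical wiring in chat_service.py: only apply the cutoff when
# similarity_value is set, so the default behavior stays unchanged.
from llama_index.core.postprocessor import SimilarityPostprocessor

node_postprocessors = []
if settings.rag.similarity_value is not None:
    # Drop any retrieved node whose similarity score is below the threshold.
    node_postprocessors.append(
        SimilarityPostprocessor(similarity_cutoff=settings.rag.similarity_value)
    )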


@imartinez (Collaborator) left a comment:


I've suggested two alternatives for an issue I found. I'd really advise going with the second option, even if it implies changing a little more of your PR. Otherwise, looking great!

@@ -135,6 +135,7 @@ def get_retriever(
         similarity_top_k: int = 2,
     ) -> VectorIndexRetriever:
         # This way we support qdrant (using doc_ids) and the rest (using filters)
+        similarity_top_k = self.settings.rag.similarity_top_k
@imartinez (Collaborator) commented on this line:

You are effectively making the similarity_top_k function param useless, because you are overriding it. I'd suggest respecting it if set. Something like this should work:

def get_retriever(
        self,
        index: VectorStoreIndex,
        context_filter: ContextFilter | None = None,
        similarity_top_k: int | None = None,
    ) -> VectorIndexRetriever:
        # This way we support qdrant (using doc_ids) and the rest (using filters)
        similarity_top_k = similarity_top_k or self.settings.rag.similarity_top_k

There is an alternative that I like more, because it is more scalable as the project evolves than the current solution of impacting get_retriever for the whole application:

  • change settings.rag.similarity_top_k to settings.context_chat_rag.similarity_top_k
  • in chat_service.py, when get_retriever is called, pass the similarity_top_k param, taking it from that setting

That way the setting only applies to the contextual chat use case, and whenever we add more use cases that require a different number of top_k retrieved results, we can fine-tune them with separate settings.
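
As a rough sketch of that second option, the call in chat_service.py could look something like this (the attribute names vector_store_component, index, and context_filter are assumptions meant to mirror the existing service, and context_chat_rag is the proposed setting name):

# Hypothetical call site in chat_service.py: the top_k setting is passed
# explicitly, so get_retriever keeps its default for every other caller.
vector_index_retriever = self.vector_store_component.get_retriever(
    index=self.index,
    context_filter=context_filter,
    similarity_top_k=self.settings.context_chat_rag.similarity_top_k,
)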

icsy7867 (Contributor, Author):

I have changed the code, except I left the settings value as "rag" instead of context_chat_rag. I basically ran out of time and need to switch gears, but I can change that setting name to context_chat_rag if you prefer.

@imartinez (Collaborator) left a comment:

I think it is perfectly fine to call it RagSettings, given that the only "RAG" in PrivateGPT at the moment is the contextual chat.
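
For illustration, a RagSettings model in settings.py might look roughly like the following, assuming the project's pydantic-style settings models (the defaults and comments follow the settings.yaml snippet above; everything else is an assumption):

from pydantic import BaseModel, Field

class RagSettings(BaseModel):
    # How many "top" documents the RAG returns to use in the context.
    similarity_top_k: int = Field(default=2)
    # Minimum similarity score a retrieved document must meet; disabled when unset.
    similarity_value: float | None = Field(default=None)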

@imartinez merged commit 087cb0b into zylon-ai:main on Mar 20, 2024. 6 checks passed.