Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doing RAG? Vector search is *not* enough #879

Open
1 task
ShellLM opened this issue Aug 11, 2024 · 1 comment
Open
1 task

Doing RAG? Vector search is *not* enough #879

ShellLM opened this issue Aug 11, 2024 · 1 comment
Labels
AI-Agents Autonomous AI agents using LLMs Algorithms Sorting, Learning or Classifying. All algorithms go here. MachineLearning ML Models, Training and Inference RAG Retrieval Augmented Generation for LLMs

Comments

@ShellLM
Copy link
Collaborator

ShellLM commented Aug 11, 2024

Doing RAG? Vector search is not enough

Snippet

"I'm concerned by the number of times I've heard, "oh, we can do RAG with retriever X, here's the vector search query." Yes, your retriever for a RAG flow should definitely support vector search, since that will let you find documents with similar semantics to a user's query, but vector search is not enough. Your retriever should support a full hybrid search, meaning that it can perform both a vector search and full text search, then merge and re-rank the results. That will allow your RAG flow to find both semantically similar concepts, but also find exact matches like proper names, IDs, and numbers."

Full Content

Hybrid search steps

Azure AI Search offers a full hybrid search with all those components:

  1. It performs a vector search using a distance metric (typically cosine or dot product).
  2. It performs a full-text search using the BM25 scoring algorithm.
  3. It merges the results using Reciprocal Rank Fusion algorithm.
  4. It re-ranks the results using semantic ranker, a machine learning model used by Bing, that compares each result to the original usery query and assigns a score from 0-4.

The search team even researched all the options against a standard dataset, and wrote a blog post comparing the retrieval results for full text search only, vector search only, hybrid search only, and hybrid plus ranker. Unsurprisingly, they found that the best results came from using the full stack, and that's why it's the default configuration we use in the AI Search RAG starter app.

When is hybrid search needed?

To demonstrate the importance of going beyond vector search, I'll show some queries based off the sample documents in the AI Search RAG starter app. Those documents are from a fictional company and discuss internal policies like healthcare and benefits.

Let's start by searching "what plan costs $45.00?" with a pure vector search using an AI Search index:

search_query = "what plan costs $45.00"
search_vector = get_embedding(search_query)
r = search_client.search(None, top=3, vector_queries=[
  VectorizedQuery(search_vector, k_nearest_neighbors=50, fields="embedding")])

The results for that query contain numbers and costs, like the string "The copayment for primary care visits is typically around $20, while specialist visits have a copayment of around $50.", but none of the results contain an exact cost of $45.00, what the user was looking for.

Now let's try that query with a pure full-text search:

r = search_client.search(search_query, top=3)

The top result for that query contain a table of costs for the health insurance plans, with a row containing $45.00.

Of course, we don't want to be limited to full text queries, since many user queries would be better answered by vector search, so let's try this query with hybrid:

r = search_client.search(search_query, top=15, vector_queries=[
  VectorizedQuery(search_vector, k_nearest_neighbors=10, fields="embedding")])

Once again, the top result is the table with the costs and exact string of $45.00. When the user asks that question in the context of the full RAG app, they get the answer they were hoping for.

Suggested labels

None

@ShellLM ShellLM added AI-Agents Autonomous AI agents using LLMs Algorithms Sorting, Learning or Classifying. All algorithms go here. MachineLearning ML Models, Training and Inference RAG Retrieval Augmented Generation for LLMs labels Aug 11, 2024
@ShellLM
Copy link
Collaborator Author

ShellLM commented Aug 11, 2024

Related content

#170 similarity score: 0.87
#386 similarity score: 0.87
#85 similarity score: 0.86
#739 similarity score: 0.86
#315 similarity score: 0.86
#778 similarity score: 0.86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AI-Agents Autonomous AI agents using LLMs Algorithms Sorting, Learning or Classifying. All algorithms go here. MachineLearning ML Models, Training and Inference RAG Retrieval Augmented Generation for LLMs
Projects
None yet
Development

No branches or pull requests

1 participant