Doing RAG? Vector search is *not* enough #879
Labels
AI-Agents
Autonomous AI agents using LLMs
Algorithms
Sorting, Learning or Classifying. All algorithms go here.
MachineLearning
ML Models, Training and Inference
RAG
Retrieval Augmented Generation for LLMs
Doing RAG? Vector search is not enough
Snippet
"I'm concerned by the number of times I've heard, "oh, we can do RAG with retriever X, here's the vector search query." Yes, your retriever for a RAG flow should definitely support vector search, since that will let you find documents with similar semantics to a user's query, but vector search is not enough. Your retriever should support a full hybrid search, meaning that it can perform both a vector search and full text search, then merge and re-rank the results. That will allow your RAG flow to find both semantically similar concepts, but also find exact matches like proper names, IDs, and numbers."
Full Content
Hybrid search steps
Azure AI Search offers a full hybrid search with all those components:
The search team even researched all the options against a standard dataset, and wrote a blog post comparing the retrieval results for full text search only, vector search only, hybrid search only, and hybrid plus ranker. Unsurprisingly, they found that the best results came from using the full stack, and that's why it's the default configuration we use in the AI Search RAG starter app.
When is hybrid search needed?
To demonstrate the importance of going beyond vector search, I'll show some queries based off the sample documents in the AI Search RAG starter app. Those documents are from a fictional company and discuss internal policies like healthcare and benefits.
Let's start by searching "what plan costs $45.00?" with a pure vector search using an AI Search index:
The results for that query contain numbers and costs, like the string "The copayment for primary care visits is typically around $20, while specialist visits have a copayment of around $50.", but none of the results contain an exact cost of $45.00, what the user was looking for.
Now let's try that query with a pure full-text search:
The top result for that query contain a table of costs for the health insurance plans, with a row containing $45.00.
Of course, we don't want to be limited to full text queries, since many user queries would be better answered by vector search, so let's try this query with hybrid:
Once again, the top result is the table with the costs and exact string of $45.00. When the user asks that question in the context of the full RAG app, they get the answer they were hoping for.
Suggested labels
None
The text was updated successfully, but these errors were encountered: