In RAG systems, can we apply metadata filters before the similarity calculation to restrict the dense vector space of embeddings, so that the retrieved results are always relevant from a metadata standpoint? Is there an approach that fits this architecture?
Replies: 1 comment
Yes, you can apply metadata filters; you just have to determine which metadata is appropriate to filter by. For example, in the AI Search RAG solution, we apply filters for selected categories and filters for data access control: https://github.com/Azure-Samples/azure-search-openai-demo/blob/a8b1202045294052bf86bb7e523d25ef270c0d8c/app/backend/approaches/approach.py#L120
The category filter is determined by a UI field, and the data access control filter is based on the logged-in user's user ID and Entra groups.
We then pass those filters along with our search query, and AI Search applies the filters before doing the hybrid search.
You could also determine filters by looking at the user's query and extracting filter values from it. I do that in our RAG Postgres solution, by using function calling to have the LLM suggest SQL column filters based on the user's query.
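To make the filter-then-search order concrete, here is a minimal in-memory sketch in plain Python. It is not the code from the linked repository and does not call AI Search. The `Doc` record, its fields, and `filtered_search` are hypothetical names chosen for illustration. It only shows the general idea: metadata (a category and an access-control group check) narrows the candidate set first, and similarity is computed over what remains.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical document record: an embedding plus filterable metadata.
    id: str
    embedding: list[float]
    category: str
    allowed_groups: frozenset[str]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(docs, query_embedding, category, user_groups, top=3):
    # Step 1: metadata pre-filter. Keep only documents in the selected
    # category that share at least one group with the current user.
    candidates = [
        d for d in docs
        if d.category == category and d.allowed_groups & user_groups
    ]
    # Step 2: similarity is computed over the filtered candidates only.
    ranked = sorted(
        candidates,
        key=lambda d: cosine(d.embedding, query_embedding),
        reverse=True,
    )
    return [d.id for d in ranked[:top]]


docs = [
    Doc("a", [1.0, 0.0], "benefits", frozenset({"hr"})),
    Doc("b", [0.9, 0.1], "policies", frozenset({"hr"})),
    Doc("c", [0.0, 1.0], "benefits", frozenset({"hr", "eng"})),
    Doc("d", [1.0, 0.1], "benefits", frozenset({"finance"})),
]
results = filtered_search(docs, [1.0, 0.0], "benefits", {"hr"})
print(results)
```

Documents "b" and "d" never reach the similarity step, even though their embeddings are close to the query, because they fail the category or group filter. A real search service would apply the same idea inside its index rather than in application code.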
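The query-derived filter idea can be sketched roughly as follows. This is an assumption-laden illustration, not the implementation in the RAG Postgres solution. The tool name `search_database`, the `price_max` and `brand` arguments, the column names, and the `%s` placeholder style are all hypothetical, and the model response is simulated with a hard-coded JSON string rather than a real API call. The point is that the LLM fills in structured arguments, and the application maps them to parameterized SQL through an allow-list so that model output never becomes raw SQL.

```python
import json

# Hypothetical tool definition in the JSON-schema style used by
# function-calling chat APIs; the LLM would be asked to fill in these arguments.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_database",
        "description": "Search the items table with optional column filters.",
        "parameters": {
            "type": "object",
            "properties": {
                "search_query": {"type": "string"},
                "price_max": {"type": "number"},
                "brand": {"type": "string"},
            },
            "required": ["search_query"],
        },
    },
}

# Allow-list mapping argument names to parameterized SQL fragments
# (hypothetical column names).
FILTER_SQL = {
    "price_max": "price <= %s",
    "brand": "brand = %s",
}


def build_where(tool_arguments: str):
    """Turn the JSON arguments of a tool call into (query text, WHERE clause, params)."""
    args = json.loads(tool_arguments)
    clauses, params = [], []
    for name, fragment in FILTER_SQL.items():
        if args.get(name) is not None:
            clauses.append(fragment)
            params.append(args[name])
    where = " AND ".join(clauses) if clauses else "TRUE"
    return args["search_query"], where, params


# Simulated tool-call arguments, standing in for a real model response.
simulated = '{"search_query": "waterproof hiking boots", "price_max": 100, "brand": "AnyBrand"}'
query_text, where_clause, params = build_where(simulated)
print(where_clause, params)
```

The resulting WHERE clause and parameters would then be combined with the vector or hybrid search over the rows that pass the filter, using `query_text` as the text to embed.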