Feature/Add support for state-based metadata filter to Retriever Tool #3501


serhiy-matoffo
Contributor

Extended the Retriever Tool to support a metadata filter based on an object stored in agent state.

@HenryHengZJ
Contributor

Interesting! Do you have an example use case?

@serhiy-matoffo
Contributor Author

serhiy-matoffo commented Nov 11, 2024

Sure. We've found that our RAG flow often returns irrelevant documents. More precisely, the returned documents match some of the words the user typed but come from documents on a completely different subject. So we've tasked one of our LLM Nodes with figuring out the subject of the conversation and saving it to state (at that point we can form a metadata filter object). Further down the flow, we would like to take that subject and use it to narrow the query to the vector store.

Is there a way to implement this flow using existing functionality? I couldn't find one and had to add this property to a Retriever Tool.
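
To make the flow described above concrete, here is a minimal sketch of the "form a metadata filter object from state" step. The `AgentState` shape, the `subject` key, and the `buildSubjectFilter` helper are all hypothetical names used for illustration; they are not part of Flowise.

```typescript
// Hypothetical shape of the agent state; only `subject` matters here.
interface AgentState {
    subject?: string;
}

type SubjectFilter = Record<string, unknown>;

// Turn the subject an LLM node saved to state into a metadata filter object.
// Returns undefined when no subject was detected, so the caller can fall
// back to an unfiltered vector store query.
function buildSubjectFilter(state: AgentState): SubjectFilter | undefined {
    const subject = state.subject?.trim();
    if (!subject) return undefined;
    return { subject };
}
```

The assumption here is that documents were upserted with a matching `subject` metadata key; without that, the filter would exclude everything.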

@HenryHengZJ
Contributor

I was thinking about how to make it more generic.

How about this?
image

We can have an additional metadata filter that adds to the existing filter from the vector store, if there is one. Users can then filter using the existing flow variables, such as $flow.state, $flow.chatId, etc.
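
A rough sketch of what "adds to the existing filter" could look like, assuming a shallow merge in which the additional filter wins on key collisions. The `FlowContext` type and the `resolveFlowValue` and `mergeFilters` helpers are hypothetical and are not taken from the Flowise code base; the way `$flow.` references are written and resolved here is likewise an assumption.

```typescript
type FilterObject = Record<string, unknown>;

// Hypothetical runtime context; the keys mirror the $flow variables named above.
interface FlowContext {
    chatId: string;
    state: Record<string, unknown>;
}

// Resolve a string such as "$flow.state.subject" against the context.
// Anything that is not a "$flow." reference is returned unchanged.
function resolveFlowValue(value: unknown, flow: FlowContext): unknown {
    if (typeof value !== "string" || !value.startsWith("$flow.")) return value;
    const path = value.slice("$flow.".length).split(".");
    let current: unknown = flow;
    for (const key of path) {
        if (current === null || typeof current !== "object") return undefined;
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

// Shallow-merge the additional filter over the vector store's own filter.
function mergeFilters(
    existing: FilterObject | undefined,
    additional: FilterObject,
    flow: FlowContext
): FilterObject {
    const resolved: FilterObject = {};
    for (const [key, value] of Object.entries(additional)) {
        resolved[key] = resolveFlowValue(value, flow);
    }
    return { ...(existing ?? {}), ...resolved };
}
```

Because the resolver returns whatever it finds at the path, a reference that points at an object in state would yield that whole object rather than a string.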

@serhiy-matoffo
Contributor Author

This would not allow complex/nested filters, right?

@HenryHengZJ
Contributor

> This would not allow complex/nested filters, right?

Do you have an example of that? Under the hood, it's just simple JSON that is added on top of the existing filter.

@serhiy-matoffo
Contributor Author

> Do you have an example of that?

Here is an example from the Pinecone documentation: {"$and": [{"genre": "comedy"}, {"genre": "documentary"}]}

Would the functionality you suggested support this kind of syntax?
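
For reference, a small sketch of why a nested filter like this can survive a "simple JSON" merge: the `$and` clause is just another top-level key, so a shallow merge carries its nested array along untouched. The `parseFilter` helper and the `year` key in the pre-existing filter are hypothetical, and this only illustrates plain JSON handling; how a given vector store interprets the operators is up to that store.

```typescript
type NestedValue = string | number | boolean | null | NestedValue[] | NestedFilter;
type NestedFilter = { [key: string]: NestedValue };

// Parse filter text and make sure the top level is a JSON object.
function parseFilter(text: string): NestedFilter {
    const parsed: unknown = JSON.parse(text);
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
        throw new Error("metadata filter must be a JSON object");
    }
    return parsed as NestedFilter;
}

// The Pinecone-style example quoted above.
const pineconeStyle = parseFilter('{"$and": [{"genre": "comedy"}, {"genre": "documentary"}]}');

// Hypothetical filter already configured on the vector store.
const existingFilter: NestedFilter = { year: 2024 };

// Shallow merge: the nested $and array is copied over as-is.
const combined: NestedFilter = { ...existingFilter, ...pineconeStyle };
```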

@HenryHengZJ
Contributor

HenryHengZJ commented Nov 13, 2024

Yep, definitely. If that's okay, I can make the changes on this PR, and you can then try it out.

@serhiy-matoffo
Contributor Author

Sounds good!

@HenryHengZJ
Contributor

Updated. Please test it and see if it works, @serhiy-matoffo.

@serhiy-matoffo
Contributor Author

I can confirm that it works great, thanks!

Also, this trick allows using any dynamic object from the flow state:

image

I just prefer to form that object using a script instead of typing in all the properties through those boxes 😄

Contributor

@HenryHengZJ HenryHengZJ left a comment


merging

@HenryHengZJ HenryHengZJ merged commit 16ceed1 into FlowiseAI:main Nov 16, 2024
2 checks passed
@sirsimonson
Contributor

sirsimonson commented Nov 19, 2024

So excited to try this out. As of now, I'm trying to use the metadata filter like so for Postgres:
image
Somehow this does not work and returns an empty list. Is this a known issue, or am I just forming it wrong? I'm looking for docs about this filter spec but can't find any. @HenryHengZJ @serhiy-matoffo

EDIT:

image
According to https://js.langchain.com/docs/integrations/vectorstores/pgvector/, filtering in pgvector is only supported via the "in" or "arrayContains" operators in Node.js. It would be great to have the exhaustive filter functionality of the Python LangChain version of pgvector, which is documented here: https://python.langchain.com/docs/integrations/vectorstores/pgvector/

We still plan to adopt this feature and dynamically filter from agent state. Unfortunately, most of its power comes from integrating it with Pinecone.
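
For anyone else on Postgres, a sketch of filter objects restricted to the two operators mentioned above. The `PgFilter` types and both helpers are local, hypothetical definitions rather than LangChain exports, and the exact matching semantics of each operator should be checked against the linked pgvector docs.

```typescript
type PgScalar = string | number;

// Operator names taken from the comment above; these types are a local sketch.
type PgOperatorFilter = { in: PgScalar[] } | { arrayContains: PgScalar[] };
type PgFilter = Record<string, PgOperatorFilter>;

// Build a filter that uses the "in" operator on a metadata field.
function withIn(field: string, values: PgScalar[]): PgFilter {
    return { [field]: { in: values } };
}

// Build a filter that uses the "arrayContains" operator on a metadata field.
function withArrayContains(field: string, values: PgScalar[]): PgFilter {
    return { [field]: { arrayContains: values } };
}

// Example: a subject read from agent state wrapped in an "in" filter.
const subjectFromState = "billing"; // hypothetical state value
const pgFilter = withIn("subject", [subjectFromState]);
```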

@sirsimonson
Contributor

sirsimonson commented Nov 21, 2024

@HenryHengZJ Please advise. Also, the above filter mechanism (which had worked through "Test Retrieval Query") doesn't work with this new feature. Somehow the filter is not passed down to the Postgres query.
image

3 participants