IN operator fails in Prisma vector store #6082

Closed
5 tasks done
shan-mx opened this issue Jul 16, 2024 · 2 comments · Fixed by #6085
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@shan-mx
Contributor

shan-mx commented Jul 16, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Most of the code is copied from the documentation. The embedding model is changed to Jina, and the id of Document is changed to Int with autoincrement. The Postgres instance was created following the instructions in the documentation.

import type { Document } from "@prisma/client";

import { JinaEmbeddings } from "@langchain/community/embeddings/jina";
import { PrismaVectorStore } from "@langchain/community/vectorstores/prisma";
import { Prisma, PrismaClient } from "@prisma/client";

import { env } from "./lib/env.js";

const embeddings = new JinaEmbeddings({
  apiKey: env.JINA_API_KEY,
  model: "jina-embeddings-v2-base-en",
});

const db = new PrismaClient();

const vectorStore = PrismaVectorStore.withModel<Document>(db).create(
  embeddings,
  {
    prisma: Prisma,
    tableName: "Document",
    vectorColumnName: "vector",
    columns: {
      id: PrismaVectorStore.IdColumn,
      content: PrismaVectorStore.ContentColumn,
    },
  },
);

const texts = ["Hello world", "Bye bye", "What's this?"];

const docs = await db.$transaction(
  texts.map((content) => db.document.create({ data: { content } })),
);

await vectorStore.addModels(docs);

const resultOne = await vectorStore.similaritySearch("Hello world", 1, {
  id: {
    in: docs.map((doc) => doc.id),
  },
});

console.log(resultOne);

Prisma Schema:

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model Document {
  id      Int                    @id @default(autoincrement())
  content String
  vector  Unsupported("vector")?
}

Error Message and Stack Trace (if applicable)

Error: Invalid filter: IN operator requires an array of strings. Received: [
7,
8,
9
]

file:///Users/a/DevProjects/prisma-vector-test/node_modules/.pnpm/@langchain+community@0.2.19_ignore@5.3.1_openai@4.52.7/node_modules/@langchain/community/dist/vectorstores/prisma.js:246
throw new Error(`Invalid filter: IN operator requires an array of strings. Received: ${JSON.stringify(value, null, 2)}`);

Description

I'm trying to use the IN operator to filter results in a Prisma vector store query. The argument is correctly inferred as number[], but I get a runtime error saying that the IN operator requires an array of strings. I cannot change it to string[], as that would lead to a type error.
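
To illustrate the compile-time side of the problem, here is a minimal sketch. The types `ColumnFilter`, `ModelFilter`, and `DocumentRow` are hypothetical simplifications written for this example, not the library's or Prisma's actual definitions; they only assume that the `in` list is typed after the column it filters.

```typescript
// Hypothetical, simplified filter types (assumption: the `in` list must
// match the type of the column being filtered).
type ColumnFilter<T> = { equals?: T; in?: T[] };
type ModelFilter<M> = { [K in keyof M]?: ColumnFilter<M[K]> };

// Assumed model shape, mirroring the schema in this issue (id is Int).
interface DocumentRow {
  id: number;
  content: string;
}

const ids: number[] = [7, 8, 9];

// Type-checks: number[] matches the Int id column.
const numericFilter: ModelFilter<DocumentRow> = { id: { in: ids } };

// Does not type-check: string[] is not assignable to number[].
// @ts-expect-error illustrating the compile-time mismatch
const stringFilter: ModelFilter<DocumentRow> = { id: { in: ids.map(String) } };
```

Under these assumptions, the numeric filter is the only one the compiler accepts, which is why the runtime demand for strings cannot be satisfied without a cast.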

System Info

@langchain/community@0.2.19 | MIT | deps: 11 | versions: 89
Third-party integrations for LangChain.js
https://github.com/langchain-ai/langchainjs/tree/main/libs/langchain-community/

platform mac

Node v20.11.1

dosubot bot added the auto:bug label Jul 16, 2024
@shan-mx
Contributor Author

shan-mx commented Jul 16, 2024

The issue comes from this line:

!value.every((v) => typeof v === "string")

It rejects every non-string array when building the SQL query, but in practice the argument of IN can have various types. After removing this check, the problem is solved.
I will open a PR to fix this.
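
A minimal sketch of the behaviour described in this comment, for illustration only. The functions `validateInStrict` and `validateInRelaxed` are hypothetical names made up for this example and are not the code in prisma.ts; the relaxed variant simply assumes that dropping the string-only check is the intended change.

```typescript
// Hypothetical stand-ins for the IN-operator check; not the library's code.

// Mirrors the reported behaviour: any non-string element is rejected.
function validateInStrict(value: unknown): string[] {
  if (!Array.isArray(value) || !value.every((v) => typeof v === "string")) {
    throw new Error(
      `Invalid filter: IN operator requires an array of strings. Received: ${JSON.stringify(value, null, 2)}`
    );
  }
  return value as string[];
}

// Relaxed variant (assumption): only require that the value is an array.
function validateInRelaxed(value: unknown): unknown[] {
  if (!Array.isArray(value)) {
    throw new Error(
      `Invalid filter: IN operator requires an array. Received: ${JSON.stringify(value, null, 2)}`
    );
  }
  return value;
}

// The ids from the error message above.
const sampleIds = [7, 8, 9];

let strictError = "";
try {
  validateInStrict(sampleIds);
} catch (e) {
  strictError = (e as Error).message;
}

// The relaxed check accepts the same numeric ids.
const relaxedResult = validateInRelaxed(sampleIds);
```

With the strict check, the numeric ids trigger the same error message as in the report; with the relaxed check, they pass through unchanged.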

dosubot bot commented Jul 16, 2024

Hey there, @shan-mx! 👋 I'm here to help you with any bugs or questions you have while waiting for a human maintainer. Let's squash some bugs together!

The issue you're encountering with the IN operator in the Prisma vector store query is due to the operator requiring an array of strings, but the argument is inferred as number[]. This is because the code currently rejects all non-string arrays when building the SQL query.

To resolve this, you need to convert the number[] to string[] before using it in the IN operator. Here's an example of how you can do this:

const numberArray = docs.map((doc) => doc.id); // Example number array
const stringArray = numberArray.map(String); // Convert number[] to string[]

// Now use stringArray with the IN operator
const resultOne = await vectorStore.similaritySearch("Hello world", 1, {
  id: {
    in: stringArray,
  },
});

console.log(resultOne);

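Converting the number[] to string[] lets the filter pass the runtime check in the IN operator. Note that, as the issue description points out, passing string[] for an Int id column produces a TypeScript type error, so this is a workaround rather than a fix.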

Additionally, it seems that the issue stems from a specific line in the prisma.ts file that rejects non-string arrays. You mentioned that removing this line solves the problem, and you plan to open a PR to fix it. This approach should help address the root cause of the issue.

If you need further assistance, @jacoblee93 might be able to provide additional guidance.
