[langchain-sqlserver] Error when inserting a lot of documents #11

Open
yorek opened this issue Dec 2, 2024 · 5 comments
Labels: P0 (P0 Priority items), sql-server (Issues related to langchain-sqlserver)

Comments

@yorek
Collaborator

yorek commented Dec 2, 2024

When adding a lot of documents to the vectorstore using the from_documents function, I get the following error:

2024-12-02 11:02:35,662 - ERROR - Add text failed:
 ('07002', '[07002] [Microsoft][ODBC Driver 18 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')

Traceback (most recent call last):
  File "W:\_git\_owned\azure-sql-langchain\samples\agentic-rag.py", line 47, in <module>
    vector_store = SQLServer_VectorStore.from_documents(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "W:\_git\_owned\azure-sql-langchain\.venv\Lib\site-packages\langchain_sqlserver\vectorstores.py", line 578, in from_documents
    store.add_texts(texts, metadatas, ids, **kwargs)
  File "W:\_git\_owned\azure-sql-langchain\.venv\Lib\site-packages\langchain_sqlserver\vectorstores.py", line 836, in add_texts
    return self._insert_embeddings(texts, embedded_texts, metadatas, ids)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "W:\_git\_owned\azure-sql-langchain\.venv\Lib\site-packages\langchain_sqlserver\vectorstores.py", line 1239, in _insert_embeddings     
    raise Exception(e.__cause__) from None
Exception: ('07002', '[07002] [Microsoft][ODBC Driver 18 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')

It seems that inserting fewer than 500 documents works fine. With more, it just returns the error above.

@yorek yorek changed the title Error when inserting a lot of documents [SQL Server] Error when inserting a lot of documents Dec 2, 2024
@yorek yorek changed the title [SQL Server] Error when inserting a lot of documents [langchain-sqlserver] Error when inserting a lot of documents Dec 2, 2024
@anushakolan
Collaborator

From the offline discussion, the plan is to use batching to insert the documents into the database.

  1. batch_size will be an optional parameter for all the functions that insert documents or data.
  2. The default batch_size will be 100.
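
The batching plan above could be sketched roughly as follows. This is only an illustration of the idea, not the langchain-sqlserver implementation: `iter_batches` is a hypothetical helper name, and the only details taken from this thread are the `batch_size` parameter and its default of 100.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 526 documents is the count that produced the COUNT field error in this issue.
docs = [f"doc-{i}" for i in range(526)]
batches = list(iter_batches(docs))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```

Each batch would then be sent as its own INSERT statement, so the number of bound parameters per statement depends on batch_size rather than on the total number of documents.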

@anushakolan anushakolan self-assigned this Jan 27, 2025
@yorek yorek added the bug (Something isn't working), sql-server (Issues related to langchain-sqlserver), and P0 (P0 Priority items) labels and removed the bug (Something isn't working) label Feb 6, 2025
@yorek yorek added this to the langchain-sqlserver-0.1.2 milestone Feb 7, 2025
@anushakolan
Collaborator

anushakolan commented Feb 13, 2025

Inserting 526 documents gives this error message:

ERROR root:vectorstores.py:1308 Add text failed: ('07002', '[07002] [Microsoft][ODBC Driver 17 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')


@anushakolan
Collaborator

But inserting 525 documents gives a different error message:

ERROR root:vectorstores.py:1308 Add text failed: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request. (8003) (SQLExecDirectW)')


@anushakolan
Collaborator

Inserting 524 documents works fine.


@anushakolan
Collaborator

anushakolan commented Feb 13, 2025

Image

The number of parameters is calculated as 5 * (No_of_documents) + 1. As shown in the image above, we get the following parameters for every document:

id: None
custom_id_m0: 1
content_metadata_m0: {'id': 1, 'title': 'Document 1'}
content_m0: This is the content of the page.
embeddingvalues_1: [1.100672896551693, 0.6787781106600161, 0.2603396374499168]

And one additional parameter for embedding_length.

So for 5 documents in the above case, we get 5 * 5 + 1 = 26 parameters.
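
The arithmetic above can be checked against the 2100-parameter limit quoted in the SQL Server error message with a small sketch. The function names here are hypothetical and only illustrate the formula from this comment. One observation, offered as an inference rather than something confirmed in this thread: with 5 parameters per document the limit would already be reached at 420 documents, whereas the 524/525 boundary reported earlier matches 4 bound parameters per document plus one (possibly because `id: None` is not sent as a parameter).

```python
MAX_PARAMETERS = 2100  # limit quoted in the SQL Server error message in this issue


def parameter_count(num_documents: int, params_per_document: int = 5, extra: int = 1) -> int:
    """Parameters for one INSERT: params_per_document * num_documents + extra.

    The defaults follow the formula given in this comment; `extra` is embedding_length.
    """
    return params_per_document * num_documents + extra


def max_documents_per_statement(
    params_per_document: int = 5, extra: int = 1, limit: int = MAX_PARAMETERS
) -> int:
    """Largest document count whose parameter count stays within `limit`."""
    return (limit - extra) // params_per_document


print(parameter_count(5))                 # formula from the comment, 5 documents
print(max_documents_per_statement(5))     # under the 5-per-document formula
print(max_documents_per_statement(4))     # assumption: only 4 bound parameters per document
print(parameter_count(525, params_per_document=4))
```

Under either count, the planned default batch_size of 100 stays well below the limit: 5 * 100 + 1 = 501 parameters per statement.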
