Error creating Supabase vector store when running self-query example code #10447

Closed
2 of 14 tasks
tleyden opened this issue Sep 11, 2023 · 6 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: vector store Related to vector store module

Comments

@tleyden

tleyden commented Sep 11, 2023

System Info

Langchain: 0.0.285
Platform: OSX Ventura (apple silicon)
Python version: 3.11

Who can help?

@gregnr since it looks like you added the Supabase example code

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

  1. Create fresh conda env with python 3.11
  2. Install JupyterLab and create a notebook
  3. Follow the steps in the Supabase example code tutorial

On the step to:

vectorstore = SupabaseVectorStore.from_documents(
    docs, 
    embeddings, 
    client=supabase, 
    table_name="documents", 
    query_name="match_documents"
)

it fails with error JSONDecodeError: Expecting value: line 1 column 1 (char 0):

Stacktrace
---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
Cell In[10], line 1
----> 1 vectorstore = SupabaseVectorStore.from_documents(
      2     docs, 
      3     embeddings, 
      4     client=supabase, 
      5     table_name="documents", 
      6     query_name="match_documents"
      7 )

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/site-packages/langchain/vectorstores/base.py:417, in VectorStore.from_documents(cls, documents, embedding, **kwargs)
    415 texts = [d.page_content for d in documents]
    416 metadatas = [d.metadata for d in documents]
--> 417 return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/site-packages/langchain/vectorstores/supabase.py:147, in SupabaseVectorStore.from_texts(cls, texts, embedding, metadatas, client, table_name, query_name, ids, **kwargs)
    145 ids = [str(uuid.uuid4()) for _ in texts]
    146 docs = cls._texts_to_documents(texts, metadatas)
--> 147 cls._add_vectors(client, table_name, embeddings, docs, ids)
    149 return cls(
    150     client=client,
    151     embedding=embedding,
    152     table_name=table_name,
    153     query_name=query_name,
    154 )

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/site-packages/langchain/vectorstores/supabase.py:323, in SupabaseVectorStore._add_vectors(client, table_name, vectors, documents, ids)
    320 for i in range(0, len(rows), chunk_size):
    321     chunk = rows[i : i + chunk_size]
--> 323     result = client.from_(table_name).upsert(chunk).execute()  # type: ignore
    325     if len(result.data) == 0:
    326         raise Exception("Error inserting: No rows added")

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/site-packages/postgrest/_sync/request_builder.py:62, in SyncQueryRequestBuilder.execute(self)
     53 r = self.session.request(
     54     self.http_method,
     55     self.path,
   (...)
     58     headers=self.headers,
     59 )
     61 try:
---> 62     return APIResponse.from_http_request_response(r)
     63 except ValidationError as e:
     64     raise APIError(r.json()) from e

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/site-packages/postgrest/base_request_builder.py:154, in APIResponse.from_http_request_response(cls, request_response)
    150 @classmethod
    151 def from_http_request_response(
    152     cls: Type[APIResponse], request_response: RequestResponse
    153 ) -> APIResponse:
--> 154     data = request_response.json()
    155     count = cls._get_count_from_http_request_response(request_response)
    156     return cls(data=data, count=count)

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/site-packages/httpx/_models.py:756, in Response.json(self, **kwargs)
    754     if encoding is not None:
    755         return jsonlib.loads(self.content.decode(encoding), **kwargs)
--> 756 return jsonlib.loads(self.text, **kwargs)

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    341     s = s.decode(detect_encoding(s), 'surrogatepass')
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:
    348     cls = JSONDecoder

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    332 def decode(self, s, _w=WHITESPACE.match):
    333     """Return the Python representation of ``s`` (a ``str`` instance
    334     containing a JSON document).
    335 
    336     """
--> 337     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338     end = _w(s, end).end()
    339     if end != len(s):

File /opt/miniconda3/envs/self-query-experiment/lib/python3.11/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
    353     obj, end = self.scan_once(s, idx)
    354 except StopIteration as err:
--> 355     raise JSONDecodeError("Expecting value", s, err.value) from None
    356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

It appears that Supabase is returning a 201 response code with an empty response body. The postgrest library then tries to parse the JSON with data = request_response.json(), which fails because the body is empty.
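The failure mode described above can be reproduced without Supabase at all, since an empty string is not valid JSON. The sketch below uses only the standard library; `parse_body` is a hypothetical helper written for illustration and is not part of postgrest or httpx.

```python
import json
from typing import Any, Optional


def parse_body(text: str) -> Optional[Any]:
    """Return parsed JSON, or None when the body is empty or only whitespace."""
    # Hypothetical guard: treat an empty body as "no data" instead of parsing it.
    if not text.strip():
        return None
    return json.loads(text)


# An empty body is not valid JSON, so json.loads raises the error seen in the trace.
try:
    json.loads("")
except json.JSONDecodeError as err:
    message = str(err)
    print(message)

print(parse_body(""))             # empty body is reported as None
print(parse_body('[{"id": 1}]'))  # a non-empty body is parsed as usual
```

A guard like this would only hide the symptom; the question of why the body is empty in the first place still needs an answer.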

Are there some extra headers that should be added to the supabase client to tell it to return a response body?

Expected behavior

No error when invoking SupabaseVectorStore.from_documents()

@dosubot dosubot bot added Ɑ: vector store Related to vector store module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Sep 11, 2023
@dosubot

dosubot bot commented Sep 11, 2023

🤖

Hello,

Thank you for providing such a detailed report of the issue you're experiencing. It seems like the problem is occurring when the Supabase client is trying to parse an empty response body as JSON, which is causing the JSONDecodeError.

The SupabaseVectorStore.from_documents() method is indeed supposed to create a new SupabaseVectorStore instance from a list of documents and their corresponding embeddings. It does this by calling the _add_vectors() method, which adds these vectors to the specified Supabase table.

The error you're encountering seems to be happening when the _add_vectors() method is trying to execute the upsert operation on the Supabase client. The upsert operation is supposed to insert new rows into the table or update existing rows if they already exist. However, it seems like the Supabase client is returning a 201 response with an empty body, which is causing the JSONDecodeError when the Postgrest library tries to parse the response body as JSON.

As for your question about whether there are extra headers that should be added to the Supabase client to tell it to return a response body, I'm not aware of any such headers. The Supabase client should automatically return a response body when the upsert operation is executed.

I would suggest a few things to troubleshoot this issue:

  1. Check if the Supabase client is correctly configured. Make sure that the SUPABASE_URL and SUPABASE_ANON_KEY environment variables are correctly set.

  2. Check if the table specified in the table_name argument of the SupabaseVectorStore.from_documents() method exists in your Supabase database.

  3. Try executing the upsert operation manually using the Supabase client to see if it returns a response body. You can do this by running the following code:

client = supabase.create_client(SUPABASE_URL, SUPABASE_ANON_KEY)
result = client.from_("your_table_name").upsert({"id": "test_id", "content": "test_content", "embedding": [0.0, 0.0, 0.0], "metadata": {"test": "metadata"}}).execute()
print(result.data)

Please replace "your_table_name" with the name of your table, and make sure to import the supabase module and set the SUPABASE_URL and SUPABASE_ANON_KEY variables before running this code.

If the upsert operation still returns an empty response body, then the issue might be with the Supabase client or the Supabase service itself. In that case, I would recommend reaching out to the Supabase community for help.

I hope this helps! Please let me know if you have any other questions or if there's anything else I can do to assist you.


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

@tleyden
Author

tleyden commented Sep 11, 2023

I think I found the issue: it looks like it's specific to Python 3.11 and the way it treats enums. After downgrading to Python 3.8, the issue goes away.

I added a debug statement to this line:

print(f"Self.headers['Prefer']: {self.headers['Prefer']}")

Python 3.8 output:

Self.headers['Prefer']: return=representation,resolution=merge-duplicates

Python 3.11 output:

Self.headers['Prefer']: return=ReturnMethod.representation,resolution=merge-duplicates

With Python 3.11, the ReturnMethod.representation enum field is formatted as ReturnMethod.representation (which is probably ignored by the backend, so it reverts to return=minimal mode), whereas in Python 3.8 it is formatted as representation (as expected by the backend).
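A minimal sketch of the formatting difference, assuming the enum mixes in `str` (the debug output above is consistent with that, but the class below is a hypothetical stand-in, not the library's actual definition). Python 3.11 changed `format()` and f-strings for such enums to follow `str()`, so interpolating the member directly gives a version-dependent result, while `.value` is stable.

```python
from enum import Enum


class ReturnMethod(str, Enum):
    # Hypothetical stand-in for the enum used by the client library.
    minimal = "minimal"
    representation = "representation"


def prefer_header(returning: ReturnMethod) -> str:
    # Using .value avoids relying on how format() renders the member,
    # which differs between Python versions.
    return f"return={returning.value},resolution=merge-duplicates"


# Version-dependent: renders the bare value before Python 3.11 and the
# qualified member name (ReturnMethod.representation) from 3.11 on.
print(f"return={ReturnMethod.representation}")

# Stable on every version.
print(prefer_header(ReturnMethod.representation))
```

Interpolating `.value` (or defining `__str__` on the enum) keeps the header text independent of the interpreter version.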

@tleyden
Author

tleyden commented Sep 11, 2023

It was already fixed in postgrest in supabase/postgrest-py#169, but since this example code pins supabase to version 1.0.0, postgrest version '0.10.3' is used, which doesn't have the fix.

Is there any reason why this only works with the older supabase 1.0.0 package? If it can't be upgraded to pick up the fix, it's worth adding a warning to the docs to only use Python 3.10 or earlier. I would volunteer to send a quick PR if it would help.
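To see which versions are actually installed in a given environment, something like the following could be run in the notebook. It is a sketch using only the standard library; `installed_version` is a hypothetical helper, and the distribution names passed to it (`supabase`, `postgrest`, `postgrest-py`) are assumptions about how the packages are registered on PyPI.

```python
import sys
from importlib import metadata
from typing import Iterable, Optional


def installed_version(names: Iterable[str]) -> Optional[str]:
    """Return the version of the first installed distribution among names, else None."""
    for name in names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


print("python   :", ".".join(map(str, sys.version_info[:3])))
print("supabase :", installed_version(["supabase"]))
# Assumption: the client may be registered under either distribution name.
print("postgrest:", installed_version(["postgrest", "postgrest-py"]))
```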

@gregnr
Contributor

gregnr commented Sep 11, 2023

@tleyden Nice work debugging this one. supabase was pinned to v1.0.0 as a short-term solution to a dependency error. I can't remember the exact issue, so I'll take a look and report back.

@gregnr
Contributor

gregnr commented Sep 11, 2023

@tleyden the error that required supabase==1.0.0 no longer seems to be present. I've created a PR #10452 that should fix both this issue and #10444 by:

  • Removing v1.0.0 pin on supabase
  • Installing langchain with openai extra which includes tiktoken

Do you mind giving the new notebook a go and letting me know if that solves everything?
https://github.com/langchain-ai/langchain/blob/ec47f1dedc7032e52a3fe7bdcb9ed296c79e2b7a/docs/extras/modules/data_connection/retrievers/self_query/supabase_self_query.ipynb

baskaryan added a commit that referenced this issue Sep 11, 2023
## Description
Fixes dependency errors when using Supabase self-query retrievers on
Python 3.11

## Issues
- #10447
- #10444

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
@tleyden
Author

tleyden commented Sep 12, 2023

Thanks @gregnr, I think this should fix it! I’ll proactively close this out but will reopen if I come across any more issues
