I have a Deep Lake vector database containing code chunks from a project. Given an issue, I want to find the corresponding code chunks, so I wrote a SelfQueryRetriever.
However, it throws an error whenever the query mentions an expression like 'train.py script'. If I leave that out, there is no error. The retriever is supposed to work automatically for arbitrary issues, so I cannot simply require that such expressions be kept out of them.
import traceback

from langchain.chains.query_constructor.base import AttributeInfo
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.vectorstores import DeepLake


def CustomRetriever(files, dataset_path, issue):
    # Metadata fields the LLM may use when building a structured filter
    metadata_field_info = [
        AttributeInfo(
            name="source",
            description="The source file the chunk was extracted from",
            type="string",
        ),
        AttributeInfo(
            name="file_name",
            description="The name of the file the chunk was extracted from",
            type="string",
        ),
        AttributeInfo(
            name="chunk_id",
            description="The id of the chunk",
            type="string",
        ),
    ]
    document_content_description = "The source code of a project"
    model = ChatOpenAI(model="gpt-4")
    embeddings = OpenAIEmbeddings(disallowed_special=())
    db = DeepLake(dataset_path=dataset_path, read_only=True, embedding=embeddings, exec_option='python')
    docs = db.similarity_search(query=" ", k=10000000)
    retriever = SelfQueryRetriever.from_llm(
        model, db, document_content_description, metadata_field_info, verbose=True
    )
    try:
        # The code that causes the error
        print('TEST', retriever.get_relevant_documents(
            f"Which documents contain code to resolve the following issue? -> {issue}"))
    except ValueError as e:
        print(traceback.format_exc())
Here is the error:
query='CNN instead of BERT model in train.py script, handle data better, generated using Tensorflow, integrated into logic, adapted to word vectors, change code' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='source', value='train.py'), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='file_name', value='train.py')]) limit=None
Traceback (most recent call last):
File "/Users/kaanerbay/GitHub/Github_Issue_Solver/langchainLogic/retriever2.py", line 93, in CustomRetriever
print('TEST', retriever.get_relevant_documents(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/schema/retriever.py", line 208, in get_relevant_documents
raise e
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/schema/retriever.py", line 201, in get_relevant_documents
result = self._get_relevant_documents(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/retrievers/self_query/base.py", line 135, in _get_relevant_documents
docs = self.vectorstore.search(new_query, self.search_type, **search_kwargs)
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/base.py", line 121, in search
return self.similarity_search(query, **kwargs)
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/deeplake.py", line 475, in similarity_search
return self._search(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/deeplake.py", line 348, in _search
return self._search_tql(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/deeplake.py", line 267, in _search_tql
result = self.vectorstore.search(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 429, in search
utils.parse_search_args(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/deeplake/core/vectorstore/vector_search/utils.py", line 229, in parse_search_args
raise ValueError(
ValueError: User-specified TQL queries are not support for exec_option=python.
Here is the issue I used:
a CNN should be used instead of the BERT model in the train.py script, because it can handle the type of data better.
The CNN should not be too complex, but also not too simple and should be generated using Tensorflow.
The CNN should be integrated into the logic and adapted according to the word vectors used. Change the code of it, as good as you can.
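For reference, the filter in the log above is a conjunction of two equality checks on chunk metadata (`source == 'train.py'` and `file_name == 'train.py'`), and the traceback shows the failure happening once that filter reaches `_search_tql` with `exec_option=python`. A minimal sketch of what such a filter amounts to, evaluated in plain Python over already-retrieved chunks, is shown here. The `Eq` class, the helper functions, and the example chunks are all hypothetical and not part of LangChain or Deep Lake; it also assumes the stored metadata values match the file name exactly, which may not hold if `source` contains a full path.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Eq:
    """Hypothetical stand-in for an equality comparison on one metadata field."""
    attribute: str
    value: Any


def matches_all(metadata: Dict[str, Any], conditions: List[Eq]) -> bool:
    """Return True if every condition holds for this metadata (logical AND)."""
    return all(metadata.get(c.attribute) == c.value for c in conditions)


def filter_chunks(chunks: List[Dict[str, Any]], conditions: List[Eq]) -> List[Dict[str, Any]]:
    """Keep only the chunks whose metadata satisfies all conditions."""
    return [c for c in chunks if matches_all(c["metadata"], conditions)]


# Made-up example chunks; real ones would come from an unfiltered similarity search.
chunks = [
    {"text": "model = BertModel(...)",
     "metadata": {"source": "train.py", "file_name": "train.py"}},
    {"text": "def load_data(): ...",
     "metadata": {"source": "data.py", "file_name": "data.py"}},
]

# Same shape as the filter in the log: source == 'train.py' AND file_name == 'train.py'
conditions = [Eq("source", "train.py"), Eq("file_name", "train.py")]
kept = filter_chunks(chunks, conditions)
print([c["metadata"]["file_name"] for c in kept])
```

This only illustrates the semantics of the generated filter; it is not a fix for the `ValueError` itself.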