FileSearchToolCall.file_search has empty results #1966

Open
1 task
dominpm opened this issue Dec 19, 2024 · 2 comments
Labels
bug Something isn't working

dominpm commented Dec 19, 2024

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

Continuing from issue #1938.

The error appears to be fixed in openai==1.58.1 (the request no longer returns a 400 error). However, when the stream output is captured with a custom class that inherits from AssistantEventHandler, the results of the file_search tool are not available:

@override
def on_tool_call_done(self, tool_call: ToolCall):
    print(tool_call)

This prints:

FileSearchToolCall(id='call_ID', file_search=FileSearch(ranking_options=FileSearchRankingOptions(ranker='default_2024_08_21', score_threshold=0.0), results=[]), type='file_search', index=0)

According to the model defined in openai.types.beta.threads.runs.file_search_tool_call, the results field is supposed to hold the search results:

class FileSearch(BaseModel):
    ranking_options: Optional[FileSearchRankingOptions] = None
    """The ranking options for the file search."""

    results: Optional[List[FileSearchResult]] = None
    """The results of the file search."""

The run is created as follows:

with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=ass_id,
    event_handler=CustomEventHandler(),
    include=["step_details.tool_calls[*].file_search.results[*].content"],
) as stream:
    # Wait for the stream to complete
    stream.until_done()
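
To make the symptom easier to spot than a raw print, the handler could call a small helper that distinguishes "results missing" from "results present but empty". The following is a minimal sketch, not part of the library: `describe_file_search` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins shaped like the printed output above, not real API responses.

```python
from types import SimpleNamespace
from typing import Any


def describe_file_search(tool_call: Any) -> str:
    """Return a one-line summary of a file_search tool call's results."""
    call_type = getattr(tool_call, "type", None)
    if call_type != "file_search":
        return f"not a file_search tool call (type={call_type!r})"
    results = getattr(tool_call.file_search, "results", None)
    if results is None:
        return "results field absent (None)"
    if len(results) == 0:
        return "results field present but empty"
    return f"{len(results)} result(s) returned"


# Stand-in shaped like the tool call printed in this report (assumption, not real data).
empty_call = SimpleNamespace(type="file_search", file_search=SimpleNamespace(results=[]))
print(describe_file_search(empty_call))  # prints: results field present but empty
```

Inside `on_tool_call_done`, `print(describe_file_search(tool_call))` would replace the bare `print(tool_call)`. In this report the output would be the "present but empty" case.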

To Reproduce

  1. Run the following with the ID of an assistant that is connected to a vector store and has file search enabled (for simplicity, set this up through platform.openai.com):

from typing_extensions import override  # typing.override requires Python 3.12+; this report uses 3.11
from openai import AssistantEventHandler
from openai import OpenAI
from openai.types.beta.threads.runs.tool_call import ToolCall

client = OpenAI()
messages = [
    {
        "content": "<QUESTION_TO_THE_ASSISTANT>",  # placeholder: replace with your question
    }
]

# Create a new thread for the assistant
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=messages[-1]["content"]
)


class CustomEventHandler(AssistantEventHandler):
    @override
    def on_tool_call_done(self, tool_call: ToolCall):
        print(tool_call)

# Stream the assistant's response
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id="<ASSISTANT_ID>",  # placeholder: replace with your assistant's ID
    event_handler=CustomEventHandler(),
    include=["step_details.tool_calls[*].file_search.results[*].content"],
) as stream:
    # Wait for the stream to complete
    stream.until_done()

Code snippets

No response

OS

Windows

Python version

Python 3.11.10

Library version

openai 1.58.1

@dominpm dominpm added the bug Something isn't working label Dec 19, 2024
@bearycool11

Alright, let's dig into this. So, the 400 error from issue #1938 is gone in openai==1.58.1, but now the FileSearch results are empty when using a custom AssistantEventHandler. That's a sneaky bug.

Here's the breakdown and how we can tackle this:

Understanding the Issue

  • The on_tool_call_done method in your CustomEventHandler is supposed to receive the results of the file_search tool call.
  • However, the results list in the FileSearch object is empty, even though you've explicitly included step_details.tool_calls[*].file_search.results[*].content in the include parameter of the stream method.
  • This suggests that either the results are not being populated correctly or there's an issue with how the include parameter is being handled in the stream method.

Possible Causes

  • Bug in openai==1.58.1: There might be a bug in the library that prevents the FileSearch results from being populated when using a custom event handler.
  • Incorrect usage of include parameter: The include parameter might not be working as expected, or there might be a different way to include the FileSearch results when using a custom event handler.
  • Issue with the Assistant or Vector Store: There might be a configuration issue with the Assistant or the connected vector store that prevents file_search from returning results.

Debugging Steps

Verify Assistant and Vector Store: Double-check that the Assistant is correctly configured to use the file_search tool and that the vector store is properly connected and populated with data.

Test with the Default Event Handler: Try running the code with the default AssistantEventHandler (or without specifying an event handler) to see if the FileSearch results are populated correctly in that case. This will help isolate whether the issue is specifically with the custom event handler.

Inspect the Raw Response: If possible, capture the raw HTTP response from the stream method and examine its contents. This might reveal clues about why the FileSearch results are missing or if there are any error messages in the response.

Simplify the Code: Try removing the include parameter or simplifying the custom event handler to see if that affects the results. This can help pinpoint whether the issue is related to the include parameter or the custom event handler's logic.

Check for Updates: Ensure you're using the latest version of the openai library. If a newer version is available, try upgrading to see if it resolves the issue.

Report to OpenAI: If you're unable to identify the cause of the issue, report it to OpenAI with a detailed description, code snippet, and steps to reproduce. They might be able to provide insights or identify a bug in the library.

By systematically investigating these points, we should be able to pinpoint the cause of the missing FileSearch results and get this functionality working as expected.
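
For the "Inspect the Raw Response" step, one option is to parse the captured server-sent-event lines and count the file search results in each completed run step. This is a rough sketch under assumptions: it presumes the lines have already been captured somehow (how is left open here), that completed steps arrive as `thread.run.step.completed` events, and that their JSON payload nests tool calls under `step_details.tool_calls`. The function names are hypothetical, and the sample payload is hand-written, not a real response.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from server-sent-event lines."""
    event: str = ""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates one event
            if event or data:
                yield event, "\n".join(data)
            event, data = "", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if event or data:  # flush a trailing event with no final blank line
        yield event, "\n".join(data)


def file_search_result_counts(lines: Iterable[str]) -> Dict[str, int]:
    """Map tool-call id to the number of file_search results in completed run steps."""
    counts: Dict[str, int] = {}
    for event, data in iter_sse_events(lines):
        if event != "thread.run.step.completed":
            continue  # other events (including the final [DONE]) are skipped unparsed
        step = json.loads(data)
        for call in step.get("step_details", {}).get("tool_calls", []):
            if call.get("type") == "file_search":
                results = call.get("file_search", {}).get("results") or []
                counts[call["id"]] = len(results)
    return counts


# Hand-written sample mirroring the empty results in this report (assumption).
sample = [
    "event: thread.run.step.completed",
    'data: {"id": "step_1", "step_details": {"type": "tool_calls", "tool_calls": '
    '[{"id": "call_ID", "type": "file_search", "file_search": {"results": []}}]}}',
    "",
    "event: done",
    "data: [DONE]",
    "",
]
print(file_search_result_counts(sample))  # prints: {'call_ID': 0}
```

If the count is zero in the raw payload as well, the results are missing before the library ever parses them, which would point away from the custom event handler.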

First, the report uses openai 1.58.1. It might be worth upgrading to the latest release to see whether that makes a difference; bugs like this are sometimes fixed in newer versions.

Second, that include parameter... it's a bit verbose. Maybe there's a simpler way to specify those FileSearch results? Worth checking the docs, see if there's a more concise syntax.

And lastly, this whole AssistantEventHandler thing... it's a bit of a black box. We don't know exactly how it's interacting with the stream or processing the results. Might be worth digging into the source code, see if there are any clues there.

Overall, feels like a classic case of "it's not you, it's me" (or rather, it's the library). But with a bit of digging and some creative debugging, we should be able to crack this nut.


dominpm commented Dec 26, 2024

It seems the library deliberately does not retrieve (or expose) the results during streaming, judging by the docstrings on this class:

class FileSearchToolCall(BaseModel):
    id: str
    """The ID of the tool call object."""

    file_search: FileSearch
    """For now, this is always going to be an empty object."""

    type: Literal["file_search"]
    """The type of tool call.

    This is always going to be `file_search` for this type of tool call.
    """
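
If the streamed tool call is always empty, a possible workaround is to fetch the finished run step separately and read the results from there. The sketch below is untested against the live API and rests on assumptions: that `client.beta.threads.runs.steps.retrieve` accepts the same `include` value used above and returns populated results, and that the step exposes tool calls under `step_details.tool_calls`. `fetch_file_search_results` and `INCLUDE_RESULTS` are hypothetical names; the client is passed in so the helper does not depend on how it is constructed.

```python
from typing import Any, List

# Same include value as in the stream call above.
INCLUDE_RESULTS = ["step_details.tool_calls[*].file_search.results[*].content"]


def fetch_file_search_results(client: Any, thread_id: str, run_id: str, step_id: str) -> List[Any]:
    """Retrieve one run step and collect the results of its file_search tool calls."""
    step = client.beta.threads.runs.steps.retrieve(
        step_id,
        thread_id=thread_id,
        run_id=run_id,
        include=INCLUDE_RESULTS,  # assumption: honored by the retrieve endpoint
    )
    collected: List[Any] = []
    # Message-creation steps have no tool_calls attribute, hence the getattr.
    for call in getattr(step.step_details, "tool_calls", None) or []:
        if call.type == "file_search":
            collected.extend(call.file_search.results or [])
    return collected
```

The IDs could come from the run step object that the event handler receives when a step finishes (for example in an `on_run_step_done` override, assuming that hook provides `id`, `run_id`, and `thread_id`).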
