Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Assistants Streaming #367

Merged
merged 28 commits into from
May 16, 2024

Conversation

EthanBarlo
Copy link
Contributor

What:

  • Bug Fix
  • New Feature

Description:

This PR implements Full Assistants Streaming API support.
https://platform.openai.com/docs/api-reference/assistants-streaming/events

  • ThreadMessageDelta (Including nested classes)

  • ThreadRunStepDelta (Including nested classes)

  • StreamedThreadRunResponseFactory (To map the events to the correct classes)

  • EventStreamResponse (Modified iterator to work with the events format)

  • EventStreamResponseItem (Contains the event type and the data, similar to how the api returns the data)

Related:

#357

- Updated types
- Removed unnecessary error in StreamedThreadRunResponseFactory
- Ensured the types match the API responses correctly
…ce for assistant streaming"

This reverts commit d486e0e.
- Rolled back the formatting, and only commit the new streamed section
@EthanBarlo
Copy link
Contributor Author

I'm fairly certain this is fully feature complete with everything new within the API.
Let me know if anything's spotted that might be missing, or any updates that need to be made.

Currently defaulting it to an array,
While the 'delta' is generating the function call all of the data is not there. So its not actually useful, until the `thread.run.requires_action` event is sent. At which point we have the full function call mapped to the correct types anyway.
Copy link
Contributor

@knash94 knash94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done @EthanBarlo!

I was working on adding support for this myself, I made great progress and then saw your PR! I've loaded this up on my project and it seems to be working perfectly apart from the issue I've flagged with the import.

It would be neat to also support streaming for the create thread and run endpoint

@EthanBarlo
Copy link
Contributor Author

@knash94 Thanks for reviewing it and resolving the type issue.
I'm not doing any work with images, so wouldn't have spotted that one.

I've added the createAndRunStreamed function, and updated the contracts to match the new functionality.

Let me know if any more issues crop up from you're testing.

@eng-ahmad-sameer
Copy link
Contributor

Hey all,
Is there anybody available to review this PR and maybe merge and release it so we can use the streaming functionality?

Thx for your efforts.

@punyflash
Copy link
Contributor

@eng-ahmad-sameer, you can use it by aliasing open-ai client version on composer.json:

"repositories": [
    {
        "type": "vcs",
        "url": "https://github.com/EthanBarlo/OpenAi-php-client.git"
    }
],
"require": {
    "openai-php/client": "dev-Implement-Assistants-Streaming as 0.8.4",
}

@punyflash
Copy link
Contributor

punyflash commented Apr 9, 2024

@EthanBarlo was testing out and in most cases it works perfectly. Found 2 main issues:

  • Lack of types annotations for stream events. Needed to set them in place, so my ide could give me at least some hints what is going on
  • The example with tool output is not working. As soon as you set new $stream variable, foreach loop ends. Was able to run it using while loop and generator instance directly:
$stream = $client->threads()->runs()->createStreamed(
    threadId: 'thread_tKFLqzRN9n7MnyKKvc1Q7868',
    parameters: [
        'assistant_id' => 'asst_gxzBkD1wkKEloYqZ410pT5pd',
    ],
);

$iterator = $stream->getIterator();

while ($iterator->valid()) {
    $item = $iterator->current();
    $iterator->next();

    // ----

    if ($item->event === 'thread.run.requires_action') {
        $stream = $client->threads()->runs()->submitToolOutputsStreamed(
            threadId: $item->data->threadId,
            runId: $item->data->id,
            parameters: [
                'tool_outputs' => [[
                    'tool_call_id' => 'call_KSg14X7kZF2WDzlPhpQ168Mj',
                    'output' => '12',
                ]],
            ]
        );
        $iterator = $stream->getIterator();
    }

    // ----
}

@spont4e
Copy link

spont4e commented Apr 9, 2024

Hey all, Is there anybody available to review this PR and maybe merge and release it so we can use the streaming functionality?

Thx for your efforts.

🥇@gehrisandro and 🥈@nunomaduro are the people

@EthanBarlo
Copy link
Contributor Author

EthanBarlo commented Apr 12, 2024

@punyflash Thanks for pointing out these issues

  • Lack of types annotations for stream events. Needed to set them in place, so my ide could give me at least some hints what is going on

Yeah this is something worth adding, I don't actually know how to set those up.
So if you could point me to something or request the necessary changes, that'd be helpful.

  • The example with tool output is not working.

Yeah that example does need to be updated,
Heres an updated example, closer to how i've been using it.
I'll probably update the example in the readme with this.

$stream = $client->threads()->runs()->createStreamed(
    threadId: 'thread_tKFLqzRN9n7MnyKKvc1Q7868',
    parameters: [
        'assistant_id' => 'asst_gxzBkD1wkKEloYqZ410pT5pd',
    ],
);

do{
    foreach($stream as $response){
        $response->event // 'thread.run.created' | 'thread.run.in_progress' | .....
        $response->data // ThreadResponse | ThreadRunResponse | ThreadRunStepResponse | ThreadRunStepDeltaResponse | ThreadMessageResponse | ThreadMessageDeltaResponse

        switch($response->event){
            case 'thread.run.created':
            case 'thread.run.queued':
            case 'thread.run.completed':
            case 'thread.run.cancelling':
                $run = $response->data;
                break;
            case 'thread.run.expired':
            case 'thread.run.cancelled':
            case 'thread.run.failed':
                $run = $response->data;
                break 3;
            case 'thread.run.requires_action':
                // Overwrite the stream with the new stream started by submitting the tool outputs
                $stream = $client->threads()->runs()->submitToolOutputsStreamed(
                    threadId: $run->threadId,
                    runId: $run->id,
                    parameters: [
                        'tool_outputs' => [
                            [
                                'tool_call_id' => 'call_KSg14X7kZF2WDzlPhpQ168Mj',
                                'output' => '12',
                            ]
                        ],
                    ]
                );
                break;
        }
    }
} while ($run->status != "completed")

- Updated streaming with tools example as suggested by @punyflash
- Fixed undefined key error in ThreadMessageDeltaResponseContentText found by @eng-ahmad-sameer
- Resolved missing imports in ThreadMessageDeltaObject found by @eng-ahmad-sameer
@petermjr
Copy link

petermjr commented Apr 16, 2024

You can check these changes for adding faking capabilities. They can be implemented similarly in this branch
petermjr/openai-php@v0.8.4...v0.8.5

…dMessageDeltaResponse

Co-Authored-By: Ahmad Sameer <110471430+eng-ahmad-sameer@users.noreply.github.com>
@subet
Copy link

subet commented Apr 18, 2024

Hi,

Thanks for the great and full-featured SDK. Is there an estimated timeline for when this feature might be integrated and when we might expect the next release?

Regards

@EthanBarlo
Copy link
Contributor Author

@subet
Unfortunately I have no idea for a timeline on this.
In the meantime you can use the new functionality with this
#367 (comment)

Im quite confident that all the functionality has been implemented using the same style as the rest of the API.
We have at least 4 people using it on their projects without issues,
So I do think its close to being ready to go.

The next stage is for @gehrisandro and @nunomaduro to review the PR.

Copy link
Contributor

@gerdemann gerdemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two uses are missing.

…ssageDeltaResponseContentText

Co-authored-by: Michael Gerdemann <me+github@gerdemann.me>
@gehrisandro
Copy link
Collaborator

Hi @EthanBarlo

Thank you very much for your PR! I am going to review it now.

@slaven-ii
Copy link

Hi, I have been actively tracking the progress of this PR, but lately, it looks like there has been no movement after the tests were run. Is this what is blocking it or?

Fix broken tests on assistant streaming
@Guervyl
Copy link

Guervyl commented May 10, 2024

Hi there, I did not check the code to see the code.

My question is it cost optimize?

I'm new to open ai gpt. I don't know much about there pricing. But they say the have output cost for tokens.

The actual way this library do to fetch the response is by fetching all the messages: $messages = $client->threads()->messages()->list('...');. Does your code fetch the last message like this?

Because if the output tokens is about fetching the messages it would be bad fetching all messages again and again.

@EthanBarlo
Copy link
Contributor Author

@Guervyl
The output tokens refers to text generations from the Ai models.
So retrieving previous messages, and old text generations does not cost anything.

…istants-Streaming' into pr/367

# Conflicts:
#	src/Testing/Resources/ThreadsRunsTestResource.php
#	src/Testing/Resources/ThreadsTestResource.php
@gehrisandro gehrisandro merged commit 9a828d3 into openai-php:main May 16, 2024
8 checks passed
@gehrisandro
Copy link
Collaborator

Thanks, @EthanBarlo, for your PR!

Changed some stuff to work similar to the other streaming endpoints.

Will make a new release this week.

@magarrent
Copy link

Thank you guys! Amazing work.

Please note that the current version is pointing OpenAi-beta assistants=v1, it would crash if you're using some new parameters like file_search in v2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.