
The stop parameter in openai API doesn't work since v0.2.5 #1048

Open
llama-assistant opened this issue May 8, 2023 · 10 comments

@llama-assistant

Since version v0.2.5, it seems the stop parameter in the OpenAI API is set directly from conv.stop_str rather than taken from the request.
https://github.com/lm-sys/FastChat/blob/v0.2.5/fastchat/serve/api.py#L134

In version v0.2.3, it works when set in the request.
https://github.com/lm-sys/FastChat/blob/v0.2.3/fastchat/serve/api.py#L125

The stop parameter is key when working with ReAct in LangChain, so supporting it seems quite important.
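
For reference, a minimal sketch of the expected precedence (the function name is illustrative, not FastChat's actual code):

```python
# Hypothetical sketch: the stop value from the API request should win,
# with the conversation template's stop_str as a fallback.
def resolve_stop(request_stop, conv_stop_str):
    # request_stop may be None, a string, or a list of strings,
    # mirroring the OpenAI API's "stop" parameter.
    if request_stop:
        return request_stop
    return conv_stop_str
```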

@merrymercy
Member

Thanks for reporting this. Could you send a pull request to fix it?

@jstzwj
Contributor

jstzwj commented May 8, 2023

Fixed in #818.

@merrymercy merrymercy added the bug Something isn't working label May 8, 2023
@llama-assistant
Author

@jstzwj Thanks for the fix, happy to see it in the next version.

@mingfang
Contributor

mingfang commented May 19, 2023

In the openai_api_server, stop works for non-streaming completions, but not for streaming.

The problem is that the unwanted stop sequence gets streamed out to the client before generation stops.
https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/openai_api_server.py#L518

As a result, this breaks LangChain ReAct agents.
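
A small self-contained illustration of the failure mode (the stop string "Observation:" and the chunk boundaries are made-up examples):

```python
stop = "Observation:"
chunks = ["Action: search[query]", "\nObs", "ervation:", " ..."]

emitted = ""
for chunk in chunks:
    print(chunk, end="")   # each chunk is flushed to the client immediately
    emitted += chunk
    if stop in emitted:    # the stop string is only detected afterwards,
        break              # so "\nObs" and "ervation:" have already gone out
```

A ReAct agent expects everything before the stop string to be the model's action, so the leaked partial stop text corrupts its parsing.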

@merrymercy
Member

@andy-yang-1 does the new PR (#1246) fix this?
@mingfang If not, could you contribute a PR to fix it?

@mingfang
Contributor

I tested the PR locally and it has the same problem.
@merrymercy do you think this problem should be fixed in
https://github.com/andy-yang-1/FastChat/blob/langchain-support/fastchat/serve/inference.py#L51
so that it doesn't emit the stop sequence?

@andy-yang-1
Collaborator

@merrymercy My PR didn't fix the problem. How can we solve it?

@merrymercy
Member

We handle the stop string here: https://github.com/andy-yang-1/FastChat/blob/fae4087bbb6f7979b61f2e0c2912d77547a5c659/fastchat/serve/inference.py#L164-L175.
I think it will correctly delete the stop sequence in the end. Does the problem occur in the middle of streaming?
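
The cleanup being referred to works roughly like this (a paraphrase for context, not the verbatim FastChat code):

```python
def truncate_at_stop(output: str, stop_str: str, prompt_len: int) -> str:
    # Search only past the prompt, and cut the final text at the stop
    # string so the completed (non-streamed) result never contains it.
    pos = output.rfind(stop_str, prompt_len)
    if pos != -1:
        return output[:pos]
    return output
```

This guarantees a clean final output, but it says nothing about chunks already sent while streaming, which is where the next comment picks up.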

@mingfang
Contributor

The problem happens when a previously generated token is a partial beginning of the stop sequence.
It will not match the entire stop sequence until the next few tokens arrive.
As a result, the partial stop sequence is streamed to the client, causing ReAct to fail.
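
One standard way to avoid this, sketched below (illustrative only, not claimed to match any particular PR): before flushing a chunk, hold back any trailing text that is a prefix of the stop sequence, and release it only once it can no longer grow into the full stop string.

```python
def split_safe(buffer: str, stop: str):
    # Hold back the longest suffix of `buffer` that is a prefix of `stop`,
    # since the next tokens might complete it into the full stop sequence.
    for i in range(len(stop) - 1, 0, -1):
        if buffer.endswith(stop[:i]):
            return buffer[:-i], buffer[-i:]  # (safe_to_send, held_back)
    return buffer, ""

# Example: with stop="Observation:", the trailing "Obs" is withheld until
# later tokens either complete the stop string or rule it out.
safe, held = split_safe("Action: search[query]\nObs", "Observation:")
assert (safe, held) == ("Action: search[query]\n", "Obs")
```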

@mingfang
Contributor

@merrymercy
This is my PR with the stop-detection fix:
#1392
