
The stop parameter in openai API doesn't work since v0.2.5 #1048

Open
llama-assistant opened this issue May 8, 2023 · 10 comments

@llama-assistant

Since version v0.2.5, it seems the stop parameter in the OpenAI API is set directly from conv.stop_str rather than taken from the request.
https://github.com/lm-sys/FastChat/blob/v0.2.5/fastchat/serve/api.py#L134

In version v0.2.3, it works when set in the request.
https://github.com/lm-sys/FastChat/blob/v0.2.3/fastchat/serve/api.py#L125

The stop parameter is key when working with ReAct in LangChain, so supporting it seems quite important.
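
For reference, a minimal sketch of the expected precedence (the function name is illustrative, not FastChat's actual code):

```python
# Hypothetical sketch: the stop value from the API request should win,
# with the conversation template's stop_str as a fallback.
def resolve_stop(request_stop, conv_stop_str):
    # request_stop may be None, a string, or a list of strings,
    # mirroring the OpenAI API's "stop" parameter.
    if request_stop:
        return request_stop
    return conv_stop_str
```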

@merrymercy
Member

Thanks for reporting this. Could you send a pull request to fix it?

@jstzwj
Contributor

jstzwj commented May 8, 2023

Fixed in #818.

@merrymercy merrymercy added the bug Something isn't working label May 8, 2023
@llama-assistant
Author

@jstzwj Thanks for the fix, happy to see it in the next version.

@mingfang
Contributor

mingfang commented May 19, 2023

In the openai_api_server, stop works for non-streaming completions, but not for streaming.

The problem is that the unwanted stop sequence gets streamed out to the client before generation stops.
https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/openai_api_server.py#L518

As a result, this breaks LangChain ReAct agents.
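
A small self-contained illustration of the failure mode (the stop string "Observation:" and the chunk boundaries are made-up examples):

```python
stop = "Observation:"
chunks = ["Action: search[query]", "\nObs", "ervation:", " ..."]

emitted = ""
for chunk in chunks:
    print(chunk, end="")   # each chunk is flushed to the client immediately
    emitted += chunk
    if stop in emitted:    # the stop string is only detected afterwards,
        break              # so "\nObs" and "ervation:" have already gone out
```

A ReAct agent expects everything before the stop string to be the model's action, so the leaked partial stop text corrupts its parsing.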

@merrymercy
Member

@andy-yang-1 does the new PR (#1246) fix this?
@mingfang If not, could you contribute a PR to fix it?

@mingfang
Contributor

I tested the PR locally and it has the same problem.
@merrymercy do you think this problem should be fixed in
https://github.com/andy-yang-1/FastChat/blob/langchain-support/fastchat/serve/inference.py#L51
so that it doesn't emit the stop sequence?

@andy-yang-1
Collaborator

@merrymercy My PR didn't fix the problem. How can we solve it?

@merrymercy
Member

We handle the stop string here: https://github.com/andy-yang-1/FastChat/blob/fae4087bbb6f7979b61f2e0c2912d77547a5c659/fastchat/serve/inference.py#L164-L175.
I think it will correctly delete the stop sequence in the end. Does the problem occur in the middle of streaming?
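
The cleanup being referred to works roughly like this (a paraphrase for context, not the verbatim FastChat code):

```python
def truncate_at_stop(output: str, stop_str: str, prompt_len: int) -> str:
    # Search only past the prompt, and cut the final text at the stop
    # string so the completed (non-streamed) result never contains it.
    pos = output.rfind(stop_str, prompt_len)
    if pos != -1:
        return output[:pos]
    return output
```

This guarantees a clean final output, but it says nothing about chunks already sent while streaming, which is where the next comment picks up.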

@mingfang
Contributor

The problem happens when a previously generated token is a partial beginning of the stop sequence.
It will not match the entire stop sequence until the next few tokens arrive.
As a result, the partial stop sequence is streamed to the client, causing ReAct to fail.
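
One standard way to avoid this, sketched below (illustrative only, not claimed to match any particular PR): before flushing a chunk, hold back any trailing text that is a prefix of the stop sequence, and release it only once it can no longer grow into the full stop string.

```python
def split_safe(buffer: str, stop: str):
    # Hold back the longest suffix of `buffer` that is a prefix of `stop`,
    # since the next tokens might complete it into the full stop sequence.
    for i in range(len(stop) - 1, 0, -1):
        if buffer.endswith(stop[:i]):
            return buffer[:-i], buffer[-i:]  # (safe_to_send, held_back)
    return buffer, ""

# Example: with stop="Observation:", the trailing "Obs" is withheld until
# later tokens either complete the stop string or rule it out.
safe, held = split_safe("Action: search[query]\nObs", "Observation:")
assert (safe, held) == ("Action: search[query]\n", "Obs")
```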

@mingfang
Contributor

@merrymercy
This is my PR with the stop-detection fix:
#1392
