fix stop detections #1392

Merged: 7 commits merged into lm-sys:main on May 22, 2023
Conversation

@mingfang (Contributor) commented:
The current stop detection works, but only after part of the stop sequence has already been streamed. This breaks ReAct-style agents, which rely on the stream stopping cleanly before the stop string appears.

This change adds partial stop detection and avoids streaming partial stop sequences.
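
The core idea is roughly the following (a minimal sketch for illustration, not the actual code in fastchat/serve/inference.py): before streaming a decoded chunk, hold back any trailing text that could still grow into one of the stop strings, and only emit it once it can no longer match.

def split_safe_text(text: str, stop_strings: list[str]) -> tuple[str, str]:
    # Return (safe_to_stream, held_back), where held_back is the longest
    # suffix of `text` that is also a prefix of some stop string.
    # (A full match of a stop string would end generation entirely; this
    # sketch only covers the partial-prefix case.)
    hold = 0
    for stop in stop_strings:
        for i in range(1, min(len(stop), len(text)) + 1):
            if text.endswith(stop[:i]):
                hold = max(hold, i)
    return text[: len(text) - hold], text[len(text) - hold :]

# Example: the tail "\nObs" could still become "\nObservation: ", so it is
# held back instead of being streamed to the client.
safe, held = split_safe_text("Action Input: 1+1\nObs", ["\nObservation: "])
assert safe == "Action Input: 1+1" and held == "\nObs"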

@suquark (Collaborator) commented on May 21, 2023:

@merrymercy any suggestions for testing this?

@mingfang (Contributor, Author) commented on May 21, 2023:

One way to test is with this curl command, which simulates the LangChain calculator-agent example: the prompt asks what 1+1 is. With the fix, the model should respond by asking the calculator tool for 1+1, and the stream should stop before any part of the "\nObservation: " stop string is emitted.

curl -d '{"model":"vicuna-13b-v1.1","temperature":0,"max_tokens":256,"top_p":1,"frequency_penalty":0,"presence_penalty":0,"n":1,"best_of":1,"stop":["\nObservation: "],"stream":true,"prompt":["Answer the following questions as best you can. You have access to the following tools:\n\ncalculator: Useful for getting the result of a math expression. The input to this tool should be a valid mathematical expression that could be executed by a simple calculator.\n\nUse the following format in your response:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: what is 1+1\nThought:"]}' localhost:8000/v1/completions -H 'content-type: application/json'

The output should end like this:

...more before this...
data: {"id": "cmpl-3ADpLjweTitvHHpddeV55R", "object": "text_completion", "model": "vicuna-13b-v1.1", "choices": [{"index": 0, "text": "+1", "logprobs": null, "finish_reason": null}]}

data: {"id": "cmpl-3ADpLjweTitvHHpddeV55R", "object": "text_completion", "model": "vicuna-13b-v1.1", "choices": [{"index": 0, "text": "", "logprobs": null, "finish_reason": "stop"}]}

data: [DONE]
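
For a scripted version of this check, something like the following should work (a rough sketch, assuming the OpenAI-compatible SSE format shown above; the abbreviated prompt and the assertions are illustrative, not part of this PR):

import json
import requests

payload = {
    "model": "vicuna-13b-v1.1",
    "temperature": 0,
    "max_tokens": 256,
    "stop": ["\nObservation: "],
    "stream": True,
    # Abbreviated ReAct prompt; the full prompt is in the curl command above.
    "prompt": ["Question: what is 1+1\nThought:"],
}

resp = requests.post("http://localhost:8000/v1/completions", json=payload, stream=True)
text, finish_reason = "", None
for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    data = line[len(b"data: "):]
    if data == b"[DONE]":
        break
    choice = json.loads(data)["choices"][0]
    text += choice["text"]
    finish_reason = choice["finish_reason"] or finish_reason

# With the fix, the stream should finish with reason "stop" and no part of
# the stop string should have leaked into the streamed text.
assert finish_reason == "stop"
assert "\nObservation" not in text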

@merrymercy (Member) left a comment:

LGTM

@suquark (Collaborator) left a comment:

LGTM! Thanks!

@suquark merged commit 75d8ab2 into lm-sys:main on May 22, 2023
@plancktree commented:

> One way to test is with this curl command, which simulates the LangChain calculator-agent example. […]

But when I use vLLM on 4 V100 GPUs, the problem occurs and there is no "stop" finish reason, while in another run there is a "stop" reason. I wonder why.

[screenshot: stream ending without a "stop" finish reason]
[screenshot: stream ending with a "stop" finish reason]
