Commit
fix llama.cpp streaming bug
mobius committed Dec 26, 2023
1 parent 1e4199c commit 7e2edc8
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions uniteai/llm_server.py
@@ -201,14 +201,14 @@ def f(input_ids: torch.LongTensor,
 
     stream = model(
         request.text,
-        max_tokens=128,
+        max_tokens=200,
         stream=True,
         echo=False,  # don't echo the prompt back as part of the output
         stopping_criteria=stopping_criteria,
     )
 
     for output in stream:
-        if output['choices'][0]['finish_reason'] == 'stop':
+        if output['choices'][0]['finish_reason'] in {'stop', 'length'}:
             streamer.put(None)
         else:
             streamer.put(output['choices'][0]['text'])
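
For context: in llama-cpp-python's streaming API, `finish_reason` is `None` on intermediate chunks, `'stop'` when a stop token fires, and `'length'` when `max_tokens` is exhausted. The old check only handled `'stop'`, so a stream cut off by the token limit never pushed the `None` sentinel and the consumer hung. A minimal standalone sketch of the corrected loop (the model path and prompt are placeholders; `streamer` and `request` from the patched file are replaced here with a plain print loop):

```python
# Minimal sketch, assuming the llama-cpp-python API: streaming calls
# yield dicts shaped like {'choices': [{'text': ..., 'finish_reason': ...}]}.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder path

stream = llm(
    "Write a haiku about text editors.",
    max_tokens=200,
    stream=True,
    echo=False,  # don't echo the prompt back as part of the output
)

for output in stream:
    choice = output['choices'][0]
    # Both terminal values must end the stream: 'stop' for a stop token,
    # 'length' for hitting max_tokens. Checking only 'stop' leaks the
    # 'length' case and the loop's consumer waits forever.
    if choice['finish_reason'] in {'stop', 'length'}:
        break
    print(choice['text'], end='', flush=True)
```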
