server : add option to time limit the generation phase #9865

ggerganov · 2024-10-12T12:54:12Z

The "t_max_predict_ms" parameter can be passed in completion and infill requests to limit the duration of the generation phase, in milliseconds. The server stops generating new tokens once the specified time has been exceeded and a new-line character has already been generated. Both conditions must hold, so a response can run past the limit until its first new line appears.

curl \
    --request POST --url http://127.0.0.1:8013/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "int main(int argc, char", "n_predict": 999, "stream": true, "t_max_predict_ms": 500}'
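
The stop rule described above can be sketched in a few lines of Python. This is only an illustration of the two conditions from the description, not the server's actual code: `build_completion_payload` and `should_stop` are hypothetical helper names, and treating a non-positive limit as "disabled" is an assumption.

```python
import json


def build_completion_payload(prompt: str, n_predict: int, t_max_predict_ms: int) -> str:
    """Serialize the same JSON body that the curl example sends (hypothetical helper)."""
    return json.dumps({
        "prompt": prompt,
        "n_predict": n_predict,
        "stream": True,
        "t_max_predict_ms": t_max_predict_ms,
    })


def should_stop(elapsed_ms: float, t_max_predict_ms: int, has_new_line: bool) -> bool:
    """Stop rule as described in this PR: limit exceeded AND a new line already generated.

    Assumption: a non-positive limit means the time limit is disabled.
    """
    if t_max_predict_ms <= 0:
        return False
    return elapsed_ms > t_max_predict_ms and has_new_line


if __name__ == "__main__":
    print(build_completion_payload("int main(int argc, char", 999, 500))
    # Over the limit but no new line yet: generation continues.
    print(should_stop(600, 500, has_new_line=False))
    # Over the limit and a new line has been generated: generation stops.
    print(should_stop(600, 500, has_new_line=True))
```

Requiring a new line before stopping means a short completion, such as a single line of infill, is not cut off mid-line even when the time budget is small.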

ggml-ci

server : add option to time limit the generation phase

a34cde9

ggml-ci

github-actions bot added examples server labels Oct 12, 2024

ggerganov mentioned this pull request Oct 12, 2024

changelog : llama-server REST API #9291

Open

ggerganov merged commit edc2656 into master Oct 12, 2024
57 checks passed

ggerganov deleted the gg/server-time-limits branch October 12, 2024 13:14

drollings pushed a commit to drollings/llama.cpp that referenced this pull request Oct 18, 2024

server : add option to time limit the generation phase (ggerganov#9865)

3fd4de0

ggml-ci

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024

server : add option to time limit the generation phase (ggerganov#9865)

7d70e29

ggml-ci

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

server : add option to time limit the generation phase (ggerganov#9865)

9f62c60

ggml-ci

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

server : add option to time limit the generation phase (ggerganov#9865)

6722600

ggml-ci
