server : add completion mode (no chat) #3582

akx · 2023-10-11T13:03:36Z

This adds a simple completion-only mode to the server UI.

ggerganov

Very useful!

…example

* 'master' of github.com:ggerganov/llama.cpp: (34 commits)
  examples: support LLaVA v1.5 (multimodal model) (ggml-org#3436)
  docs : fix typo GOMP_CPU_AFFINITY (ggml-org#3597)
  cmake : fix add_compile_options on macOS
  typo : it is `--n-gpu-layers` not `--gpu-layers` (ggml-org#3592)
  ci : check if there is enough VRAM (ggml-org#3596)
  server : add completion mode (no chat) (ggml-org#3582)
  prompts : add mnemonics.txt
  server : fix kv cache management (ggml-org#3588)
  main : fix session loading bug (ggml-org#3400)
  server : add parameter -tb N, --threads-batch N (ggml-org#3584)
  common : fix mirostat state when using multiple sequences (ggml-org#3543)
  batched : add bench tool (ggml-org#3545)
  examples : add batched.swift + improve CI for swift (ggml-org#3562)
  Add MPT model to supported models in README.md (ggml-org#3574)
  Minor improvements in GPT2 tokenizer (ggml-org#3567)
  readme : add bloom (ggml-org#3570)
  llm : add bloom models (ggml-org#3553)
  swift : improvements and fixes (ggml-org#3564)
  llm : add MPT support (ggml-org#3417)
  infill. : fix tokenization (ggml-org#3508)
  ...

server : add completion mode (no chat)

87a0361

ggerganov added the "need feedback" label (testing and feedback with results are needed) on Oct 11, 2023

ggerganov approved these changes Oct 12, 2023

ggerganov merged commit b016596 into ggml-org:master Oct 12, 2023
