Closed
Description
What would you like to be added:
We already support this for vLLM. Now that llama.cpp has added the feature, we should support it there as well; see ggml-org/llama.cpp#10455.
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
- Design doc
- API change
- Docs update
The artifacts should be linked in subsequent comments.