- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 10.9k
[Frontend] Expose custom args in OpenAI APIs #16862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Frontend] Expose custom args in OpenAI APIs #16862
Conversation
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
| 👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run  Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add  🚀 | 
| Thanks @afeldman-nm! It would be good to include a test that shows how these can be passed via the OpenAI client sdk using its  I'm unsure whether we want these new custom args to be in a nested json object (as you've done here) or just extra top-level args. | 
| Thanks @njhill . Agree regarding the unit test. I need to think a bit about the right way to do it | 
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
| Thanks for your review @comaniac . After chatting with Cody, I think this interface change is sufficiently impactful to merit an RFC which I will write and share shortly. | 
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
| 
 Hi @helloworld1 - working on getting this PR landed as-is | 
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @afeldman-nm LGTM, just a couple of minor comments.
Co-authored-by: Nick Hill <nhill@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @afeldman-nm
Add a
vllm_xargs: Optional[dict[str, Union[str,int,float]]]field toCompletionRequest,ChatCompletionRequestandTranscriptionRequest(these are the only OpenAIBaseModel subclasses which had alogits_processorsfield in v0.) This field is injected intoSamplingParams.extra_argsviaSamplingParams.from_optional(); each dict key/value pair inextra_argsbecomes an assignment to an attribute ofsampling_params.Purpose
Enable extensible features such as logits processors and plugins to receive arbitrary custom arguments via the REST API. Mirror
SamplingParams.extra_argsin the REST API.Test plan
Does not require additional unit tests (when logitsprocs extensibility is introduced later, this will implicitly test custom args). Pre-existing unit tests must pass so we know pre-existing features are not being broken.
Test results
N/A
Documentation changes
SamplingParamsdocstring clarifies thatextra_argsmay plumb custom args to logitsprocs, plugins, etc. (previous it just said logitsprocs)vllm/entrypoints/openai/protocol.py: forCompletionRequest,ChatCompletionRequest, andTranscriptionRequest, movevllm_xargsdefinition inside the# --8<-- [start:completion-extra-params]section and add more detail to the description string.Final note
The pre-existing behavior of
protocol.pyChatCompletionRequestandCompletionRequestis thatkv_transfer_paramsis passed into the engine viaSamplingParams.extra_args; this PR simply mergesvllm_xargsintoSamplingParams.extra_argsalongsidekv_transfer_params. In the future it may be worth considering whetherSamplingParams.extra_argsis the best pathway for plumbingkv_transfer_paramsinto the engine; it would seem to break the convention thatSamplingParams.extra_argsis not intended for "in-tree" functionality.RFC: #17191
Fixes #16802