Your current environment
vLLM installed with:
pip install https://wheels.vllm.ai/5536b30a4c7877d75758d21bdaf39b3a59aa2dc2/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
🐛 Describe the bug
After #16789 was merged, passing options to a guided decoding backend no longer works. Attempting to include a backend option results in:
$ vllm serve meta-llama/Llama-3.2-3B-Instruct --guided-decoding-backend xgrammar:disable-any-whitespace
INFO 04-22 18:45:12 [__init__.py:239] Automatically detected platform cuda.
usage: vllm serve [model_tag] [options]
vllm serve: error: argument --guided-decoding-backend: invalid choice: 'xgrammar:disable-any-whitespace' (choose from 'auto', 'outlines', 'lm-format-enforcer', 'xgrammar')
The new type checking of the args validates the value against a Literal type containing only the backend names, which disallows any appended options. For reference, backend options are briefly documented as follows:
Additional backend-specific options can be supplied in a comma separated list following a colon after the backend name. For example "xgrammar:no-fallback" will not allow vLLM to fallback to a different backend on error.
Note that there are a few backend options that can be combined, such as guidance:disable-any-whitespace,no-fallback, so simply adding entries to the list of Literals seems untenable. I encountered this bug while writing up a PR to add another option (#15949).
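For illustration only, here is a minimal sketch (not vLLM's actual parsing code, and split_guided_decoding_backend is a hypothetical helper name) of how the documented backend[:opt1,opt2,...] syntax could be split so that only the backend name is checked against the allowed choices:

from typing import List, Tuple

ALLOWED_BACKENDS = {"auto", "outlines", "lm-format-enforcer", "xgrammar"}

def split_guided_decoding_backend(value: str) -> Tuple[str, List[str]]:
    # Split "xgrammar:disable-any-whitespace,no-fallback" into
    # ("xgrammar", ["disable-any-whitespace", "no-fallback"]).
    backend, _, opts = value.partition(":")
    options = [o for o in opts.split(",") if o]
    # Validate only the backend name; the options are passed through.
    if backend not in ALLOWED_BACKENDS:
        raise ValueError(f"unknown guided decoding backend: {backend!r}")
    return backend, options

Something along these lines would let the arg parser accept the colon/comma syntax without enumerating every option combination as a separate Literal entry.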
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.