Commit 98060b0

[Feature][Frontend]: Deprecate --enable-reasoning (#17452)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
1 parent f5a3c65 commit 98060b0

16 files changed: +49, -91 lines

docs/source/features/reasoning_outputs.md

Lines changed: 5 additions & 8 deletions
@@ -21,11 +21,10 @@ vLLM currently supports the following reasoning models:
 
 ## Quickstart
 
-To use reasoning models, you need to specify the `--enable-reasoning` and `--reasoning-parser` flags when making a request to the chat completion endpoint. The `--reasoning-parser` flag specifies the reasoning parser to use for extracting reasoning content from the model output.
+To use reasoning models, you need to specify the `--reasoning-parser` flag when making a request to the chat completion endpoint. The `--reasoning-parser` flag specifies the reasoning parser to use for extracting reasoning content from the model output.
 
 ```bash
-vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
 ```
 
 Next, make a request to the model that should return the reasoning content in the response.
@@ -140,8 +139,7 @@ Remember to check whether the `reasoning_content` exists in the response before
 The reasoning content is also available in the structured output. The structured output engine like `xgrammar` will use the reasoning content to generate structured output. It is only supported in v0 engine now.
 
 ```bash
-VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
 ```
 
 Please note that the `VLLM_USE_V1` environment variable must be set to `0` to use the v0 engine.
@@ -316,9 +314,8 @@ class DeepSeekReasoner(Reasoner):
 
 The structured output engine like `xgrammar` will use `end_token_id` to check if the reasoning content is present in the model output and skip the structured output if it is the case.
 
-Finally, you can enable reasoning for the model by using the `--enable-reasoning` and `--reasoning-parser` flags.
+Finally, you can enable reasoning for the model by using the `--reasoning-parser` flag.
 
 ```bash
-vllm serve <model_tag> \
-    --enable-reasoning --reasoning-parser example
+vllm serve <model_tag> --reasoning-parser example
 ```
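
With this change, the quickstart reduces to one serve flag plus a normal chat request. A minimal client-side sketch (assuming the `openai` Python package and a server started as in the updated docs; `reasoning_content` is the field the configured parser populates):

```python
# Minimal sketch: read parsed reasoning from a vLLM server started with
#   vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
)

message = completion.choices[0].message
# reasoning_content can be absent if the parser found no reasoning section,
# so check before using it (as the docs above advise).
print("reasoning:", getattr(message, "reasoning_content", None))
print("answer:", message.content)
```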

examples/online_serving/openai_chat_completion_structured_outputs_with_reasoning.py

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 
 ```bash
 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+    --reasoning-parser deepseek_r1
 ```
 
 This example demonstrates how to generate chat completions from reasoning models
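
The script edited here combines reasoning with guided decoding. A hedged sketch of that combination, following the same pattern as the repo's example (`guided_json` is vLLM's OpenAI-API extension for schema-constrained output; the schema itself is illustrative):

```python
# Sketch: reasoning plus structured output, against a server started with
#   VLLM_USE_V1=0 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Illustrative schema: constrain the final answer to a small JSON object.
schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
}

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
    extra_body={"guided_json": schema},  # vLLM-specific request extension
)

message = completion.choices[0].message
print("reasoning:", getattr(message, "reasoning_content", None))  # free-form
print("answer:", message.content)  # JSON conforming to the schema
```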

examples/online_serving/openai_chat_completion_tool_calls_with_reasoning.py

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 
 ```bash
 vllm serve Qwen/QwQ-32B \
-    --enable-reasoning --reasoning-parser deepseek_r1 \
+    --reasoning-parser deepseek_r1 \
     --enable-auto-tool-choice --tool-call-parser hermes
 
 ```

examples/online_serving/openai_chat_completion_with_reasoning.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 
 ```bash
 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+    --reasoning-parser deepseek_r1
 ```
 
 This example demonstrates how to generate chat completions from reasoning models

examples/online_serving/openai_chat_completion_with_reasoning_streaming.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 
 ```bash
 vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
-    --enable-reasoning --reasoning-parser deepseek_r1
+    --reasoning-parser deepseek_r1
 ```
 
 Unlike openai_chat_completion_with_reasoning.py, this example demonstrates the

tests/entrypoints/openai/test_chat_with_tool_reasoning.py

Lines changed: 3 additions & 3 deletions
@@ -13,9 +13,9 @@
 @pytest.fixture(scope="module")
 def server(): # noqa: F811
     args = [
-        "--max-model-len", "8192", "--enforce-eager", "--enable-reasoning",
-        "--reasoning-parser", "deepseek_r1", "--enable-auto-tool-choice",
-        "--tool-call-parser", "hermes"
+        "--max-model-len", "8192", "--enforce-eager", "--reasoning-parser",
+        "deepseek_r1", "--enable-auto-tool-choice", "--tool-call-parser",
+        "hermes"
     ]
 
     with RemoteOpenAIServer(MODEL_NAME, args) as remote_server:

tests/entrypoints/openai/test_cli_args.py

Lines changed: 3 additions & 11 deletions
@@ -122,31 +122,23 @@ def test_enable_auto_choice_fails_with_enable_reasoning(serve_parser):
     """Ensure validation fails if reasoning is enabled with auto tool choice"""
     args = serve_parser.parse_args(args=[
         "--enable-auto-tool-choice",
-        "--enable-reasoning",
+        "--reasoning-parser",
+        "deepseek_r1",
     ])
     with pytest.raises(TypeError):
         validate_parsed_serve_args(args)
 
 
-def test_enable_reasoning_passes_with_reasoning_parser(serve_parser):
+def test_passes_with_reasoning_parser(serve_parser):
     """Ensure validation passes if reasoning is enabled
     with a reasoning parser"""
     args = serve_parser.parse_args(args=[
-        "--enable-reasoning",
         "--reasoning-parser",
         "deepseek_r1",
     ])
     validate_parsed_serve_args(args)
 
 
-def test_enable_reasoning_fails_without_reasoning_parser(serve_parser):
-    """Ensure validation fails if reasoning is enabled
-    without a reasoning parser"""
-    args = serve_parser.parse_args(args=["--enable-reasoning"])
-    with pytest.raises(TypeError):
-        validate_parsed_serve_args(args)
-
-
 def test_chat_template_validation_for_happy_paths(serve_parser):
     """Ensure validation passes if the chat template exists"""
     args = serve_parser.parse_args(

vllm/config.py

Lines changed: 2 additions & 3 deletions
@@ -3225,10 +3225,9 @@ def guided_decoding_backend(self, value: GuidedDecodingBackend):
     in the JSON schema. This is only supported for the `guidance` backend and
     is used to better align its behaviour with `outlines` and `xgrammar`."""
 
-    reasoning_backend: Optional[str] = None
+    reasoning_backend: str = ""
     """Select the reasoning parser depending on the model that you're using.
-    This is used to parse the reasoning content into OpenAI API format.
-    Required for `--enable-reasoning`."""
+    This is used to parse the reasoning content into OpenAI API format."""
 
     def compute_hash(self) -> str:
         """

vllm/engine/arg_utils.py

Lines changed: 12 additions & 5 deletions
@@ -365,8 +365,9 @@ class EngineArgs:
     calculate_kv_scales: bool = CacheConfig.calculate_kv_scales
 
     additional_config: Optional[Dict[str, Any]] = None
-    enable_reasoning: Optional[bool] = None
-    reasoning_parser: Optional[str] = DecodingConfig.reasoning_backend
+    enable_reasoning: Optional[bool] = None # DEPRECATED
+    reasoning_parser: str = DecodingConfig.reasoning_backend
+
     use_tqdm_on_load: bool = LoadConfig.use_tqdm_on_load
 
     def __post_init__(self):
@@ -798,8 +799,15 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
             "--enable-reasoning",
             action="store_true",
             default=False,
-            help="Whether to enable reasoning_content for the model. "
-            "If enabled, the model will be able to generate reasoning content."
+            help=
+            "[DEPRECATED] " \
+            "The --enable-reasoning flag is deprecated as of v0.8.6. "
+            "Use --reasoning-parser to specify " \
+            "the reasoning parser backend instead. "
+            "This flag (--enable-reasoning) will be " \
+            "removed in v0.10.0. "
+            "When --reasoning-parser is specified, " \
+            "reasoning mode is automatically enabled."
         )
 
         return parser
@@ -1088,7 +1096,6 @@ def create_engine_config(
            disable_additional_properties=\
               self.guided_decoding_disable_additional_properties,
            reasoning_backend=self.reasoning_parser
-           if self.enable_reasoning else None,
        )
 
        observability_config = ObservabilityConfig(
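
The net effect of these arg_utils.py hunks: `--enable-reasoning` still parses (with a deprecation notice in its help text), while `reasoning_backend` is now driven by `--reasoning-parser` alone. A standalone sketch of that deprecation pattern, illustrative rather than vLLM's actual wiring:

```python
# Illustrative only: keep a deprecated boolean flag alive while the
# replacement option becomes the single source of truth.
import argparse
import warnings

parser = argparse.ArgumentParser()
parser.add_argument("--enable-reasoning", action="store_true", default=False,
                    help="[DEPRECATED] Use --reasoning-parser instead.")
parser.add_argument("--reasoning-parser", type=str, default="",
                    help="Reasoning parser backend; setting it enables reasoning.")

args = parser.parse_args(["--enable-reasoning", "--reasoning-parser", "deepseek_r1"])

if args.enable_reasoning:
    warnings.warn("--enable-reasoning is deprecated; passing --reasoning-parser "
                  "alone now enables reasoning.", DeprecationWarning, stacklevel=2)

# Reasoning is on iff a parser is configured; the old flag no longer gates it.
reasoning_enabled = bool(args.reasoning_parser)
print(reasoning_enabled)  # True
```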

vllm/engine/llm_engine.py

Lines changed: 1 addition & 1 deletion
@@ -2096,7 +2096,7 @@ def _build_logits_processors(
         guided_decoding.backend = guided_decoding.backend or \
             self.decoding_config.backend
 
-        if self.decoding_config.reasoning_backend is not None:
+        if self.decoding_config.reasoning_backend:
             logger.debug("Building with reasoning backend %s",
                          self.decoding_config.reasoning_backend)
 