[Feature] Apply structured output sampling after reasoning steps in Reasoning models #4055

@xihuai18

Description

Checklist

Motivation

Apply constrained sampling only to the final answer of a reasoning model, i.e. for DeepSeek R1, enforce the grammar only after `</think>` and leave the reasoning tokens unconstrained.
This would make reasoning models more useful in agent workflows that expect structured output.
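
To make the request concrete, here is a minimal sketch of the gating in plain Python. It is not any serving engine's actual API: `gated_mask`, `toy_grammar`, and the token id `THINK_END_ID` are all hypothetical names chosen for illustration, and the "grammar" is a stand-in that just returns a set of allowed token ids.

```python
import math
from typing import Callable, List, Sequence, Set

# Hypothetical token id standing in for "</think>"; a real tokenizer assigns its own id.
THINK_END_ID = 2


def gated_mask(
    generated: Sequence[int],
    logits: List[float],
    allowed_after_think: Callable[[Sequence[int]], Set[int]],
    think_end_id: int = THINK_END_ID,
) -> List[float]:
    """Leave logits untouched while reasoning; constrain them once </think> was emitted."""
    if think_end_id not in generated:
        # Still inside the reasoning section: no grammar is applied.
        return list(logits)
    # Only tokens after the first </think> are treated as the answer.
    start = list(generated).index(think_end_id) + 1
    answer = list(generated)[start:]
    allowed = allowed_after_think(answer)
    # Disallowed tokens get -inf so they can never be sampled.
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def toy_grammar(answer: Sequence[int]) -> Set[int]:
    """Hypothetical grammar: the answer must be token 3 followed by token 4."""
    expected = [3, 4]
    if len(answer) < len(expected):
        return {expected[len(answer)]}
    return set()


logits = [0.0] * 5
during_reasoning = gated_mask([0, 1], logits, toy_grammar)    # before </think>
after_reasoning = gated_mask([0, 1, 2], logits, toy_grammar)  # right after </think>
```

In this sketch `during_reasoning` equals the input logits, while `after_reasoning` keeps only token 3 finite. A real implementation would also need to handle requests where thinking is disabled or where `</think>` spans several tokens, which this sketch ignores.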

Related resources

vllm-project/vllm#12619
vllm-project/vllm#12955
