Skip to content

Conversation

@njhill
Copy link
Member

@njhill njhill commented Oct 13, 2025

Simplify some data structures, redundant fields, etc.

Things noticed while reading through the code to understand it.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several beneficial refactorings to streamline the structured output related code. Key improvements include simplifying the structured_output_request_ids data structure from a dictionary to a list, making Request.use_structured_output a derived property for better state management, and improving the encapsulation of StructuredOutputRequest. These changes enhance code clarity and maintainability. I've identified one issue in gpu_model_runner.py where a conditional check is always true, which should be corrected.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Signed-off-by: Nick Hill <nhill@redhat.com>
@njhill njhill force-pushed the streamline-struct-output branch from fbab8e3 to ec7ceda Compare October 14, 2025 01:28
Signed-off-by: Nick Hill <nhill@redhat.com>
@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 14, 2025
Signed-off-by: Nick Hill <nhill@redhat.com>
@mergify mergify bot added the kv-connector label Oct 14, 2025
Copy link
Member

@russellb russellb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one minor comment suggestion, otherwise lgtm. thanks!

# Dict of request ids to their index within the batch
# for filling the next token bitmask
structured_output_request_ids: dict[str, int]
# ids of structured outputs requests included in the bitmask, in order.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# ids of structured outputs requests included in the bitmask, in order.
# ids of structured outputs requests included in the bitmask.
# The index of the id in this list corresponds to the index into
# grammar_bitmask.

is this right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually not necessarily in spec decoding case. The requests will be in the order of the list but there may be more than one row per request so the index in the list won't match index into the bitmask. It would be true in non-spec decode case though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right -- clarifying the order definition here would be nice in any case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do this in follow-on PR to avoid running the whole CI again!

@njhill njhill enabled auto-merge (squash) October 14, 2025 22:45
@njhill njhill merged commit 4aed506 into vllm-project:main Oct 14, 2025
47 of 48 checks passed
@njhill njhill deleted the streamline-struct-output branch October 14, 2025 23:27
Jonahcb pushed a commit to Jonahcb/vllm that referenced this pull request Oct 15, 2025
…6737)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Jonah Bernard <jb2528@cornell.edu>
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
…6737)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: bbartels <benjamin@bartels.dev>
njhill added a commit to njhill/vllm that referenced this pull request Oct 17, 2025
This is a follow-on to PRs vllm-project#26737 and vllm-project#26961 to add some clarifying comments that were suggested in review.

Signed-off-by: Nick Hill <nhill@redhat.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
wangxiyuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Oct 24, 2025
### What this PR does / why we need it?
This is the step 1 of refactoring code to adapt with vllm main, and this
pr aligned with
vllm-project/vllm@17c540a

1. refactor deepseek to the latest code arch as of
vllm-project/vllm@17c540a
 
2. bunches of fixes due to vllm changes
- Fix `AscendScheduler` `__post_init__`, caused by
vllm-project/vllm#25075
- Fix `AscendScheduler` init got an unexpected arg `block_size`, caused
by vllm-project/vllm#26296
- Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by
vllm-project/vllm#23485
- Fix `MLAAttention` import,caused by
vllm-project/vllm#25103
- Fix `SharedFusedMoE` import, caused by
vllm-project/vllm#26145
- Fix `LazyLoader` improt, caused by
vllm-project/vllm#27022
- Fix `vllm.utils.swap_dict_values` improt, caused by
vllm-project/vllm#26990
- Fix `Backend` enum import, caused by
vllm-project/vllm#25893
- Fix `CompilationLevel` renaming to `CompilationMode` issue introduced
by vllm-project/vllm#26355
- Fix fused_moe ops, caused by
vllm-project/vllm#24097
- Fix bert model because of `inputs_embeds`, caused by
vllm-project/vllm#25922
- Fix MRope because of `get_input_positions_tensor` to
`get_mrope_input_positions`, caused by
vllm-project/vllm#24172
- Fix `splitting_ops` changes introduced by
vllm-project/vllm#25845
- Fix multi-modality changes introduced by
vllm-project/vllm#16229
- Fix lora bias dropping issue introduced by
vllm-project/vllm#25807
- Fix structured ouput break introduced by
vllm-project/vllm#26737

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…6737)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…6737)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…6737)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…6737)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kv-connector ready ONLY add when PR is ready to merge/full CI is needed structured-output suppress-bc-linter tpu Related to Google TPUs v1

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants