Skip to content

Conversation

@MatthewBonanni
Copy link
Contributor

@MatthewBonanni MatthewBonanni commented Sep 25, 2025

Purpose

Remove _is_v1_supported_oracle and corresponding test

Test Plan

CI should suffice

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify mergify bot added the ci/build label Sep 25, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the _is_v1_supported_oracle function and its usage, which handled automatic fallbacks to the V0 engine. With this change, the V1 engine becomes the default unless explicitly disabled with VLLM_USE_V1=0. This aligns with the goal of deprecating V0. The changes are clean, and the corresponding test removal from the CI pipeline is appropriate. I find the implementation to be correct and have no further comments.

Comment on lines -1467 to -1479
# V1 supports N-gram, Medusa, and Eagle speculative decoding.
if self.speculative_config is not None:
# speculative_config could still be a dict at this point
if isinstance(self.speculative_config, dict):
method = self.speculative_config.get("method", None)
else:
method = self.speculative_config.method

if method == "draft_model":
raise NotImplementedError(
"Draft model speculative decoding is not supported yet. "
"Please consider using other speculative decoding methods "
"such as ngram, medusa, eagle, or deepseek_mtp.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I think _is_v1_supported_oracle is still useful to exclude some unsupported arguments here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could make an _is_config_supported_oracle, which doesn't dictate VLLM_USE_V1, it just throws errors if unsupported.

Alternatively, it might be better if this check happens in the speculative initialization rather than at an engine level?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could make an _is_config_supported_oracle, which doesn't dictate VLLM_USE_V1, it just throws errors if unsupported.

Agree.

Alternatively, it might be better if this check happens in the speculative initialization rather than at an engine level?

In fact, these arguments are ever supported in v0 but haven't been implemented (some of them are deprecated) in v1. So I prefer to keep them here for clarity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it's good raise NotImplementedError error instead of removing the check totally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants