[Model][Ouro] Support Ouro Model #27794

FlamingoPg · 2025-10-30T06:09:59Z

Purpose

This PR adds inference support for the Ouro model.

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

github-actions · 2025-10-30T06:10:09Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

gemini-code-assist

Code Review

This PR adds support for the Ouro model. The implementation is largely adapted from the Qwen2 model. However, there are critical issues with the implementation of pipeline parallelism, which is currently broken. Additionally, a key feature of the Ouro model, early exiting, is initialized but not implemented in the forward pass. These issues need to be addressed to ensure correctness and full feature support.

vllm/model_executor/models/ouro.py

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

vllm/model_executor/models/ouro.py

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

jeejeelee · 2025-10-30T09:10:15Z

vllm/model_executor/models/ouro.py

+            ["hidden_states", "residual"], config.hidden_size
+        )
+        self.norm = RMSNorm(config.hidden_size, eps=config.rms_norm_eps)
+        self.early_exit_gate = RowParallelLinear(config.hidden_size, 1, bias=True)


~~It looks like this linear is unused~~
Maybe this PR is WIP

True, we do not use early exit in vLLM. It's hard to handle the KV cache here.
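
The reply above does not spell out why early exit conflicts with the KV cache. One plausible reading, offered here only as an assumption, is that a token which leaves the recurrent loop early never writes key/value entries for the remaining steps, so later tokens attending at those steps find gaps. The toy sketch below illustrates that bookkeeping problem in plain Python. It is not the vLLM or Ouro code; `TOTAL_UT_STEPS`, `run_token`, and `missing_entries` are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Set

# Assumed number of recurrent passes over the shared layers (illustrative value).
TOTAL_UT_STEPS = 4


def run_token(token_id: int, exit_step: Optional[int],
              kv_cache: Dict[int, Set[int]]) -> None:
    """Record one KV entry per recurrent step that the token actually executes."""
    last_step = TOTAL_UT_STEPS if exit_step is None else exit_step
    for step in range(last_step):
        kv_cache.setdefault(step, set()).add(token_id)


def missing_entries(kv_cache: Dict[int, Set[int]],
                    num_tokens: int) -> Dict[int, List[int]]:
    """For each step, list the token positions that have no cached KV."""
    all_tokens = set(range(num_tokens))
    return {
        step: sorted(all_tokens - kv_cache.get(step, set()))
        for step in range(TOTAL_UT_STEPS)
    }


# Fixed depth: every token runs every step, so the cache is dense.
fixed: Dict[int, Set[int]] = {}
for tok in range(3):
    run_token(tok, exit_step=None, kv_cache=fixed)

# Early exit: token 1 stops after 2 steps, leaving holes at steps 2 and 3.
early: Dict[int, Set[int]] = {}
run_token(0, None, early)
run_token(1, 2, early)
run_token(2, None, early)

fixed_holes = missing_entries(fixed, 3)
early_holes = missing_entries(early, 3)
```

Under this reading, always running the full number of steps keeps the cache layout uniform across tokens, which matches the choice described in the reply to skip early exit.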

vllm/model_executor/models/ouro.py

jeejeelee · 2025-10-30T09:19:24Z

Don't forget to update the model doc and model test

FlamingoPg · 2025-10-30T09:45:33Z

Don't forget to update the model doc and model test

Thanks for review!

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

mergify · 2025-10-30T10:05:34Z

Documentation preview: https://vllm--27794.org.readthedocs.build/en/27794/

docs/models/supported_models.md

Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: youkaichao <youkaichao@gmail.com>

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: yinfan.1024 <yinfan.1024@bytedance.com> Co-authored-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: yinfan.1024 <yinfan.1024@bytedance.com> Co-authored-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>

mergify bot added the new-model Requests to new models label Oct 30, 2025

gemini-code-assist bot reviewed Oct 30, 2025

View reviewed changes

vllm/model_executor/models/ouro.py Show resolved Hide resolved

vllm/model_executor/models/ouro.py Show resolved Hide resolved

FlamingoPg force-pushed the ouro branch from c3127cb to ca76cd3 Compare October 30, 2025 06:28

FlamingoPg and others added 5 commits October 30, 2025 14:30

init ouro model

2175622

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

Delete ouro_test directory

878db0c

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

fix lint

d428a78

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

del pp

3c13a80

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

del useless

1af252c

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

FlamingoPg force-pushed the ouro branch from ca76cd3 to 1af252c Compare October 30, 2025 06:30

youkaichao reviewed Oct 30, 2025

View reviewed changes

vllm/model_executor/models/ouro.py Outdated Show resolved Hide resolved

upd total_ut_step define

1b0687f

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

jeejeelee reviewed Oct 30, 2025

View reviewed changes

vllm/model_executor/models/ouro.py Outdated Show resolved Hide resolved

jeejeelee reviewed Oct 30, 2025

View reviewed changes

vllm/model_executor/models/ouro.py Outdated Show resolved Hide resolved

upd model doc & model test

90e9700

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

FlamingoPg requested review from DarkLight1337 and ywang96 as code owners October 30, 2025 10:04

mergify bot added the documentation Improvements or additions to documentation label Oct 30, 2025

Merge branch 'main' into ouro

257bdd5

jeejeelee reviewed Oct 30, 2025

View reviewed changes

docs/models/supported_models.md Outdated Show resolved Hide resolved

youkaichao and others added 3 commits October 30, 2025 19:05

Update docs/models/supported_models.md

f3751fc

Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: youkaichao <youkaichao@gmail.com>

fix

635324b

Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>

Merge branch 'main' into ouro

5c47ffb

jeejeelee approved these changes Oct 30, 2025

View reviewed changes

youkaichao merged commit 9956aae into vllm-project:main Oct 30, 2025
4 of 5 checks passed

Kay-Tian mentioned this pull request Oct 30, 2025

vLLM PR #27794 core file change alert Kay-Tian/vllm#64

Open

3 participants
