[Model] NemotronH Support #22349

danielafrimi · 2025-08-06T09:29:37Z

Heterogeneous FFN support
Calculate head_dim - Changed from config.expand * config.hidden_size to config.mamba_num_heads * config.mamba_head_dim
Added support for an explicit head_dim configuration parameter - Falls back to the computed hidden_size // total_num_heads when head_dim is not specified (the model does not force head_dim = hidden_size // total_num_heads)
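
The two config changes described above can be illustrated with a minimal sketch. This is not the code from the PR: the helper names resolve_head_dim and mamba_dim_from_heads are hypothetical, and the config object is a plain stand-in that carries only the attribute names mentioned in the description.

```python
from types import SimpleNamespace


def resolve_head_dim(config, total_num_heads: int) -> int:
    """Prefer an explicit head_dim; otherwise fall back to the computed value.

    Hypothetical helper, named here for illustration only.
    """
    head_dim = getattr(config, "head_dim", None)
    if head_dim is None:
        # Fallback described in the PR: hidden_size // total_num_heads.
        head_dim = config.hidden_size // total_num_heads
    return head_dim


def mamba_dim_from_heads(config) -> int:
    """Product the description switches to, replacing expand * hidden_size.

    Hypothetical helper, named here for illustration only.
    """
    return config.mamba_num_heads * config.mamba_head_dim


# Stand-in config objects (values are made up for the example).
implicit = SimpleNamespace(hidden_size=4096, mamba_num_heads=8, mamba_head_dim=64)
explicit = SimpleNamespace(hidden_size=4096, head_dim=64)

print(resolve_head_dim(implicit, total_num_heads=32))  # computed fallback
print(resolve_head_dim(explicit, total_num_heads=32))  # explicit value wins
print(mamba_dim_from_heads(implicit))
```

Checking for None rather than truthiness keeps the fallback limited to the "not specified" case named in the description.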

gemini-code-assist

Code Review

This pull request adds support for heterogeneous FFN and explicit head_dim for Nemotron-H models. The changes are well-aligned with the description and seem correct. However, I've identified one critical issue where a parameter with a None default value is used in an operation that would cause a TypeError, which could lead to a runtime crash. I've provided a suggestion to make this parameter required, which should resolve the issue.

vllm/model_executor/models/nemotron_h.py

honghanhh · 2025-08-06T09:41:30Z

vllm/model_executor/models/nemotron_h.py

Since layer_idx can be None, layer_idx + 1 will fail in that case. Adding a guard for layer_idx and ensuring hybrid_override_pattern exists before using them would help.

Thanks!
Changed it so that layer_idx can no longer be None.
For config.hybrid_override_pattern, we assume that all NemotronH models will have this configuration.
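
The concern in this thread can be sketched as follows. This is an illustrative example rather than the merged code. It assumes that hybrid_override_pattern is a string with one character per layer, that "-" marks an MLP layer, and that a heterogeneous FFN is expressed as a list of per-MLP-layer intermediate sizes. The function names are hypothetical.

```python
from typing import List, Union


def mlp_index_for_layer(pattern: str, layer_idx: int, mlp_symbol: str = "-") -> int:
    """Return how many MLP layers precede this one in the pattern.

    layer_idx is a required int: None + 1 would raise a TypeError,
    which is the failure raised in the review.
    """
    if layer_idx is None:
        raise ValueError("layer_idx is required")
    if not 0 <= layer_idx < len(pattern):
        raise IndexError(f"layer_idx {layer_idx} is outside the pattern")
    if pattern[layer_idx] != mlp_symbol:
        raise ValueError(f"layer {layer_idx} is not an MLP layer")
    # Count MLP symbols up to and including this layer, then make it 0-based.
    return pattern[: layer_idx + 1].count(mlp_symbol) - 1


def pick_intermediate_size(intermediate_size: Union[int, List[int]], mlp_index: int) -> int:
    """Select the FFN width for one MLP layer (assumed list-or-int layout)."""
    if isinstance(intermediate_size, list):
        if len(intermediate_size) == 1:
            return intermediate_size[0]
        return intermediate_size[mlp_index]
    return intermediate_size


pattern = "M-M*-"  # made-up pattern: layers 1 and 4 are MLP layers
idx = mlp_index_for_layer(pattern, layer_idx=4)
print(idx, pick_intermediate_size([1024, 2048], idx))
```

Making layer_idx a required argument, as the author's reply describes, removes the need for the explicit None check. It is kept here only to show the failure mode.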

github-actions · 2025-08-06T09:56:37Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

danielafrimi · 2025-08-10T14:34:18Z

CI fails on a test that is listed in the CI Failures Dashboard; the failure is not caused by my changes.

DarkLight1337 · 2025-08-10T15:44:44Z

Can you merge from main to fix Hybrid Models test? ~~Also add a test for your model in there~~

DarkLight1337

Thanks for supporting this model in vLLM!

gournd · 2025-08-11T12:07:03Z

Using vllm-ascend and DeepSeek-V3-w8a8:
(VllmWorker rank=2 pid=55990) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=3 pid=56719) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=0 pid=55435) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=3 pid=56785) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=0 pid=55436) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=1 pid=55699) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=1 pid=55700) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!
(VllmWorker rank=2 pid=55990) WARNING 08-11 11:50:32 [config.py:468] MoE DP setup unable to determine quantization scheme or unsupported quantization type. This model will not run with DP enabled.
(VllmWorker rank=3 pid=56719) WARNING 08-11 11:50:32 [config.py:468] MoE DP setup unable to determine quantization scheme or unsupported quantization type. This model will not run with DP enabled.
(VllmWorker rank=0 pid=55435) WARNING 08-11 11:50:32 [config.py:468] MoE DP setup unable to determine quantization scheme or unsupported quantization type. This model will not run with DP enabled.
(VllmWorker rank=2 pid=55989) INFO 08-11 11:50:32 [quantizer.py:85] Using the vLLM Ascend Quantizer version now!

DarkLight1337 · 2025-08-11T12:16:06Z

Looks like you commented on the wrong PR

gemini-code-assist bot reviewed Aug 6, 2025

honghanhh suggested changes Aug 6, 2025

danielafrimi force-pushed the nemotron_nano branch from fa734a4 to 18cdba1 Compare August 10, 2025 11:47

DarkLight1337 added the ready label Aug 10, 2025

DarkLight1337 requested a review from tlrmchlsmth August 10, 2025 14:43

nemotron changes

7d73434

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
layer_idx cant be None
Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
fix head_dim in config
Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

danielafrimi force-pushed the nemotron_nano branch from 18cdba1 to 7d73434 Compare August 11, 2025 08:37

DarkLight1337 approved these changes Aug 11, 2025

DarkLight1337 enabled auto-merge (squash) August 11, 2025 08:41

vllm-bot merged commit 14a5d90 into vllm-project:main Aug 11, 2025
38 of 44 checks passed

paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025

[Model] NemotronH Support (vllm-project#22349)

36fadfa

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com> Signed-off-by: Paul Pak <paulpak58@gmail.com>

diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025

[Model] NemotronH Support (vllm-project#22349)

e26552c

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com> Signed-off-by: Diego-Castan <diego.castan@ibm.com>

yiliu30 pushed a commit to yiliu30/vllm-fork that referenced this pull request Aug 19, 2025

[Model] NemotronH Support (vllm-project#22349)

2519649

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025

[Model] NemotronH Support (vllm-project#22349)

391c537

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>

xiao-llm pushed a commit to xiao-llm/vllm that referenced this pull request Aug 28, 2025

[Model] NemotronH Support (vllm-project#22349)

31b0db7

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025

[Model] NemotronH Support (vllm-project#22349)

0b181d2

Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
