
Conversation

@antrec (Contributor) commented Oct 7, 2025

Purpose

Add support for ModernBertForTokenClassification.
This takes inspiration from #24872, adapted to the ModernBert architecture. I did not touch the flex attention backend or anything else, because I expected the changes already made in that previous PR to make BERT work for NER would also work for ModernBert.
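
For illustration, a rough usage sketch (untested; llm.encode and outputs[0].outputs.data follow vLLM's pooling interface and may vary across versions, and the prompt is made up):

from vllm import LLM

# Rough sketch: load the NER model used in the test and pull
# per-token logits through the pooling API.
llm = LLM(model="disham993/electrical-ner-ModernBERT-base")
outputs = llm.encode(["The transformer operates at 33 kV."])
token_logits = outputs[0].outputs.data  # shape: (num_tokens, num_labels)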

Test Plan

Added a single test case in tests/models/language/pooling/test_token_classification.py that checks that the results of the huggingface and vllm runners for disham993/electrical-ner-ModernBERT-base are almost equal. I picked a more or less arbitrary model, as small as possible (there are not many ModernBert NER models available on huggingface).
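
For reference, the new case can be run with a command along these lines (exact flags are illustrative):

pytest tests/models/language/pooling/test_token_classification.py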

Test Result

The test passes.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@github-actions (bot) commented Oct 7, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run the fastcheck CI, which runs a small and essential subset of tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@mergify (bot) added the new-model (Requests to new models) label on Oct 7, 2025
@gemini-code-assist (bot) left a comment

Code Review

This pull request adds support for ModernBertForTokenClassification. The changes include the model implementation, registration, and a new test case. The implementation of ModernBertForTokenClassification is missing the application of the dropout layer, which is a correctness issue. Additionally, the new test case for this model has a bug in its result-checking loop that will cause it to fail. I've provided suggestions to fix both of these critical issues.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 66 to 69
# check logits difference
for hf_output, vllm_output in enumerate(zip(hf_outputs, vllm_outputs)):
hf_output = torch.tensor(hf_output).cpu().float()
vllm_output = torch.tensor(vllm_output).cpu().float()


P1: Token classification test compares index tensor to model output

The ModernBERT token classification test iterates with enumerate(zip(hf_outputs, vllm_outputs)), so hf_output is the loop index and vllm_output is a tuple of tensors. The subsequent conversion to tensors (torch.tensor(hf_output) / torch.tensor(vllm_output)) therefore attempts to turn an integer and a tuple of tensors into tensors, raising a TypeError before the assertion can run. This means the new test fails regardless of model correctness. The loop should iterate over zip(hf_outputs, vllm_outputs) directly so that the actual logits are compared.
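
A minimal sketch of the corrected loop, per the suggestion above:

# iterate over the paired outputs directly instead of enumerate(...)
for hf_output, vllm_output in zip(hf_outputs, vllm_outputs):
    hf_output = torch.tensor(hf_output).cpu().float()
    vllm_output = torch.tensor(vllm_output).cpu().float()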


@DarkLight1337 (Member) commented:
@noooop would be great if you could help review this!

inputs_embeds=inputs_embeds,
intermediate_tensors=intermediate_tensors,
)
hidden_states = self.drop(self.head(hidden_states))
A Collaborator commented on the diff hunk above:

Thanks for your contribution.

Clear and clean, with only a few details needing modification.

For the vLLM inference framework, there is no need to implement dropout.
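
For context, a minimal check showing why: torch.nn.Dropout is the identity in eval mode, so inference-only code can omit it.

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.1)
drop.eval()  # inference mode: dropout neither zeroes nor rescales
x = torch.randn(4, 8)
assert torch.equal(drop(x), x)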

@antrec (Contributor, Author) replied:

Thanks! I have made a commit that removes the dropout layer from the model and skips it during weight loading in load_weights.

@antrec force-pushed the add-modernbert-token-classification branch from 1699421 to 303edff on October 7, 2025 11:10
@noooop (Collaborator) commented Oct 7, 2025

LGTM

cc @DarkLight1337

@DarkLight1337 (Member) left a comment:

Thanks both of you!

@DarkLight1337 (Member) commented:
Actually, can you also update the Supported Models page docs/models/supported_models.md?

santrecare and others added 5 commits October 7, 2025 14:34
Signed-off-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: antrec <antoine.recanati@gmail.com>
Added dropout layer in forward; it should have no effect at inference, but I kept it anyway to match the huggingface architecture so that weight loading is simplified.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: antrec <antoine.recanati@gmail.com>
Signed-off-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Signed-off-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
@antrec force-pushed the add-modernbert-token-classification branch from 1c21935 to 5a839fd on October 7, 2025 12:36
@mergify (bot) added the documentation (Improvements or additions to documentation) label on Oct 7, 2025
@DarkLight1337 enabled auto-merge (squash) on October 7, 2025 12:39
@github-actions (bot) added the ready (ONLY add when PR is ready to merge/full CI is needed) label on Oct 7, 2025
@DarkLight1337 merged commit 6f59bea into vllm-project:main on Oct 7, 2025
55 checks passed
mrasquinha-g pushed a commit to mrasquinha-g/vllm that referenced this pull request Oct 9, 2025
@DarkLight1337 (Member) commented Oct 9, 2025

The test fails in https://buildkite.com/vllm/ci/builds/33941/steps/canvas?sid=0199c1fa-d5a7-40a2-8f02-b45545380a3b, can you fix it?

@noooop (Collaborator) commented Oct 9, 2025

> The test fails in https://buildkite.com/vllm/ci/builds/33941/steps/canvas?sid=0199c1fa-d5a7-40a2-8f02-b45545380a3b, can you fix it?

fix in #26414

@antrec (Contributor, Author) commented Oct 9, 2025

> The test fails in https://buildkite.com/vllm/ci/builds/33941/steps/canvas?sid=0199c1fa-d5a7-40a2-8f02-b45545380a3b, can you fix it?
>
> fix in #26414

Thanks! So, there is nothing to do on my side?

The test passed locally for me before committing, but I did not manage to reproduce the exact CI setup on my MacBook (I did not use VLLM_USE_PRECOMPILED=1, and used torch==2.8.0 instead of torch==2.8.0+cpu), sorry if I missed something because of that.

@noooop (Collaborator) commented Oct 9, 2025

> > The test fails in https://buildkite.com/vllm/ci/builds/33941/steps/canvas?sid=0199c1fa-d5a7-40a2-8f02-b45545380a3b, can you fix it?
> >
> > fix in #26414
>
> Thanks! So, there is nothing to do on my side?
>
> The test passed locally for me before committing, but I did not manage to reproduce the exact CI setup on my MacBook (I did not use VLLM_USE_PRECOMPILED=1, and used torch==2.8.0 instead of torch==2.8.0+cpu), sorry if I missed something because of that.

There are minor differences in numerical precision across different devices. It’s okay; this has been fixed by #26466.
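
For context, tolerance-based comparison is the usual way to absorb such cross-device differences; a sketch (the atol/rtol values are illustrative, not the ones in vLLM's CI, and hf_logits/vllm_logits stand in for the compared tensors):

import torch

# allow small numerical differences instead of requiring exact equality
torch.testing.assert_close(vllm_logits, hf_logits, atol=1e-2, rtol=1e-2)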

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

Labels

documentation (Improvements or additions to documentation), new-model (Requests to new models), ready (ONLY add when PR is ready to merge/full CI is needed)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants