@noooop
Collaborator

@noooop noooop commented Aug 25, 2025

TL;DR

New Model: GteNewModelForSequenceClassification

  • Alibaba-NLP/gte-multilingual-reranker-base
vllm serve Alibaba-NLP/gte-multilingual-reranker-base --hf-overrides '{"architectures": ["GteNewForSequenceClassification"]}' --trust_remote_code

The second-generation GTE model (mGTE-TRM) is named NewForSequenceClassification. Because that name is too generic, set --hf-overrides '{"architectures": ["GteNewForSequenceClassification"]}' to select the GteNewForSequenceClassification architecture explicitly.
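
The JSON value passed to --hf-overrides is easy to mis-quote in a shell. As a minimal sketch (the helper name `build_serve_command` is hypothetical and not part of vLLM), the same command can be assembled in Python so that the override is always valid JSON and is quoted correctly:

```python
import json
import shlex


def build_serve_command(model: str, architecture: str) -> list[str]:
    """Return the argv for `vllm serve` with an architecture override.

    Hypothetical helper for illustration; it only assembles the flags
    shown in the PR description.
    """
    # json.dumps guarantees the override is well-formed JSON.
    overrides = json.dumps({"architectures": [architecture]})
    return [
        "vllm", "serve", model,
        "--hf-overrides", overrides,
        "--trust_remote_code",
    ]


cmd = build_serve_command(
    "Alibaba-NLP/gte-multilingual-reranker-base",
    "GteNewForSequenceClassification",
)
# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd)`, which avoids shell quoting altogether.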

Purpose

Fix #21595

Test Plan

pytest -s -vvv tests/models/language/pooling/test_gte.py::test_rerank_models_mteb[model_info1]

Test Result

pass
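
In addition to the MTEB test, a quick manual check against a running server could look like the sketch below. It assumes the server was started with the `vllm serve` command from the TL;DR, listens on `http://localhost:8000` (an assumption), and exposes vLLM's `/rerank` endpoint with `results[].index` and `results[].relevance_score` in the response. The helper names are hypothetical.

```python
import json
import urllib.request


def build_rerank_request(base_url: str, model: str, query: str,
                         documents: list[str]) -> urllib.request.Request:
    """Build a POST request for the rerank endpoint (nothing is sent here)."""
    payload = {"model": model, "query": query, "documents": documents}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ranked_documents(response: dict,
                     documents: list[str]) -> list[tuple[str, float]]:
    """Pair each document with its score, highest score first.

    Assumes each result carries `index` and `relevance_score`.
    """
    results = sorted(response["results"],
                     key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in results]
```

To send the request, pass the result of `build_rerank_request(...)` to `urllib.request.urlopen`, decode the body with `json.load`, and hand it to `ranked_documents` together with the original document list.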

(Optional) Documentation Update



@mergify mergify bot added documentation Improvements or additions to documentation frontend new-model Requests to new models v1 labels Aug 25, 2025
noooop added 2 commits August 25, 2025 14:13
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
noooop added 2 commits August 25, 2025 14:15
Signed-off-by: wang.yuqi <noooop@126.com>
@mergify mergify bot added the qwen Related to Qwen models label Aug 25, 2025
Signed-off-by: wang.yuqi <noooop@126.com>
Member

@DarkLight1337 DarkLight1337 left a comment

Can you split out the classification optimization into a separate PR?

@noooop
Collaborator Author

noooop commented Aug 25, 2025

> Can you split out the classification optimization into a separate PR?

The changes are small, so I combined them. It seems you prefer to keep them separate.

For the next while, I will be profiling and optimizing different models.

Each optimization is small, but I don't want to wait until all of them are finished before submitting,

so the best approach is to add support for similar models and let the optimizations ride along in those PRs.

@DarkLight1337
Member

> The changes are small, so I combined them. It seems you prefer to keep them separate.

It's better to separate them to better isolate the issue in case CI breaks

@noooop
Collaborator Author

noooop commented Aug 25, 2025

> > The changes are small, so I combined them. It seems you prefer to keep them separate.
>
> It's better to separate them to better isolate the issue in case CI breaks

For the next while, I will be profiling and optimizing different models.

Each optimization is small, but I don't want to wait until all of them are finished before submitting,

so the best approach is to add support for similar models and let the optimizations ride along in those PRs.

@DarkLight1337
Member

For the refactoring of num_labels specifically I think that should be in a separate PR. The optimization inside pooler.py is small enough to keep in this PR.

@noooop
Collaborator Author

noooop commented Aug 28, 2025

@DarkLight1337

Can we merge this PR first?

Member

@DarkLight1337 DarkLight1337 left a comment

Sure, LGTM if tests pass

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 28, 2025 04:53
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 28, 2025
Signed-off-by: wang.yuqi <noooop@126.com>
auto-merge was automatically disabled August 28, 2025 05:08

Head branch was pushed to by a user without write access

@noooop
Collaborator Author

noooop commented Aug 28, 2025

@DarkLight1337

Sorry for disabling auto-merge.

@DarkLight1337 DarkLight1337 merged commit 11a7faf into vllm-project:main Aug 28, 2025
42 checks passed
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
@noooop noooop deleted the gte_seq_cls branch August 29, 2025 07:38
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

Development

Successfully merging this pull request may close these issues.

[Feature]: Support GteNewModelForSequenceClassification

2 participants