[New Model]: Support GteNewModelForSequenceClassification #23524
Signed-off-by: wang.yuqi <noooop@126.com>
DarkLight1337
left a comment
Can you split out the classification optimization into a separate PR?
The change is too small on its own, so the two are combined. You tend to separate them. Over the coming period I will be profiling and optimizing different models. Each optimization is small, but I don't want to wait until all of them are complete before submitting, so the best approach is to add support for a similar model and let the optimization hitchhike on that PR.
It's better to separate them, to better isolate the issue in case CI breaks.
For the refactoring of
Signed-off-by: wang.yuqi <noooop@126.com>
Can we merge this PR first?
DarkLight1337
left a comment
Sure, LGTM if tests pass
Head branch was pushed to by a user without write access
Sorry for disabling auto-merge.
…ct#23524) Signed-off-by: wang.yuqi <noooop@126.com>
TL;DR
New Model: GteNewModelForSequenceClassification
The second-generation GTE model (mGTE-TRM) is named NewForSequenceClassification. The name NewForSequenceClassification is too generic, so you should set --hf-overrides '{"architectures": ["GteNewForSequenceClassification"]}' to specify the use of the GteNewForSequenceClassification architecture.
Purpose
Fix #21595
Test Plan
pytest -s -vvv tests/models/language/pooling/test_gte.py::test_rerank_models_mteb[model_info1]
Test Result
pass
(Optional) Documentation Update
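As a usage note for the TL;DR, the sketch below builds the override string so the JSON survives shell quoting. It is illustrative only: the model ID and the helper name build_serve_command are assumptions and do not come from this PR. Only the --hf-overrides flag and the architecture name are taken from the description above.

```python
import json
import shlex

# Assumption: illustrative checkpoint name; substitute the model you actually serve.
MODEL_ID = "Alibaba-NLP/gte-multilingual-reranker-base"


def build_serve_command(model_id: str, architecture: str) -> str:
    """Return a shell-safe `vllm serve` command line (hypothetical helper)."""
    # The override replaces the generic architecture name from the model config.
    overrides = {"architectures": [architecture]}
    return " ".join([
        "vllm", "serve", shlex.quote(model_id),
        # shlex.quote wraps the JSON in single quotes so the shell passes it intact.
        "--hf-overrides", shlex.quote(json.dumps(overrides)),
    ])


command = build_serve_command(MODEL_ID, "GteNewForSequenceClassification")
print(command)
```

Generating the JSON with json.dumps rather than typing it by hand avoids mismatched quotes, which is the most common way this flag gets mangled on the command line.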
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.