[Model]: Fused MoE for nomic-embed-text-v2-moe #18321
Signed-off-by: isotr0py <2037008807@qq.com>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
Signed-off-by: Isotr0py <2037008807@qq.com>
|
cc @noooop can you help test this? |
|
Sorry, I'm away right now and can only get back to testing this on the 27th. |
|
Since this model already has a test in CI, I'll just unblock it and see if it passes |
|
https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/discussions/37 ("Fix: Respect is_causal=False config in forward to enable bidirectional attention"). The test failure is related to this, and after a year, it has finally been fixed. |
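For anyone following along who is unsure what the `is_causal=False` fix changes, here is a minimal sketch of the difference between causal and bidirectional attention weights. It is pure Python for illustration only and is not the gte-Qwen2 or vLLM code; `attention_weights` is a hypothetical helper, and the uniform score matrix is made up.

```python
import math

def attention_weights(scores, is_causal):
    """Row-wise softmax over a square score matrix, optionally causally masked.

    Hypothetical helper for illustration; not taken from any library.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # With a causal mask, query i may only attend to keys j <= i.
        # Without it (bidirectional), every key position is allowed.
        allowed = [j for j in range(n) if (not is_causal) or j <= i]
        row_max = max(scores[i][j] for j in allowed)  # for numerical stability
        exps = {j: math.exp(scores[i][j] - row_max) for j in allowed}
        total = sum(exps.values())
        # Masked positions get weight 0.0.
        weights.append([exps.get(j, 0.0) / total for j in range(n)])
    return weights

# Made-up uniform scores for a 3-token sequence.
scores = [[0.0] * 3 for _ in range(3)]
causal = attention_weights(scores, is_causal=True)
bidirectional = attention_weights(scores, is_causal=False)
```

With the causal mask the first token can only see itself, while in the bidirectional case every token attends to the whole sequence, which is what an embedding model configured with `is_causal=False` expects.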
Signed-off-by: Isotr0py <2037008807@qq.com>
|
This pull request has merge conflicts that must be resolved before it can be |
|
Sorry for the late response. language-models-test-extended has verified the model's results on mteb/STS12, and I also tested a larger dataset, mteb/T2Reranking. Supporting long context requires #18755. main: 8192, this PR: 8192. LGTM |
Signed-off-by: Isotr0py <2037008807@qq.com>
Yes, I have considered |
|
Any update on this? It would help with getting the model to support V1 with torch.compile |
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
Let me update this PR to catch up with the recent MoE refactoring. |
|
I have confirmed |
|
Thanks for the quick update! |
|
The failed tests in buildkite/ci/pr/language-models-test-extended-pooling can be fixed by #21747.
Signed-off-by: isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: x22x22 <wadeking@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Noam Gat <noamgat@gmail.com>
Signed-off-by: isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Paul Pak <paulpak58@gmail.com>
Signed-off-by: isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Diego-Castan <diego.castan@ibm.com>
fused_moe.