Closed
Description
The model to consider.
Qwen/Qwen3-8B, Qwen/Qwen3-MoE-15B-A2B
The closest model vllm already supports.
No response
What's your difficulty of supporting the model you want?
First priority:
- Add CI for Qwen/Qwen3-0.6B: [CI] Add Qwen3-0.6B-Base test #717
- Download models and run tests (functional / accuracy / perf) for Qwen/Qwen3-8B
  - functional: @wangxiyuan
    - offline test
    - online test
  - accuracy: @hfadzxy
  - perf: @Potabk
- Update Single NPU (Qwen2.5 7B) to Single NPU (Qwen3 8B): Update installation and tutorial doc #711
- Announcement on WeChat post on Open Source Now: Deploying Qwen3 with vLLM Ascend
- Validate on Qwen3 docker image:
  - 0.8.4rc2
  - 0.8.4rc2-openeuler
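
For the offline functional test, a minimal smoke check could look roughly like the sketch below. It assumes vLLM with the vLLM Ascend plugin is installed and uses the standard `LLM` / `SamplingParams` offline API. The prompts, `max_model_len` value, and the `check_outputs` pass criterion are illustrative assumptions, not the checks used in the project's CI.

```python
# Sketch of an offline functional check (assumptions: prompts, limits, pass criterion).
PROMPTS = ["Hello, my name is", "The future of AI is"]


def sampling_kwargs(max_tokens: int = 32) -> dict:
    """Greedy decoding (temperature 0) keeps the smoke test deterministic."""
    return {"temperature": 0.0, "max_tokens": max_tokens}


def run_offline(model: str = "Qwen/Qwen3-8B", max_model_len: int = 4096) -> list[str]:
    """Load the model and return one completion per prompt."""
    # Imported lazily so the helpers in this file work without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model, max_model_len=max_model_len)
    outputs = llm.generate(PROMPTS, SamplingParams(**sampling_kwargs()))
    return [out.outputs[0].text for out in outputs]


def check_outputs(texts: list[str], expected: int) -> None:
    """Hypothetical pass criterion: one non-empty completion per prompt."""
    assert len(texts) == expected
    assert all(t.strip() for t in texts)
```

On a machine with an NPU, `check_outputs(run_offline(), len(PROMPTS))` would run the whole check.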
Second priority:
- Fix MoE error: [Model] Support common fused moe ops for moe model #709
- Add CI for Qwen/Qwen3-MoE-15B-A2B
- Download models and run tests (functional / accuracy / perf) for Qwen/Qwen3-MoE-15B-A2B
- Announcement on WeChat post on Open Source Now: Deploying Qwen/Qwen3-MoE with vLLM Ascend
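
For the online functional test, one possible shape is a single request against vLLM's OpenAI-compatible server. The sketch below assumes the server was started separately (for example with `vllm serve Qwen/Qwen3-MoE-15B-A2B`) and is listening on the default `http://localhost:8000`. The helper names, prompt, and timeout are assumptions for illustration only.

```python
import json
import urllib.request


def build_completion_request(base_url: str, model: str, prompt: str,
                             max_tokens: int = 32) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,  # greedy decoding for a repeatable check
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_completion_text(response_body: bytes) -> str:
    """Extract the first completion from a /v1/completions JSON response."""
    return json.loads(response_body)["choices"][0]["text"]


def run_online(base_url: str = "http://localhost:8000",
               model: str = "Qwen/Qwen3-MoE-15B-A2B") -> str:
    """Send one prompt to a running server and return the generated text."""
    req = build_completion_request(base_url, model, "Hello, my name is")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return first_completion_text(resp.read())
```

The same helpers would work for Qwen/Qwen3-8B by changing the `model` argument.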