Conversation

@yiz-liu (Collaborator) commented on May 22, 2025

Tweak packed_modules_mapping to support W8A8 weights.

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

…s_mapping

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
yiz-liu force-pushed the feat-qwen3-quant branch from 457898c to 6fe32a7 on May 22, 2025 07:28
Comment on lines +33 to +34
"experts":
["experts.0.gate_proj", "experts.0.up_proj", "experts.0.down_proj"],
@yiz-liu (Collaborator, Author) commented:

We have to add this; otherwise, W8A8 weights exported by modelslim cannot be properly processed by vLLM.
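
For readers unfamiliar with the mechanism, the sketch below shows roughly what such a mapping could look like with this entry in place. It is a minimal, hypothetical illustration: the class name and the first two entries are assumptions rather than the actual vllm-ascend source; only the `"experts"` entry mirrors the quoted diff.

```python
# Hypothetical sketch -- the class name and the first two entries are
# illustrative assumptions; only the "experts" entry mirrors the quoted diff.
class CustomQwen3MoeForCausalLM:
    # vLLM uses packed_modules_mapping to relate the fused modules it builds
    # to the per-projection weight names found in checkpoints and
    # quantization configs.
    packed_modules_mapping = {
        # fused attention projection assembled from three separate weights
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        # fused MLP projection assembled from gate and up weights
        "gate_up_proj": ["gate_proj", "up_proj"],
        # the addition discussed here: lets per-expert quantization metadata
        # (modelslim W8A8 checkpoints record it under names such as
        # "experts.0.gate_proj") be matched to the fused experts module
        "experts":
        ["experts.0.gate_proj", "experts.0.up_proj", "experts.0.down_proj"],
    }
```

Presumably the index-0 names serve as representatives for all experts when vLLM resolves the quantization config; the discussion below treats the change as a stopgap until modelslim or vLLM upstream handles these names natively.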

@wangxiyuan (Collaborator) commented:

I'm fine with this change as a temporary quick fix. It would be better to fix this in modelslim for vLLM upstream instead.
@22dimensions please check this issue with the modelslim team. Thanks.
@MengqingCao we can also take a try at a vLLM upstream fix.

wangxiyuan added the ready (read for review) label on May 22, 2025
ganyi1996ppo merged commit 17f05b1 into vllm-project:main on May 23, 2025
16 checks passed
yiz-liu deleted the feat-qwen3-quant branch on May 23, 2025 07:53
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request May 30, 2025
Tweak packed_modules_mapping to support W8A8 weights.

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
Tweak packed_modules_mapping to support W8A8 weights.

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Tweak packed_modules_mapping to support W8A8 weights.

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>