[Perf] Optimize perf of Qwen3 #1245

rjg-lyh · 2025-06-16T13:48:37Z

What this PR does / why we need it?

Optimize the performance of Qwen3 model by registering a custom model.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI passed with the existing tests.

ttanzhiqiang · 2025-06-17T10:05:22Z

Can you add e2e testing? I want to try it out.

github-actions · 2025-06-20T16:04:23Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: rjg-lyh <1318825571@qq.com>

MengqingCao · 2025-06-26T01:26:38Z

vllm_ascend/ops/layernorm.py



+class AddRMSNormQuant(RMSNorm):
+    """Root mean square normalization.


Please update the docstring.

MengqingCao · 2025-06-26T01:32:37Z

vllm_ascend/models/qwen3.py

+            self.post_attention_layernorm = RMSNorm(config.hidden_size,
+                                                    eps=config.rms_norm_eps)
+        else:
+            from vllm_ascend.quantization.quant_config import AscendQuantConfig


It seems the main change in CustomQwen3DecoderLayer is the AddRMSNormQuant layer. I would prefer to inherit from Qwen3DecoderLayer and add only the AddRMSNormQuant logic. That would make the optimization point clear and reduce redundant code.
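A minimal sketch of the inheritance approach suggested above. Every class here is a hypothetical stand-in defined only for illustration (none of them are the real vLLM or vllm-ascend classes), and the constructor signature is an assumption. It shows the subclass reusing the parent constructor and swapping in the fused norm layers only when a quantization config is present.

```python
class RMSNorm:
    """Stand-in for a plain RMSNorm layer (hypothetical, not vLLM's class)."""

    def __init__(self, hidden_size, eps=1e-6):
        self.hidden_size = hidden_size
        self.eps = eps


class AddRMSNormQuant(RMSNorm):
    """Stand-in for the fused add + RMSNorm + quantize layer from this PR."""


class Qwen3DecoderLayer:
    """Stand-in for the upstream decoder layer (assumed signature)."""

    def __init__(self, hidden_size, eps=1e-6, quant_config=None):
        self.input_layernorm = RMSNorm(hidden_size, eps)
        self.post_attention_layernorm = RMSNorm(hidden_size, eps)


class CustomQwen3DecoderLayer(Qwen3DecoderLayer):
    """Inherit everything and override only the normalization layers."""

    def __init__(self, hidden_size, eps=1e-6, quant_config=None):
        super().__init__(hidden_size, eps, quant_config)
        # Assumption: the fused layer is only wanted on the quantized path.
        if quant_config is not None:
            self.input_layernorm = AddRMSNormQuant(hidden_size, eps)
            self.post_attention_layernorm = AddRMSNormQuant(hidden_size, eps)
```

With this shape, the diff against the upstream layer is limited to the two norm assignments, which is the point the review comment makes.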

MengqingCao · 2025-06-26T01:42:11Z

vllm_ascend/ops/layernorm.py

+        import torch_npu
+
+        if residual is not None:
+            x, _, residual = torch_npu.npu_add_rms_norm_quant(x, residual, self.weight,


QQ: what does "add" mean here?
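For context on the question above: in fused "add + RMSNorm" kernels, "add" usually refers to the residual addition performed before normalization. The pure-Python reference below is a sketch of that general pattern only. It is not the implementation or the signature of `torch_npu.npu_add_rms_norm_quant`, and the function names, the `scale`/`offset` parameters, and the int8 quantization formula are all assumptions made for illustration.

```python
import math


def add_rms_norm(x, residual, weight, eps=1e-6):
    """Reference for the fused pattern: add the residual, then RMS-normalize.

    Returns (normalized, new_residual), where new_residual is x + residual.
    """
    summed = [a + b for a, b in zip(x, residual)]
    mean_sq = sum(v * v for v in summed) / len(summed)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    normed = [v * inv_rms * w for v, w in zip(summed, weight)]
    return normed, summed


def quantize_int8(values, scale, offset=0):
    """Hypothetical symmetric-style int8 quantization with clamping."""
    return [max(-128, min(127, round(v / scale) + offset)) for v in values]
```

A fused kernel would perform these steps in a single pass rather than as separate operations, which is where the performance benefit would come from.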

github-actions · 2025-06-28T10:53:19Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Yikun · 2025-07-06T10:20:31Z

Yes, we should avoid adding large blocks of copied code to vllm-ascend. Please also post the performance results here.

github-actions bot added module:ops module:quantization labels Jun 16, 2025

rjg-lyh force-pushed the pr-perf-optim branch from ec4fc9e to 39d4801 Compare June 17, 2025 08:58

github-actions bot added the merge-conflicts label Jun 20, 2025

rjg-lyh force-pushed the pr-perf-optim branch from 39d4801 to a44c7a3 Compare June 25, 2025 11:21

Optimize perf of Qwen3

ad4391a

Signed-off-by: rjg-lyh <1318825571@qq.com>

rjg-lyh force-pushed the pr-perf-optim branch from a44c7a3 to ad4391a Compare June 25, 2025 11:29

github-actions bot removed the merge-conflicts label Jun 25, 2025

MengqingCao reviewed Jun 26, 2025

github-actions bot added the merge-conflicts label Jun 28, 2025

Yikun mentioned this pull request Jul 8, 2025

vLLM Ascend Roadmap Q3 2025 #1168

Closed

45 tasks

rjg-lyh closed this Jul 22, 2025

rjg-lyh deleted the pr-perf-optim branch July 22, 2025 12:14

4 participants