@depeng1994
Contributor

What this PR does / why we need it?

We need to observe the time consumed in each stage of inference (including pre-processing, model forward, etc.) without any performance loss.
To do this, we use the NPU's event timestamp mechanism to mark any stage during execution on the NPU device. The marking operation is executed asynchronously, so it causes no performance loss.
Additionally, we provide a blocking synchronization API, pop_captured_sync, to be called at an appropriate time to print the time consumed in all observed stages.
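
The capture-then-pop pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `CpuEvent` is a hypothetical CPU-only stand-in whose `record` / `synchronize` / `elapsed_time` methods mirror the shape of PyTorch-style device events such as `torch.cuda.Event`, and `DurationRecorder` is an invented name. Only the method names `capture_async` and `pop_captured_sync` are borrowed from the PR.

```python
import time
from contextlib import contextmanager


class CpuEvent:
    """Hypothetical stand-in for a device timing event (assumption, not an NPU API)."""

    def __init__(self):
        self.timestamp = None

    def record(self):
        # A device event would be enqueued on the stream without blocking the host.
        self.timestamp = time.perf_counter()

    def synchronize(self):
        # A device event would block here until the device reaches it; no-op on CPU.
        pass

    def elapsed_time(self, end):
        # Milliseconds between this event and `end`.
        return (end.timestamp - self.timestamp) * 1000.0


class DurationRecorder:
    """Sketch of the capture/pop pattern; the class name is invented for illustration."""

    def __init__(self, event_cls=CpuEvent):
        self._event_cls = event_cls
        self._observations = []  # list of (tag, start_event, end_event)

    @contextmanager
    def capture_async(self, tag):
        # Record a start and an end event around the wrapped block; never wait here.
        start, end = self._event_cls(), self._event_cls()
        start.record()
        try:
            yield
        finally:
            end.record()
            self._observations.append((tag, start, end))

    def pop_captured_sync(self):
        # Blocking step: wait for each end event, then read the elapsed time.
        durations = {}
        while self._observations:
            tag, start, end = self._observations.pop(0)
            end.synchronize()
            durations[tag] = start.elapsed_time(end)
        return durations


recorder = DurationRecorder()
with recorder.capture_async("forward"):
    time.sleep(0.01)  # placeholder for real work
durations = recorder.pop_captured_sync()
```

The point of the split is that the only blocking call is `pop_captured_sync`, so the caller decides when to pay the synchronization cost.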

In model_runner_v1.py, only 5 lines changed, all of them ProfileExecuteDuration() calls; nothing else was changed. The diff shows more changed lines only because of indentation (alignment) changes.

Does this PR introduce any user-facing change?

Use the environment variable VLLM_MODEL_EXECUTE_TIME_OBSERVE (renamed to VLLM_ASCEND_MODEL_EXECUTE_TIME_OBSERVE during review) to enable this feature.

How was this patch tested?

Tested with a DeepSeek model. The output looks like this:

5691:(IntegratedWorker pid=1502285) Profile execute duration [Decode]: [post process]:14.17ms [prepare input and forward]:9.57ms [forward]:4.14ms
5695:(IntegratedWorker pid=1502285) Profile execute duration [Decode]: [post process]:14.29ms [prepare input and forward]:10.19ms [forward]:4.14ms
5697:(IntegratedWorker pid=1502343) Profile execute duration [Decode]: [post process]:14.81ms [prepare input and forward]:10.29ms [forward]:3.99ms
5701:(IntegratedWorker pid=1502343) Profile execute duration [Decode]: [post process]:14.10ms [prepare input and forward]:10.62ms [forward]:4.33ms
5705:(IntegratedWorker pid=1502343) Profile execute duration [Decode]: [post process]:14.65ms [prepare input and forward]:9.58ms [forward]:4.20ms
5709:(IntegratedWorker pid=1502343) Profile execute duration [Decode]: [post process]:14.43ms [prepare input and forward]:9.88ms [forward]:4.20ms
5711:(IntegratedWorker pid=1502401) Profile execute duration [Decode]: [post process]:14.89ms [prepare input and forward]:10.49ms [forward]:4.19ms
5715:(IntegratedWorker pid=1502401) Profile execute duration [Decode]: [post process]:14.14ms [prepare input and forward]:11.21ms [forward]:4.18ms
5719:(IntegratedWorker pid=1502401) Profile execute duration [Decode]: [post process]:14.71ms [prepare input and forward]:10.15ms [forward]:4.42ms
5723:(IntegratedWorker pid=1502401) Profile execute duration [Decode]: [post process]:14.62ms [prepare input and forward]:10.31ms [forward]:4.25ms
5725:(IntegratedWorker pid=1502462) Profile execute duration [Decode]: [post process]:14.12ms [prepare input and forward]:10.33ms [forward]:4.24ms
5729:(IntegratedWorker pid=1502462) Profile execute duration [Decode]: [post process]:14.58ms [prepare input and forward]:10.85ms [forward]:4.32ms
5733:(IntegratedWorker pid=1502462) Profile execute duration [Decode]: [post process]:14.32ms [prepare input and forward]:9.79ms [forward]:4.28ms
5737:(IntegratedWorker pid=1502462) Profile execute duration [Decode]: [post process]:15.06ms [prepare input and forward]:9.89ms [forward]:4.32ms
5739:(IntegratedWorker pid=1502524) Profile execute duration [Decode]: [post process]:14.62ms [prepare input and forward]:10.48ms [forward]:4.27ms
5743:(IntegratedWorker pid=1502524) Profile execute duration [Decode]: [post process]:14.60ms [prepare input and forward]:10.71ms [forward]:4.61ms
5747:(IntegratedWorker pid=1502524) Profile execute duration [Decode]: [post process]:14.21ms [prepare input and forward]:10.10ms [forward]:4.52ms
5751:(IntegratedWorker pid=1502524) Profile execute duration [Decode]: [post process]:15.03ms [prepare input and forward]:10.00ms [forward]:4.42ms
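
Lines in this format can be aggregated with a small script to compare stages across steps. The following is a sketch only, assuming the line layout shown above stays stable; the function name `mean_stage_durations` and both regular expressions are illustrative and not part of the PR.

```python
import re
from collections import defaultdict

# Matches the phase (e.g. Decode) and captures the rest of the line.
LINE_RE = re.compile(r"Profile execute duration \[(?P<phase>\w+)\]:(?P<stages>.*)")
# Matches one "[tag]:12.34ms" entry.
STAGE_RE = re.compile(r"\[(?P<tag>[^\]]+)\]:(?P<ms>\d+(?:\.\d+)?)ms")


def mean_stage_durations(lines):
    """Return {(phase, tag): mean duration in ms} for matching log lines."""
    samples = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip unrelated log lines
        phase = match.group("phase")
        for stage in STAGE_RE.finditer(match.group("stages")):
            samples[(phase, stage.group("tag"))].append(float(stage.group("ms")))
    return {key: sum(values) / len(values) for key, values in samples.items()}


sample = [
    "5691:(IntegratedWorker pid=1502285) Profile execute duration [Decode]: "
    "[post process]:14.17ms [prepare input and forward]:9.57ms [forward]:4.14ms",
    "5695:(IntegratedWorker pid=1502285) Profile execute duration [Decode]: "
    "[post process]:14.29ms [prepare input and forward]:10.19ms [forward]:4.14ms",
]
means = mean_stage_durations(sample)
```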

@MengqingCao
Collaborator

Overall LGTM, just some suggestions:

  • Thanks for the detailed PR description. Could you please convert it into a tutorial for this feature? We can put it in docs/source/developer_guide/profile_execute_duration_observation.md
  • Could you add unit tests for this feature?
  • Please run bash format.sh locally to fix the lint failures, and sign off your commit with git commit -sm "your commit message" to fix DCO

@depeng1994
Contributor Author

@wangxiyuan @Yikun @ganyi1996ppo please take a look, tks

Signed-off-by: depeng1994 <depengzhang@foxmail.com>
Signed-off-by: depeng1994 <depengzhang@foxmail.com>
github-actions bot added the documentation and merge-conflicts labels May 30, 2025

github-actions bot commented Jun 3, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: depeng1994 <depengzhang@foxmail.com>
Signed-off-by: depeng1994 <depengzhang@foxmail.com>
@depeng1994 force-pushed the main branch 3 times, most recently from 25ee163 to 9967032, June 3, 2025 12:49
Signed-off-by: depeng1994 <depengzhang@foxmail.com>

github-actions bot commented Jun 4, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.


github-actions bot commented Jun 4, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

        lambda: bool(int(os.getenv("COMPILE_CUSTOM_KERNELS", "1"))),
    "VLLM_ENABLE_MC2":
        lambda: bool(int(os.getenv("VLLM_ENABLE_MC2", '0'))),
    "VLLM_MODEL_EXECUTE_TIME_OBSERVE":
Collaborator

Suggested change
- "VLLM_MODEL_EXECUTE_TIME_OBSERVE":
+ "VLLM_ASCEND_MODEL_EXECUTE_TIME_OBSERVE":

Contributor Author

fixed
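
The dict-of-lambdas pattern visible in the diff context above reads each variable lazily, at lookup time. A minimal sketch of how such a flag could be consulted follows. The `env_variables` dict and the `get_env` helper are hypothetical names for illustration, and the default of "0" for the new variable is an assumption, since the diff context does not show it.

```python
import os

FLAG = "VLLM_ASCEND_MODEL_EXECUTE_TIME_OBSERVE"

# Hypothetical registry mirroring the dict-of-lambdas pattern from the diff.
# The "0" default is an assumption; the diff context does not show it.
env_variables = {
    FLAG: lambda: bool(int(os.getenv(FLAG, "0"))),
}


def get_env(name):
    """Evaluate the getter at call time so later changes to os.environ are seen."""
    return env_variables[name]()


os.environ.pop(FLAG, None)
disabled = get_env(FLAG)  # unset, so the assumed default applies

os.environ[FLAG] = "1"
enabled = get_env(FLAG)  # "1" parses to True
```

Because the value goes through `int()`, a setting such as "true" would raise `ValueError` in this sketch; only integer strings are accepted.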

Signed-off-by: depeng1994 <depengzhang@foxmail.com>

github-actions bot commented Jun 5, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Collaborator

@Yikun left a comment

The commit message should also be updated.

The PR is good enough; there are just some nits, see the inline comments.

You can choose to address them in a separate PR.

* Use the non-blocking API `ProfileExecuteDuration().capture_async` to set observation points asynchronously when you need to observe the execution duration.
* Use the blocking API `ProfileExecuteDuration().pop_captured_sync` at an appropriate time to get and print the execution durations of all observed stages.

## Example Output
Collaborator

The doc is good, but we could provide an e2e guide to help devs understand it. For example:

We have already added observation points for the key stages of inference (including pre-processing, model forward, etc.), so you can run the inference script:

VLLM_ASCEND_MODEL_EXECUTE_TIME_OBSERVE=1 python3 vllm-ascend/examples/offline_inference_npu.py

for tag, duration in durations.items()
]
captured_name = "Decode" if self.attn_state == AscendAttentionState.DecodeOnly else "Prefill"
print(f"Profile execute duration [{captured_name}]:",
Collaborator

print or log?
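
On the print-versus-log question, one option would be to build the summary line once and hand it to a logger. The sketch below is only an illustration: the function name `format_durations` and the logger name are hypothetical, and the output layout is inferred from the sample lines in the PR description rather than taken from the merged code.

```python
import logging

logger = logging.getLogger("profile_execute_duration")  # hypothetical logger name


def format_durations(captured_name, durations):
    """Build a one-line summary with one '[tag]:X.XXms' entry per stage."""
    parts = [f"[{tag}]:{duration:.2f}ms" for tag, duration in durations.items()]
    return f"Profile execute duration [{captured_name}]: " + " ".join(parts)


message = format_durations("Decode", {"post process": 14.17, "forward": 4.14})
logger.info(message)  # emitted only if the logger is configured for INFO
```

Routing the line through a logger would let users control verbosity and destination through the logging configuration instead of always writing to stdout.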

@Yikun
Collaborator

Yikun commented Jun 5, 2025

@ganyi1996ppo @wangxiyuan pls

@Yikun added the ready label Jun 5, 2025
@wangxiyuan merged commit 6b094a2 into vllm-project:main Jun 6, 2025
29 of 31 checks passed
MengqingCao pushed a commit that referenced this pull request Aug 19, 2025
I would like to nominate Mengqing Cao (@MengqingCao
https://github.com/MengqingCao) as a maintainer, starting with my +1.

## Reason

Review Quality: She has completed [120+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+is%3Aclosed+commenter%3Amengqingcao+-author%3Amengqingcao)
since Feb. 2025, including high-quality reviews such as
[#review-3077842852](#2088 (review)),
[comment-2990074116](#1032 (comment)), and
[comment-2921063723](#1013 (comment)).

Sustained and Quality Contributions: She has a deep understanding of the
vLLM and vLLM Ascend codebases and a record of solid contributions. Her
vLLM contributions and her help with vLLM Ascend releases are the main
reasons I nominated her:

- vLLM: Notably, she has completed [28+ PR
contributions](https://github.com/vllm-project/vllm/pulls?q=is%3Apr+author%3AMengqingCao+is%3Amerged+)
in vllm-project/vllm, especially in the vLLM platform module, improving
vLLM's multi-hardware support. She is one of the important co-authors of
[vllm#8054](vllm-project/vllm#8054) and the hardware
plugin RFC, which made the vllm-ascend plugin possible.

Community Involvement: She is also very active and has been involved in [60+
issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aclosed%20-author%3AMengqingCao%20commenter%3AMengqingCao).

So I think she's a great addition to the vLLM Ascend Maintainer team.

- ✅**Review Quality:**

She has completed 120+ reviews since Feb. 2025:

https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+is%3Aclosed+commenter%3Amengqingcao+-author%3Amengqingcao

These include high-quality reviews such as
#2088 (review),
#1446 (comment),
#1032 (comment), and
#1013 (comment).

- ✅**Sustained Contributions:**

99+ PRs merged in vllm-project/vllm-ascend:

https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3AMengqingCao+is%3Amerged

- ✅**Quality Contribution:**

She is one of the important co-authors of
vllm-project/vllm#8054, which made the vllm-ascend
plugin possible.

Notably, she has completed 28+ PR contributions in
vllm-project/vllm, especially in the vLLM platform module, improving vLLM's
multi-hardware support:

https://github.com/vllm-project/vllm/pulls?q=is%3Apr+author%3AMengqingCao+is%3Amerged+

In Q2 2025, she also led the [[RFC]: E2E CI test for key
features](#413) and
[[RFC]: Unit test coverage
improvement](#1298) to
help vLLM Ascend improve its test coverage.

Her main contributions focus on adapting parallel strategies
and communicators, such as
#1800 and
#1856.

These contributions are sufficient to show that she has a "deep understanding
of the vLLM and vLLM Ascend codebases".

- ✅**Community Involvement:**

Involved in 63+ issues as a reviewer:
https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aclosed%20-author%3AMengqingCao%20commenter%3AMengqingCao

She led the v0.10.1 release as release manager


- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@78dba40

Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
wangxiaoteng888 pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Sep 25, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Labels

documentation, module:core, module:tests, ready

5 participants