
Conversation

@shen-shanshan (Collaborator) commented May 28, 2025

What this PR does / why we need it?

Clear NPU memory before the memory profiling step in tests.

If we don't clear the memory this way, CI may hit OOM errors between UTs.

This modification is adapted from vllm/worker/worker.py; see https://github.com/vllm-project/vllm/blob/main/vllm/worker/worker.py#L184-L186 for details.
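
For context, the referenced upstream code frees the allocator cache and resets the peak-memory counter right before profiling, roughly like this (a sketch of the linked snippet; the exact lines on vLLM main may have moved since):

import torch

# Profile the memory usage of the model and get the maximum number of
# cache blocks that can be allocated with the remaining free memory.
torch.cuda.empty_cache()
torch.cuda.reset_peak_memory_stats()

This PR does the equivalent on Ascend, combining gc.collect(), empty_cache() and reset_peak_memory_stats() so that the profiling step in the tests starts from a clean NPU memory state.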

Does this PR introduce any user-facing change?

How was this patch tested?

@MengqingCao (Collaborator) commented May 28, 2025

Thanks for this fix! I think we can directly move the gc collection and peak memory stats reset into NPUPlatform.empty_cache().

@shen-shanshan (Collaborator Author)

> Thanks for this fix! I think we can directly move the gc collection and peak memory stats reset into NPUPlatform.empty_cache().

I have assembled the three methods into clear_npu_memory() in the platform module.

@shen-shanshan (Collaborator Author)

@wangxiyuan The CI has passed, and this PR may be needed for #969.

@MengqingCao (Collaborator)

> Thanks for this fix! I think we can directly move the gc collection and peak memory stats reset into NPUPlatform.empty_cache().
>
> I have assembled the three methods into clear_npu_memory() in the platform module.

Why not just do it directly at https://github.com/vllm-project/vllm-ascend/pull/989/files#diff-efe99563375ba645b1c1befb237632c2c91f85b21d2ab2d1095eef82aecd3999L106?

@shen-shanshan (Collaborator Author)

> Thanks for this fix! I think we can directly move the gc collection and peak memory stats reset into NPUPlatform.empty_cache().
>
> I have assembled the three methods into clear_npu_memory() in the platform module.
>
> Why not just do it directly at https://github.com/vllm-project/vllm-ascend/pull/989/files#diff-efe99563375ba645b1c1befb237632c2c91f85b21d2ab2d1095eef82aecd3999L106?

Because there may be other scenarios that only need empty_cache() and don't need the other two methods.
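
For illustration, keeping the two entry points separate would let call sites pick the cheaper option (the import paths below are assumptions based on the files this PR touches, not confirmed by this thread):

# Hypothetical import paths (assumed):
from vllm_ascend.platform import NPUPlatform
from vllm_ascend.utils import clear_npu_memory

# Call sites that only need to hand cached blocks back to the device allocator:
NPUPlatform.empty_cache()

# The profiling path, which also wants a GC pass and fresh peak-memory stats:
clear_npu_memory()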

@wangxiyuan (Collaborator)

Please merge this PR and #969 into one.

@github-actions bot commented Jun 4, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@Potabk (Collaborator) commented Jun 5, 2025

Hope this can be merged soon; sleep mode v1 needs this.

@shen-shanshan (Collaborator Author)

> Hope this can be merged soon; sleep mode v1 needs this.

I will rebase soon.

@github-actions bot commented Jun 5, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions bot commented Jun 6, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions bot commented Jun 9, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

  # Profile the memory usage of the model and get the maximum number of
  # cache blocks that can be allocated with the remaining free memory.
- NPUPlatform.empty_cache()
+ clear_npu_memory()

@momo609 (Collaborator)

What is the difference between the two memory-clearing methods?

@shen-shanshan (Collaborator Author)

@momo609 clear_npu_memory() = gc.collect() + empty_cache() + reset_peak_memory_stats(), which makes the memory profiling in the dummy run more accurate.
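
A minimal sketch of what that helper could look like (the module location and exact signature in this PR may differ; it assumes torch_npu mirrors the CUDA memory-management API):

import gc

import torch
import torch_npu  # noqa: F401  # registers the torch.npu backend

def clear_npu_memory() -> None:
    """Free as much NPU memory as possible before memory profiling."""
    gc.collect()                         # drop Python references to dead tensors first
    torch.npu.empty_cache()              # return cached blocks to the device allocator
    torch.npu.reset_peak_memory_stats()  # start the dummy-run peak measurement from zero

The profiling path then calls clear_npu_memory() where it previously called NPUPlatform.empty_cache(), as shown in the diff above.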

@shen-shanshan (Collaborator Author)

CC: @wangxiyuan

@codecov bot commented Jun 23, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 27.34%. Comparing base (c30ddb8) to head (7c433f1).
⚠️ Report is 563 commits behind head on main.

Files with missing lines | Patch % | Lines
vllm_ascend/utils.py     | 40.00%  | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #989      +/-   ##
==========================================
- Coverage   27.39%   27.34%   -0.05%     
==========================================
  Files          56       56              
  Lines        6191     6183       -8     
==========================================
- Hits         1696     1691       -5     
+ Misses       4495     4492       -3     
Flag      | Coverage Δ
unittests | 27.34% <40.00%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
