@cqulilujia
Contributor

PR Category

Custom Device

PR Types

New features

Description

Support the Python memory APIs on XPU: max_memory_allocated, max_memory_reserved, reset_max_memory_allocated, reset_max_memory_reserved, memory_allocated, and memory_reserved.
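To make the difference between the current, peak, and reset calls concrete, here is a toy counter written for illustration only. It is not Paddle's implementation: the class `ToyMemoryStats` and its `alloc`/`free` methods are hypothetical, and it assumes that resetting the peak drops it back to the current value, which is a common convention for such counters.

```python
class ToyMemoryStats:
    """Toy current/peak byte counter (hypothetical, not Paddle code)."""

    def __init__(self):
        self._current = 0  # bytes currently allocated
        self._peak = 0     # highest value _current has reached

    def alloc(self, nbytes):
        self._current += nbytes
        self._peak = max(self._peak, self._current)

    def free(self, nbytes):
        self._current -= nbytes

    def memory_allocated(self):
        return self._current

    def max_memory_allocated(self):
        return self._peak

    def reset_max_memory_allocated(self):
        # Assumption: a reset sets the peak to the current value, not to zero.
        self._peak = self._current


stats = ToyMemoryStats()
stats.alloc(400)
stats.alloc(600)
stats.free(600)

current_before = stats.memory_allocated()    # 400 bytes still held
peak_before = stats.max_memory_allocated()   # 1000 bytes at the high-water mark
stats.reset_max_memory_allocated()
peak_after = stats.max_memory_allocated()    # peak now equals current
```

The `*_reserved` functions named in the description presumably follow the same current/peak pattern for memory held by the allocator rather than memory handed out to tensors.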

@paddle-bot

paddle-bot bot commented Jun 9, 2025

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the XPU label Jun 9, 2025
@cqulilujia cqulilujia force-pushed the memory branch 3 times, most recently from bc6ce4f to b97660d on June 9, 2025 07:26
@codecov-commenter

codecov-commenter commented Jun 9, 2025

Codecov Report

❌ Patch coverage is 14.86486% with 63 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@1f27316). Learn more about missing BASE report.

Files with missing lines                          Patch %   Lines
python/paddle/device/xpu/__init__.py              15.27%    61 Missing ⚠️
python/paddle/distributed/launch/utils/nvsmi.py   0.00%     2 Missing ⚠️

❌ Your patch status has failed because the patch coverage (14.86%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #73189   +/-   ##
==========================================
  Coverage           ?   14.86%           
==========================================
  Files              ?        2           
  Lines              ?       74           
  Branches           ?        0           
==========================================
  Hits               ?       11           
  Misses             ?       63           
  Partials           ?        0           
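The percentages in the report can be rechecked from the hit and line counts. The sketch below is a plain arithmetic check, not Codecov's code. It assumes all 11 hits fall in `python/paddle/device/xpu/__init__.py` (inferred from 74 total lines, 61 missing there, and 2 missing in `nvsmi.py`), and it assumes the report truncates rather than rounds, since 11/72 is about 15.278% but is shown as 15.27%.

```python
import math


def coverage_percent(hits, total_lines):
    """Coverage as a percentage of lines hit."""
    return 100.0 * hits / total_lines


def truncate(value, decimals):
    """Cut a non-negative value to a fixed number of decimals without rounding."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


# Whole patch: 11 hits out of 74 lines.
patch_total = truncate(coverage_percent(11, 74), 5)

# xpu/__init__.py: 61 missing plus the 11 inferred hits gives 72 lines.
xpu_init = truncate(coverage_percent(11, 11 + 61), 2)

# nvsmi.py: 2 lines, none hit.
nvsmi = truncate(coverage_percent(0, 2), 2)
```

Under those assumptions the three values come out as 14.86486, 15.27, and 0.0, matching the figures in the report.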


@cqulilujia
Contributor Author

/re-run all-failed

@paddle-ci-bot

paddle-ci-bot bot commented Jun 30, 2025

Sorry to inform you that the CI runs for b7bad79 passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

@cqulilujia
Contributor Author

/re-run cpu

@cqulilujia
Contributor Author

/re-run all-failed

) # with MB
mem_used = (
core.get_xpu_device_used_memory(dev_id) / 1024 / 1024
) # with MB
Contributor Author

[image]
Tested on a real model; no issues found.
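The snippet under review divides a byte count by 1024 twice, so the "MB" in its comment is a 1024-based unit (MiB). A minimal sketch of that conversion, using a hypothetical helper name `bytes_to_mib` and a made-up byte count in place of the value returned by `core.get_xpu_device_used_memory(dev_id)`:

```python
def bytes_to_mib(num_bytes):
    """Convert bytes to 1024-based megabytes, as the reviewed snippet does."""
    return num_bytes / 1024 / 1024


# Illustrative input only: 3 GiB expressed in bytes.
used_mib = bytes_to_mib(3 * 1024**3)

# A decimal megabyte (10**6 bytes) is slightly less than one 1024-based unit.
decimal_mb_in_mib = bytes_to_mib(1_000_000)
```

Here `used_mib` is 3072.0, while `decimal_mb_in_mib` is a little under 1, which is why the two conventions should not be mixed when comparing against vendor tools.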

Contributor

@lj970926 lj970926 left a comment

LGTM

Contributor

@RuohengMa RuohengMa left a comment

LGTM

Contributor

@HarperCy HarperCy left a comment

LGTM

Comment on lines +32 to +46
__all__ = [
    'synchronize',
    'device_count',
    'set_debug_level',
    'empty_cache',
    'max_memory_allocated',
    'max_memory_reserved',
    'reset_max_memory_allocated',
    'reset_max_memory_reserved',
    'memory_allocated',
    'memory_reserved',
    'memory_total',  # memory managed by runtime, not paddle
    'memory_used',  # memory managed by runtime, not paddle
]

Contributor

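These APIs can be added, but per the requirements for new APIs, Chinese documentation must be added in the docs repo so that users can search for and use them on the Paddle website. For details, see the official guide on documenting new APIs.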

Contributor Author

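A documentation PR has been submitted to the docs repo. Because the documentation pulls information from the code, the PR in the paddle repo must be merged first before the docs repo PR can be merged.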
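For context on why this thread treats additions to `__all__` as new public APIs: in Python, `__all__` decides which names `from module import *` exposes. The sketch below builds a throwaway module in memory to show that; the module name `toy_xpu` and the function `memory_secret` are hypothetical and have nothing to do with Paddle's code.

```python
import sys
import types

# Build a hypothetical module in memory so the example is self-contained.
toy = types.ModuleType("toy_xpu")
exec(
    "__all__ = ['memory_allocated']\n"
    "def memory_allocated():\n"
    "    return 0\n"
    "def memory_secret():\n"
    "    return 1\n",
    toy.__dict__,
)
sys.modules["toy_xpu"] = toy

# Star-import into a fresh namespace and see which names arrive.
namespace = {}
exec("from toy_xpu import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))

# Names left out of __all__ are still reachable by attribute access.
still_reachable = toy.memory_secret()
```

Only `memory_allocated` lands in `exported`, while `memory_secret` stays callable through the module object. Listing a name in `__all__` is therefore a statement about the intended public surface, not an access control.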

@cqulilujia
Contributor Author

/re-run all-failed

@cqulilujia cqulilujia closed this Jul 18, 2025
@cqulilujia cqulilujia reopened this Jul 18, 2025
Contributor

@sunzhongkai588 sunzhongkai588 left a comment

LGTM

The Chinese documentation needs to be synced in the docs repo.

@cqulilujia
Contributor Author

/re-run all-failed

@cqulilujia
Contributor Author

/re-run dcu

@cqulilujia
Contributor Author

/re-run all-failed

@QingshuChen QingshuChen merged commit 229e0a6 into PaddlePaddle:develop Jul 21, 2025
105 of 111 checks passed
@cqulilujia cqulilujia deleted the memory branch August 28, 2025 08:09
9 participants