[CINN][Add Backend Pass Comment No.10] Add comment for replace_cross_thread_reduction #70227

Open
wants to merge 5 commits into
base: develop

KDZZZZZZ

PR types
CINN

PR changes
Others

Description
Added comments for the replace_cross_thread_reduction pass.


paddle-bot bot commented Dec 14, 2024

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Dec 14, 2024
@luotao1 luotao1 added the HappyOpenSource 快乐开源活动issue与PR label Dec 16, 2024
/**
* A pass that optimizes cross-thread reduction operations on GPU by replacing them with more efficient implementations.
*
* [Detailed application scenario]
Contributor

The text in square brackets is a template hint; just delete it in the final comment.

Comment on lines 28 to 30
* Replace cross thread reduction to external call.
*/
void ReplaceCrossThreadReduction(ir::LoweredFunc fn);
Contributor

Put the comment above the function.

Comment on lines 71 to 75
* TODO:
* - Support more reduction operations (e.g., custom reduction functions)
* - Add dynamic selection of reduction methods based on input size
* - Optimize shared memory allocation for better bank conflicts avoidance
* - Add support for multi-warp reductions within a block
Contributor

There is no need to add TODOs that the original function does not have; this section is optional.


paddle-ci-bot bot commented Dec 26, 2024

Sorry to inform you that the CIs for f48074a passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

Comment on lines 35 to 40
* This pass is applicable in scenarios where multiple GPU threads need to perform reduction operations (like sum, max, min)
* across thread boundaries. These scenarios are common in deep learning workloads, particularly in operations like:
* - Computing sum/mean across feature dimensions
* - Global pooling operations
* - Softmax normalization
* - Gradient aggregation in distributed training
Contributor

This description is too high-level; it should describe the scenario in terms of backend IR optimization.

*
*
* When applied, this pass will:
* 1. Identify reduction operations in GPU-bound loops
Contributor

This pass mainly deals with cross_thread reductions.

Comment on lines 73 to 77
* for (i = 0; i < 1024; i++) {
* if (i < n) {
* sum += data[i];
* }
* }
Contributor

The IR should show the GPU thread binding.
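
For readers unfamiliar with the term, the following is a minimal sketch of what a cross-thread reduction computes once the loop index `i` in the snippet above is bound to a GPU thread index such as `threadIdx.x`. It is a CPU-only simulation in plain C++, not CINN's implementation and not CINN IR; the helper name `SimulateBlockReduceSum` and the parameter `block_dim` are hypothetical, and a power-of-two block size is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CINN: simulates on the CPU what a
// block-wide sum over thread-bound iterations computes. Each simulated
// thread `tid` owns one partial value; the tree loop mimics a
// shared-memory reduction across threads.
float SimulateBlockReduceSum(const std::vector<float>& data, std::size_t n,
                             std::size_t block_dim) {
  // Assumption: block_dim is a non-zero power of two.
  assert(block_dim > 0 && (block_dim & (block_dim - 1)) == 0);

  // Step 1: every thread stores its element, or the identity 0 when the
  // predicate `tid < n` fails, into a buffer standing in for shared memory.
  std::vector<float> shared(block_dim, 0.0f);
  for (std::size_t tid = 0; tid < block_dim; ++tid) {
    if (tid < n && tid < data.size()) shared[tid] = data[tid];
  }

  // Step 2: tree reduction. In each round, thread `tid` adds the value
  // held by thread `tid + stride`. On a GPU, a barrier such as
  // __syncthreads() would separate the rounds.
  for (std::size_t stride = block_dim / 2; stride > 0; stride /= 2) {
    for (std::size_t tid = 0; tid < stride; ++tid) {
      shared[tid] += shared[tid + stride];
    }
  }
  return shared[0];  // Thread 0 ends up holding the block-wide sum.
}
```

The point of the sketch is that after thread binding there is no serial loop left: each thread contributes one value and the sum has to be combined across threads, which is the pattern the pass comment is expected to show.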

Contributor

luotao1 commented Jan 7, 2025

PR-CI-Codestyle-Check needs to pass.

Contributor

luotao1 commented Jan 13, 2025

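@KDZZZZZZ PR-CI-Codestyle-Check still has not passed; please update the PR again.

  • If you only changed documentation, you can add test=document_fix to the commit message to speed up CI.
  • If there is no update by 18:00 on January 14, the PR will be handed over to the Paddle R&D team.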

cc @Hongqing-work

Labels
contributor External developers HappyOpenSource 快乐开源活动issue与PR
4 participants