[PaddleInference]compile optimization of weight_only_linear #56706

Merged
merged 10 commits into from
Sep 14, 2023

Contributor

@lizhenyun01 lizhenyun01 commented Aug 28, 2023

PR types

Function optimization

PR changes

Others

Description

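Split the weight_only GEMM kernels into separate compilation units to increase build parallelism and speed up compilation.
Compile time drops from about 20 min to about 1 min.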
Pcard-74871
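
A minimal sketch of the splitting idea described above, assuming a Python generator script that writes one source file per (arch, dtype, cta_shape) combination so the build system can compile them in parallel. The template text, the `hypothetical_gemm_launcher` symbol, the file naming, and the `generate_sources` helper are all illustrative assumptions, not the code in this PR.

```python
import itertools
import os

# Hypothetical template: one explicit instantiation per generated file.
# The real generated source in this PR is not reproduced here.
TEMPLATE = """// Auto-generated file, do not edit.
template void hypothetical_gemm_launcher<{dtype}, {arch}, {m}, {n}, {k}>();
"""


def generate_sources(out_dir, archs, dtypes, cta_shapes):
    """Write one .cu file per combination and return the list of paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for arch, dtype, (m, n, k) in itertools.product(archs, dtypes, cta_shapes):
        name = f"gemm_sm{arch}_{dtype}_{m}x{n}x{k}.cu"
        path = os.path.join(out_dir, name)
        with open(path, "w") as f:
            f.write(TEMPLATE.format(dtype=dtype, arch=arch, m=m, n=n, k=k))
        paths.append(path)
    return paths
```

Because each file holds a single instantiation, a parallel build can hand each one to a separate compiler process instead of compiling every instantiation in one translation unit.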

@paddle-bot
paddle-bot bot commented Aug 28, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@lizhenyun01 lizhenyun01 changed the title Weight only [PaddleInference]compile optimization of weight_only_linear Aug 28, 2023
Contributor

@vivienfanghuagood vivienfanghuagood left a comment

One question: could these cta_shape values fail to compile on different SM architectures?

@lizhenyun01
Contributor Author

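One question: could these cta_shape values fail to compile on different SM architectures?

The previous code should already have covered the full matrix of combinations, so all of them should be supported. I can verify this again.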

}
"""

archs = [70, 75, 80]
Contributor

Hard-coding archs is not recommended; please make it a script argument.

Contributor Author

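I have considered this, along with compiling only for the archs that are actually needed. However, gemm_dispatch explicitly references 70, 75, 80 in one place, so I'll see whether there is a way to rewrite it to match.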
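
A minimal sketch of the reviewer's suggestion, assuming the generator script uses `argparse` from the Python standard library. The `--archs` flag name and the `parse_archs` helper are hypothetical; the default simply mirrors the `archs = [70, 75, 80]` line quoted above.

```python
import argparse


def parse_archs(argv=None):
    """Parse the target SM architectures from the command line.

    The --archs flag is a hypothetical name; the default matches the
    previously hard-coded list [70, 75, 80].
    """
    parser = argparse.ArgumentParser(
        description="Generate weight-only GEMM kernel sources."
    )
    parser.add_argument(
        "--archs",
        type=int,
        nargs="+",
        default=[70, 75, 80],
        help="SM architectures to generate kernels for, e.g. --archs 75 80",
    )
    args = parser.parse_args(argv)
    # De-duplicate and sort so the generated file list is deterministic.
    return sorted(set(args.archs))
```

Calling `parse_archs(["--archs", "80"])` would restrict generation to SM80 only. As noted in the reply, any dispatch code that references 70, 75, and 80 explicitly would still need a matching change.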

@paddle-ci-bot

paddle-ci-bot bot commented Sep 12, 2023

Sorry to inform you that the CIs for 8b27b5a passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

@heavengate heavengate merged commit 54864b6 into PaddlePaddle:develop Sep 14, 2023
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
@TimeYWL
Contributor

TimeYWL commented Jan 19, 2024

The header file paddle/phi/kernels/fusion/cutlass/cutlass_kernels/fpA_intB_gemm/autogen/arch_define.h does not exist.

@lizhenyun01 lizhenyun01 deleted the weight_only branch July 18, 2024 09:10
4 participants