[PaddleInference] compile optimization of weight_only_linear #56706
Conversation
Your PR was submitted successfully. Thank you for contributing to the open-source project!
One question: could these cta_shapes fail to compile under certain SM architectures?
Previously the full matrix of combinations should have been supported on every architecture; I can verify that again.
archs = [70, 75, 80]
Better not to hard-code the archs; make them a parameter of the script.
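The suggestion above could look something like the following sketch, which replaces the hard-coded `archs = [70, 75, 80]` with a command-line flag. The flag name `--cuda_arch` and the semicolon-separated format are assumptions for illustration, not the PR's actual interface:

```python
import argparse

def parse_archs(arch_str):
    # "70;75;80" -> [70, 75, 80]
    return [int(s) for s in arch_str.split(";") if s]

def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="generate fpA_intB_gemm instantiation files")
    parser.add_argument(
        "--cuda_arch",
        default="70;75;80",
        help="semicolon-separated list of target SM architectures")
    return parser.parse_args(argv)

if __name__ == "__main__":
    args = parse_args()
    archs = parse_archs(args.cuda_arch)
    print(archs)
```

This way the build system can forward only the architectures it is actually targeting, and the default preserves the current 70/75/80 behavior.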
I considered that, along with compiling only for the archs that are actually needed. However, there is one place in gemm_dispatch that explicitly references 70, 75, and 80; I'll see whether there is a way to rewrite it to match.
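One way to decouple gemm_dispatch from the hard-coded 70/75/80 is to have the generator also emit a small header of per-arch guard macros that the C++ dispatch code can `#ifdef` on. The thread mentions an `autogen/arch_define.h`; the sketch below assumes that role for it, and the macro name `USE_FPAINTB_GEMM_WITH_SM<arch>` is a hypothetical placeholder:

```python
def gen_arch_define(archs):
    # Emit one guard macro per generated architecture so the C++
    # dispatch code can conditionally compile per-arch branches
    # instead of hard-coding 70, 75, 80.
    lines = ["#pragma once"]
    for arch in archs:
        lines.append("#define USE_FPAINTB_GEMM_WITH_SM%d" % arch)
    return "\n".join(lines) + "\n"
```

The dispatch code would then wrap each `sm70`/`sm75`/`sm80` branch in the corresponding `#ifdef`, so skipping an arch in the generator also removes it from dispatch.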
Sorry to inform you that 8b27b5a's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
…ddle#56706) * separately-compiled fpA_intB_gemm
The header file paddle/phi/kernels/fusion/cutlass/cutlass_kernels/fpA_intB_gemm/autogen/arch_define.h doesn't exist.
PR types
Function optimization
PR changes
Others
Description
Split the compilation of the weight_only gemm kernels into separate translation units to increase compilation parallelism and speed up the build.
Compilation time improves from roughly 20 minutes to roughly 1 minute.
Pcard-74871
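The separate-compilation idea described in this PR can be sketched as a generator that emits one `.cu` file per (arch, cta_shape) combination, so each explicit template instantiation becomes an independent nvcc job. The template header name, function name, and file naming scheme below are illustrative placeholders, not the PR's actual code:

```python
# Instantiation stub emitted into each generated .cu file; the header
# and function names are hypothetical, for illustration only.
CU_TEMPLATE = """// autogenerated -- do not edit
#include "fpA_intB_gemm_template.h"
template void dispatch_gemm_sm{arch}_cta{m}x{n}x{k}();
"""

def gen_source_files(archs, cta_shapes):
    # One .cu file per (arch, cta_shape) pair -> one compiler job each,
    # so the build system can run them all in parallel instead of
    # instantiating every combination in a single translation unit.
    files = {}
    for arch in archs:
        for m, n, k in cta_shapes:
            name = "fpA_intB_gemm_sm%d_%dx%dx%d.cu" % (arch, m, n, k)
            files[name] = CU_TEMPLATE.format(arch=arch, m=m, n=n, k=k)
    return files
```

With the instantiations spread across many small files, a parallel build (`make -j` / ninja) compiles them concurrently, which is where the ~20 min to ~1 min improvement comes from.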