[CINN] Dump more compilation result and optimize parallel compiler flags #55935
1. Unify FLAGS_cinn_parallel_compile_size and FLAGS_cinn_parallel_compile_thread.
2. Add flags to dump more compilation info.
3. Support dumping lower_func, source_code, ptx and instruction.
4. Return more compilation info from the parallel compiler to the graph compiler.
5. Refactor graph visualization.
6. Refactor the parallel compiler's task split.
Your PR was submitted successfully. Thank you for contributing to this open-source project!
LGTM
LGTM
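…ags (PaddlePaddle#55935)
1. `Parallel Compiler`:
   - Merge `FLAGS_cinn_parallel_compile_size` and `FLAGS_cinn_parallel_compile_thread`. The number of compilation threads is now set through `FLAGS_cinn_parallel_compile_thread` alone, and all `fusion_groups` are distributed evenly across the available threads.
   - Return more information once compilation finishes: in addition to `instruction`, the `lowered_function`, `source_code` and `source_ptx` are returned for further use by upper layers.
2. Debug information:
   - Add `FLAGS_cinn_dump_group_lowered_func`, `FLAGS_cinn_dump_group_source_code`, `FLAGS_cinn_dump_group_ptx` and `FLAGS_cinn_dump_group_instruction`, which store the intermediate code of each compilation stage, grouped by `fusion_groups`.
   - Reorganize `graph_visualization` so that all visualization graphs and unit-test code are stored correctly by group.
3. Bug fixes:
   - Fix `MakeDirectory` failing to create directories correctly.
4. Other:
   - Remove some unused code.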
PR types
Others
PR changes
Others
Description
Pcard-72511
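`Parallel Compiler`:
- Merge `FLAGS_cinn_parallel_compile_size` and `FLAGS_cinn_parallel_compile_thread`. The number of compilation threads is now set through `FLAGS_cinn_parallel_compile_thread` alone, and all `fusion_groups` are distributed evenly across the available threads.
- Return more information once compilation finishes: in addition to `instruction`, the `lowered_function`, `source_code` and `source_ptx` are returned for further use by upper layers.

Debug information:
- Add `FLAGS_cinn_dump_group_lowered_func`, `FLAGS_cinn_dump_group_source_code`, `FLAGS_cinn_dump_group_ptx` and `FLAGS_cinn_dump_group_instruction`, which store the intermediate code of each compilation stage, grouped by `fusion_groups`.
- Reorganize `graph_visualization` so that all visualization graphs and unit-test code are stored correctly by group.

Bug fixes:
- Fix `MakeDirectory` failing to create directories correctly.

Other:
- Remove some unused code.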
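
To illustrate what distributing `fusion_groups` evenly across the available threads could look like, here is a minimal Python sketch. It is not the code from this PR: the function name `split_evenly`, the contiguous chunking, and the rule that the first chunks absorb the remainder are all assumptions made for illustration. The actual C++ task split in the parallel compiler may differ.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_evenly(groups: Sequence[T], num_threads: int) -> List[List[T]]:
    """Partition `groups` into contiguous chunks whose sizes differ by at most one.

    Hypothetical helper: `num_threads` stands in for the value of
    FLAGS_cinn_parallel_compile_thread. Never creates more chunks than groups.
    """
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")

    # Do not spawn workers that would have nothing to compile.
    workers = min(num_threads, len(groups))
    if workers == 0:
        return []

    # Every worker gets `base` groups; the first `extra` workers get one more.
    base, extra = divmod(len(groups), workers)

    chunks: List[List[T]] = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        chunks.append(list(groups[start:start + size]))
        start += size
    return chunks
```

With 10 groups and 4 threads this yields chunk sizes 3, 3, 2, 2. With fewer groups than threads, each group gets its own chunk and the remaining threads stay idle.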