prepare_gradient_aggregation for non-leaf output of PartialProgramLayer #44893

Merged

Conversation

@2742195759 (Contributor) commented Aug 4, 2022

PR types

Others

PR changes

Others

Describe

  1. Add prepare_gradient_aggregation in PartialProgramLayer.
  2. Add a unittest case for prepare_gradient_aggregation.
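The change targets the case where a non-leaf intermediate tensor is also an output of the traced program, so its gradient must aggregate contributions from every consumer. A minimal dygraph-to-static repro sketch (names and shapes are illustrative, not taken from the PR's unittest):

```python
import paddle

@paddle.jit.to_static
def net(x):
    y = x * 2   # non-leaf intermediate ...
    z = y + 1   # ... consumed by a later op
    return y, z # ... and also returned as a program output

x = paddle.to_tensor([1.0], stop_gradient=False)
y, z = net(x)
loss = y.sum() + z.sum()
loss.backward()  # y@GRAD must sum the gradients from both uses of y
print(x.grad)
```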

@paddle-bot commented Aug 4, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

for in_arg in op.input_arg_names:
    if in_arg == var.name:
        return True
return False
Contributor:

return var.name in op.input_arg_names

Contributor (Author):

It seems the logic would not be equivalent then.
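For the fragment as quoted, the two forms can be compared directly; a toy check (FakeOp is a stand-in, and the full helper in the PR may carry context not shown here):

```python
class FakeOp:
    input_arg_names = ["x", "y@GRAD"]

def loop_form(op, name):
    for in_arg in op.input_arg_names:
        if in_arg == name:
            return True
    return False

op = FakeOp()
for name in ["y@GRAD", "missing"]:
    assert loop_form(op, name) == (name in op.input_arg_names)
```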

@@ -287,6 +287,63 @@ def _verify_program(self, main_program):

        return main_program

    def prepare_gradient_aggregation(self, main_program, target_program):
        # Why do we need to add a reverse gradient aggregation operation?
Contributor:

Function comments are best written in the docstring format:

"""
xxxx
"""
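A minimal sketch of the suggested format (the docstring text is illustrative):

```python
def prepare_gradient_aggregation(self, main_program, target_program):
    """
    Why do we need to add a reverse gradient aggregation operation?
    ...
    """
```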

    lambda x: any([
        out_arg == var_grad_name
        for out_arg in x[1].output_arg_names
    ]), enumerate(target_program.block(0).ops)))
Contributor:

Why use enumerate here?

Contributor (Author):

The value produced by enumerate is the insertion idx; it is used later when inserting the Op.
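A toy illustration of the point: filtering `enumerate(ops)` keeps each op's position, which a plain `filter(..., ops)` would lose (op names here are stand-ins):

```python
ops = ["fill_constant", "matmul_grad", "elementwise_add_grad"]
hits = list(filter(lambda x: x[1].endswith("_grad"), enumerate(ops)))
print(hits)  # [(1, 'matmul_grad'), (2, 'elementwise_add_grad')] -> idx kept for insertion
```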

    return False

def _insert_aggregation_ops_for_var(target_program, var):
    var_grad_name = var.name + "@GRAD"
Contributor:

Better not to hard-code + "@GRAD" here; the framework has a unified API for the grad suffix.
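A hedged sketch of the unified API being referred to, assuming the `fluid` helpers present in the Paddle 2.x codebase (`grad_var_name` and `core.grad_var_suffix()`):

```python
from paddle.fluid import core
from paddle.fluid.framework import grad_var_name

name = "fc_0.tmp_0"  # illustrative variable name
# instead of name + "@GRAD":
assert grad_var_name(name) == name + core.grad_var_suffix()  # "fc_0.tmp_0@GRAD"
```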

# len(finded_ops) may be > 1, because there may be a fill_constant op.
if len(finded_ops) == 0:
    return None
suffix = "@dy2static"
Contributor @Aurelius84 commented Aug 4, 2022:

Wouldn't it be better to use var_name + _dy2static + grad_suffix here?
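A sketch of the suggested naming, under the same assumption as above that `core.grad_var_suffix()` provides the grad suffix:

```python
from paddle.fluid import core

name = "fc_0.tmp_0"  # illustrative variable name
new_grad_name = name + "_dy2static" + core.grad_var_suffix()
print(new_grad_name)  # fc_0.tmp_0_dy2static@GRAD
```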

Contributor @Aurelius84 left a comment:
LGTM

@2742195759 merged commit f694e99 into PaddlePaddle:develop on Aug 10, 2022