Refactor recompute #45348
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
Some comments.
python/paddle/distributed/fleet/meta_parallel/pp_utils/p2p_communication.py
Can this PR support data parallelism, the kind where gradients do not need to be merged manually?
That needs to be added.
This cannot be rushed!
python/paddle/distributed/fleet/meta_parallel/parallel_layers/pp_layers.py
python/paddle/fluid/tests/unittests/collective/multinode/dygraph_hybrid_recompute.py
python/paddle/fluid/tests/unittests/collective/fleet/test_dygraph_recompute_for_eager.py
python/paddle/fluid/tests/unittests/collective/fleet/test_dygraph_recompute.py
Force-pushed from a94f5e1 to a29cb3f
LGTM
python/paddle/distributed/fleet/meta_parallel/pp_utils/utils.py
@@ -30,7 +30,10 @@
 ch.setFormatter(formatter)
 logger.addHandler(ch)

-__all__ = []
+__all__ = [
There is no need to repeat this in __all__ here; it makes the API path even longer:
paddle.distributed.fleet.recompute.recompute.recompute
Agreed that functions that are not part of the public interface do not need to go into __all__. Placing it in this module's __init__.py as the module's interface is enough.
But I don't quite understand why it is called as paddle.distributed.fleet.recompute.recompute.recompute!
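A minimal sketch of the re-export pattern suggested above, assuming the module lives under python/paddle/distributed/fleet/recompute/ and that the entry points are defined in recompute.py and recompute_hybrid.py (the file names and import lines are assumptions for illustration, not the confirmed layout of this PR):

# python/paddle/distributed/fleet/recompute/__init__.py  (sketch)
# Re-export the public entry points so users can call fleet.recompute(...)
# instead of the repeated-module path
# paddle.distributed.fleet.recompute.recompute.recompute(...).
from .recompute import recompute  # noqa: F401
from .recompute_hybrid import recompute_sequential, recompute_hybrid  # noqa: F401

# Internal helpers stay out of the public API list.
__all__ = []

With a re-export like this, the short fleet.recompute path can remain the documented interface and the long dotted path never needs to appear in user code.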
DONE
a99c046
Force-pushed from 4af8ad5 to a249601
Since you haven't replied for more than a year, we have closed this issue/PR.
PR types
Others
PR changes
Others
Describe
Refactor the recompute API.
Users can use the following fleet APIs for recompute (a hedged usage sketch follows the remarks below):
1. fleet.recompute()
2. fleet.recompute_sequential()  # for Sequential models
3. fleet.recompute_hybrid()  # for hybrid parallelism; recompute supports offload and activation functions
These internal changes are transparent to users.
Remarks:
(1) fleet.recompute_sequential and fleet.recompute_hybrid are newly added APIs.
(2) The directory of the recompute API has been changed for easier use.
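A hedged usage sketch of the basic entry point listed above. The two-layer block, tensor shapes, and the recompute(function, *args) call pattern are assumptions modeled on Paddle's existing recompute utility; only the fleet.recompute name itself comes from this PR, and recompute_sequential / recompute_hybrid follow the same idea for Sequential models and hybrid parallelism:

import paddle
from paddle.distributed import fleet

# Hypothetical two-layer block used only for illustration.
linear1 = paddle.nn.Linear(16, 16)
linear2 = paddle.nn.Linear(16, 1)

def block(inp):
    # The forward pass of this block is re-run during backward,
    # so its intermediate activations do not stay in memory.
    return linear2(paddle.nn.functional.relu(linear1(inp)))

x = paddle.randn([4, 16])
x.stop_gradient = False

# Wrap the block with recompute instead of calling it directly.
y = fleet.recompute(block, x)
y.mean().backward()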