nn.Graph reuse eager lbn without create duplicate variable op #6981
Can this PR also resolve this issue at the same time? https://github.com/Oneflow-Inc/OneTeam/issues/827
Yes. It resolves that one as well.
@@ -31,6 +31,9 @@ class TensorNameScope {

  void Record(const std::shared_ptr<Tensor>& tensor, const std::string& name);

  // NOTE(chengcheng): TensorNameScope need to be cleared after current graph build.
The interface declaration here should not need a comment that belongs at the place where it is used.
if (!opt_lbn.empty()) {
  // NOTE(chengcheng): This eager tensor has been fed as variable op before, so we just use the
  // lbn, and will NOT create duplicate variable op again.
  (*outputs)[0] = input_tensor;
As I recall, the plan was to change this to return a lazy tensor?
This PR is closed. The follow-up work is handled separately by:
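When nn.Graph captures the Eager Tensors (parameters) of an nn.Module, it does not account for the same tensor being passed in more than once. As a result, a single tensor pointer that has several different names within the module gets a separate Variable Op created for each name. This does not affect correctness (the multiple Variable Ops are bound to the same Tensor memory), but in multi-GPU settings it causes extra gradient synchronization operations and hurts performance.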
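The deduplication idea can be illustrated with a minimal, self-contained sketch. This is not OneFlow's actual implementation: `Tensor`, `SimpleTensorNameScope`, `SimpleGraphBuilder`, `Lookup`, `FeedVariable`, and the `"/out"` naming scheme are all hypothetical stand-ins chosen for illustration. The sketch only shows the general technique of keying on the tensor pointer so that a second name for the same tensor reuses the existing lbn instead of creating another variable op.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an eager tensor; not OneFlow's real Tensor class.
struct Tensor {};

// Simplified model of a name scope: maps a tensor pointer to the logical
// blob name (lbn) of the variable op that was already created for it.
class SimpleTensorNameScope {
 public:
  // Returns the recorded lbn, or an empty string if the tensor is unknown.
  std::string Lookup(const std::shared_ptr<Tensor>& tensor) const {
    auto it = tensor2lbn_.find(tensor.get());
    return it == tensor2lbn_.end() ? std::string() : it->second;
  }

  void Record(const std::shared_ptr<Tensor>& tensor, const std::string& lbn) {
    tensor2lbn_[tensor.get()] = lbn;
  }

 private:
  std::unordered_map<const Tensor*, std::string> tensor2lbn_;
};

// Hypothetical graph builder that creates at most one variable op per tensor.
struct SimpleGraphBuilder {
  SimpleTensorNameScope scope;
  std::vector<std::string> variable_ops;  // names of the variable ops created

  std::string FeedVariable(const std::shared_ptr<Tensor>& tensor,
                           const std::string& name) {
    const std::string opt_lbn = scope.Lookup(tensor);
    if (!opt_lbn.empty()) {
      // Same tensor pointer seen under another name: reuse the existing lbn.
      return opt_lbn;
    }
    const std::string lbn = name + "/out";  // assumed naming scheme
    variable_ops.push_back(name);
    scope.Record(tensor, lbn);
    return lbn;
  }
};
```

In this sketch, feeding one shared tensor under two module names (for example, tied weights) yields the same lbn both times and leaves only one entry in `variable_ops`, so only one gradient synchronization would be associated with it.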
This problem was reported in issue:
This PR also adds clearing of the TensorNameScope after every Graph build, which resolves the problem reported in:
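One way to make sure a name scope is cleared after each graph build is a small RAII guard, sketched below. Whether the PR uses a guard or an explicit call is not stated here, so treat this purely as an illustration: `SimpleNameScope`, `ScopeClearGuard`, `BuildGraph`, and `Has` are hypothetical names, and the scope is keyed by an opaque address rather than a real tensor type.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical simplified name scope keyed by an opaque tensor address.
class SimpleNameScope {
 public:
  void Record(const void* tensor, const std::string& lbn) { tensor2lbn_[tensor] = lbn; }
  bool Has(const void* tensor) const { return tensor2lbn_.count(tensor) > 0; }
  void Clear() { tensor2lbn_.clear(); }

 private:
  std::unordered_map<const void*, std::string> tensor2lbn_;
};

// RAII guard: clears the scope when the graph build it wraps finishes,
// including when the build exits early.
class ScopeClearGuard {
 public:
  explicit ScopeClearGuard(SimpleNameScope* scope) : scope_(scope) {}
  ~ScopeClearGuard() { scope_->Clear(); }
  ScopeClearGuard(const ScopeClearGuard&) = delete;
  ScopeClearGuard& operator=(const ScopeClearGuard&) = delete;

 private:
  SimpleNameScope* scope_;
};

// Hypothetical build step: records one tensor and reports whether the scope
// already knew it when the build started.
bool BuildGraph(SimpleNameScope* scope, const void* tensor, const std::string& lbn) {
  ScopeClearGuard guard(scope);
  const bool seen_before = scope->Has(tensor);
  scope->Record(tensor, lbn);
  return seen_before;
}
```

In this simplified model, building two graphs in a row with the same tensor never sees a stale entry from the first build, because the guard empties the scope as each build returns.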