Refine distribute transpiler #8241
@@ -376,7 +385,7 @@ def _append_pserver_ops(self, program, pserver_program, opt_op, endpoint):
 # param is already created on global program
 param_block = None
 for p in self.param_grad_ep_mapping[endpoint]["params"]:
-    if same_or_split_var(p.name, var.name):
+    if same_or_split_var(p.name, opt_op.input(key)):
Will opt_op.input(key) work, or do we need opt_op.input(key)[0]?
if key in ["Param", "Grad"]:
    continue
# update accumulator variable shape
param_shape = new_inputs["Param"].shape
var = program.global_block().vars[opt_op.input(key)]
Will opt_op.input(key) work, or do we need opt_op.input(key)[0]?
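To illustrate the concern behind this question, here is a minimal sketch. It assumes that input(key) returns a list of variable names rather than a single name; FakeOp, lookup_accumulator, and the "Moment" slot are hypothetical stand-ins, not the framework's actual classes.

```python
class FakeOp:
    """Hypothetical stand-in for an operator; not the real framework class."""

    def __init__(self, inputs):
        self._inputs = inputs  # maps slot name -> list of variable names

    def input(self, key):
        # Assumption: a slot may hold several variables, so a list is returned.
        return self._inputs[key]


def lookup_accumulator(block_vars, op, key):
    """Look up the variable for `key` using the first name in the slot."""
    names = op.input(key)
    return block_vars[names[0]]


op = FakeOp({"Moment": ["moment_0"]})
block_vars = {"moment_0": "moment_0 variable"}

# A string never equals a list, so comparing a name with the raw result
# is silently False instead of raising.
names_match = "moment_0" == op.input("Moment")
first_matches = "moment_0" == op.input("Moment")[0]

# A list is unhashable, so using it as a dict key raises TypeError.
try:
    block_vars[op.input("Moment")]
    list_lookup_error = None
except TypeError as exc:
    list_lookup_error = exc

found = lookup_accumulator(block_vars, op, "Moment")
```

If the real input(key) does return a single name, the [0] is unnecessary; the sketch only shows what goes wrong when it returns a list.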
dtype=var.dtype,
shape=var.shape)
try:
    pserver_program.global_block().create_var(
Can we look up the var, compare the already-created var with the current one, and log an error if they are not equal?
Done and updated. The var is now looked up from the block. I did not add an error log because this is not an error; this will be refined as part of #7700.
Thanks! Just to double-check: wouldn't it be an error if two vars have the same name but their configuration (e.g., dim) differs?
In the current implementation, listen_and_serv runs optimize_sub_program with an executor that does not create variables in the sub-scope (create_vars=false). The variables only need to exist in pserver_program, and we must create them here just so that compile time passes. Once we move optimize_sub_program into a sub-block of pserver_program, this will no longer be a problem. I will do this in the next PR.
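The look-up-and-compare check discussed in this thread could be sketched roughly as follows. VarSpec, get_or_create_var, and the plain dict used as a block are hypothetical illustrations, not framework types. Whether a mismatch should be treated as an error is exactly what the thread leaves open, so the sketch raises only to make the check visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarSpec:
    """Hypothetical description of a variable; not a framework type."""
    name: str
    shape: tuple
    dtype: str


def get_or_create_var(block_vars, spec):
    """Return the existing var named spec.name, or register spec.

    Raises ValueError when a var with the same name already exists
    but differs in shape or dtype.
    """
    existing = block_vars.get(spec.name)
    if existing is None:
        block_vars[spec.name] = spec
        return spec
    if existing != spec:
        raise ValueError(
            f"var {spec.name!r} already exists as {existing}, got {spec}"
        )
    return existing


block_vars = {}
first = get_or_create_var(block_vars, VarSpec("moment_0", (10, 4), "float32"))
same = get_or_create_var(block_vars, VarSpec("moment_0", (10, 4), "float32"))

# Same name, different shape: the comparison catches the conflict.
try:
    get_or_create_var(block_vars, VarSpec("moment_0", (5, 4), "float32"))
    conflict = None
except ValueError as exc:
    conflict = exc
```

Replacing the raise with a log call would give the softer behavior the reviewer first suggested.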
Thanks!
LGTM!
Fix #8225
Related work: #8049