solved some npu bugs #32793
Conversation
Thanks for your contribution!
    for idx, op in enumerate(block.ops):
        if op.type == "check_finite_and_unscale":
            return idx

-    raise ValueError("check_finite_and_unscale does not exist in block")
+    if raise_error:
+        raise ValueError("check_finite_and_unscale does not exist in block")
the error message should be:
"amp is turn on but check_finite_and_unscale op does not exist in main block"
ok
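For context, a minimal sketch of the helper being discussed, assuming the raise_error flag shown in the diff above (the function name and the standalone form are assumptions, not the PR's exact code):

def _find_check_finite_and_unscale_idx(block, raise_error=False):
    # scan the block for the AMP unscale op and return its index
    for idx, op in enumerate(block.ops):
        if op.type == "check_finite_and_unscale":
            return idx
    # only complain when AMP is actually enabled for this program
    if raise_error:
        raise ValueError(
            "amp is turned on but check_finite_and_unscale op does not "
            "exist in main block")
    return None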
        accumulated_grad_names,
        core.op_proto_and_checker_maker.OpRole.Optimize,
        use_calc_stream=True)
    main_block, raise_error=self.user_defined_strategy.amp)
What is the reason for this modification? For the npu hang bug?
To solve npu hang bugs!
LGTM for Sharding
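A hypothetical call site tying raise_error to the user's AMP setting, as the diff above does (self.user_defined_strategy, main_block, and the helper name from the sketch earlier are assumed context):

amp_enabled = self.user_defined_strategy.amp
idx = _find_check_finite_and_unscale_idx(main_block, raise_error=amp_enabled)
if idx is None:
    # when AMP is off there is no unscale op to anchor on,
    # so the pass skips this step instead of raising
    pass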
@@ -28,6 +28,7 @@ class CRecvOpASCENDKernel : public framework::OpKernel<T> {
  void Compute(const framework::ExecutionContext& ctx) const override {
#if defined(PADDLE_WITH_ASCEND_CL)
    auto x = ctx.Output<framework::LoDTensor>("Out");
Suggested change:
-    auto x = ctx.Output<framework::LoDTensor>("Out");
+    auto out = ctx.Output<framework::LoDTensor>("Out");
@@ -1467,7 +1467,7 @@ def linear(x, weight, bias=None, name=None):
        }
        tmp = helper.create_variable_for_type_inference(dtype)
        helper.append_op(
-            type='matmul', inputs=inputs, outputs={'Out': tmp}, attrs=attrs)
+            type='matmul_v2', inputs=inputs, outputs={'Out': tmp}, attrs=attrs)
Why change here? Is it consistent?
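For reference, a short usage sketch of the public API touched here, paddle.nn.functional.linear; per the diff, its static-graph path now appends a matmul_v2 op instead of matmul (the shapes below are arbitrary):

import paddle

x = paddle.randn([3, 4])
weight = paddle.randn([4, 5])
bias = paddle.zeros([5])

# y = x @ weight + bias
y = paddle.nn.functional.linear(x, weight, bias)
print(y.shape)  # [3, 5]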
raise ValueError("check_finite_and_unscale does not exist in block") | ||
if raise_error: | ||
raise ValueError( | ||
"amp is turn on but check_finite_and_unscale op does not exist in main block" |
turn -> turned
if (type_ != "reshape2" && type_ != "reshape2_grad") { | ||
original_tensor->Resize(original_dims); | ||
} |
As discussed, changing it here is a temporary solution, so please add some explanatory notes.
python/paddle/fluid/dataset.py (outdated diff)
@@ -251,6 +251,8 @@ def set_use_var(self, var_list):
                slot_var.type = "float"
            elif var.dtype == core.VarDesc.VarType.INT64:
                slot_var.type = "uint64"
+            elif var.dtype == core.VarDesc.VarType.INT32:
+                slot_var.type = "uint32"
            else:
                raise ValueError(
                    "Currently, fluid.dataset only supports dtype=float32 and dtype=int64"
Add INT32?
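A minimal sketch of the path this extends, assuming the usual fluid.DatasetFactory workflow (the variable names are made up): with the added branch, int32 slots map to "uint32" instead of raising.

import paddle.fluid as fluid

ids = fluid.data(name="ids", shape=[None, 1], dtype="int32")     # previously rejected
label = fluid.data(name="label", shape=[None, 1], dtype="int64")

dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
dataset.set_use_var([ids, label])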
"for pipeline parallelism.") | ||
assert dev_type == "gpu" or dev_type == 'npu', ( | ||
"Now only gpu and npu devices are supported " | ||
"for pipeline parallelism.") |
How to deal with npu:all?
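To make the question concrete, a hypothetical helper (not part of the PR) showing one way a device spec such as "gpu:0", "npu:3", or "npu:all" could be split and validated:

def _parse_device(device):
    # split "npu:all" / "gpu:0" into a device type and an index spec
    dev_type, _, dev_index = device.partition(":")
    assert dev_type == "gpu" or dev_type == 'npu', (
        "Now only gpu and npu devices are supported "
        "for pipeline parallelism.")
    if dev_index in ("", "all"):
        return dev_type, None   # None meaning "use every visible device"
    return dev_type, int(dev_index)

print(_parse_device("npu:all"))  # ('npu', None)
print(_parse_device("gpu:2"))    # ('gpu', 2)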
    HcclDataType dtype = platform::ToHCCLDataType(x->type());
    auto out = ctx.Output<framework::LoDTensor>("Out");
    out->mutable_data<T>(out->dims(), ctx.GetPlace());
    void* ptr = reinterpret_cast<void*>(const_cast<T*>(out->data<T>()));
Not important.
Suggested change:
-    void* ptr = reinterpret_cast<void*>(const_cast<T*>(out->data<T>()));
+    void* ptr = out->data<void>();
LGTM
I see that shard_index has a deprecation note.
So for maintaining this API, should it be modified in place under fluid, or does it need to be migrated?
lgtm
LGTM for pp
PR types
Bug fixes
PR changes
Others
Describe
The shard_index API supports two types, int32 and int64.
Input indices can have data type int64 or int32; their last dimension must be 1.
situation1:
situation2:
solved some npu bugs
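A short usage sketch of paddle.shard_index matching the description above; int64 is shown, and per this PR int32 indices are accepted as well (the concrete numbers are only illustrative):

import paddle

label = paddle.to_tensor([[16], [1]], dtype="int64")  # last dimension must be 1
# 20 indices split across 2 shards, so each shard owns 10 of them
shard_label = paddle.shard_index(label, index_num=20, nshards=2, shard_id=0)
print(shard_label.numpy())  # [[-1], [1]]: index 16 belongs to shard 1, so it becomes ignore_value (-1)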