[HybridParallel]Fix c_split op for TensorParallel #33207
Thanks for your contribution!
LGTM
LGTM for PADDLE_ENFORCE in kernel, please fix comment in next PR
PADDLE_ENFORCE_EQ(
    table_dims.size(), 2,
    platform::errors::InvalidArgument(
        "ShapeError: The dimensions of the 'c_embedding' must be 2. "
Remove the "ShapeError: " prefix here.
OK, will fix in the next PR.
PR types
Function optimization
PR changes
Others
Describe
[HybridParallel]Fix c_split op for TensorParallel
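Optimize MP (model parallel) performance. The optimizations are as follows:
1. Fuse the embedding op: fuse shard_index + embedding into a single op, and restore the embedding weight to its normal dimensions.
2. Fix the bug where the _c_concat API was not passed the rank information.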