Adapting device-specific Extra Attributes for the PHI kernel #46342
Conversation
Your PR has been submitted successfully. Thank you for your contribution to the open-source project!
LGTM
{"fuse_alpha", ExtraAttrProperty::ONEDNN}, | ||
{"fuse_beta", ExtraAttrProperty::ONEDNN}, | ||
{"fuse_brelu", ExtraAttrProperty::ONEDNN}, | ||
{"fuse_brelu_threshold", ExtraAttrProperty::ONEDNN}, |
`fuse_brelu` and `fuse_brelu_threshold` are no longer used by any oneDNN kernel. Is there any compatibility issue preventing us from deleting these attrs?
Thanks, I deleted these two attributes. If there are other deprecated attributes, they can also be removed.
We will update this list along with the second phase of kernel migration.
{"use_cudnn", ExtraAttrProperty::SCHEDULE}, | ||
{"use_mkldnn", ExtraAttrProperty::SCHEDULE}, | ||
// ONEDNN dedicated attributes | ||
{"Bias", ExtraAttrProperty::ONEDNN}, |
Can new extra oneDNN-specific attributes be added in the future?
In the process of kernel migration, if there are missing attributes, you can add them temporarily. However, in the long run, it is recommended to add a fusion op and kernel, and to replace the standard op with the fusion op in the pass-processing stage to support the additional functionality, so as to avoid making the implementation of the basic op kernels increasingly complex.
We've run the first performance and functional tests, and this PR passed everything with the expected results. We will run one more test on another platform to make sure that everything is OK. Tests will be finished by the end of this week.
@Silv3S Hello, are the tests finished?
LGTM
LGTM
LGTM
```cpp
const std::vector<std::string>& GetInputsName(
    const std::string& input) const {
  auto it = inputs_name_.find(input);
  PADDLE_ENFORCE_NE(it,
                    inputs_name_.end(),
                    phi::errors::NotFound(
                        "OneDnnContext does not have the input %s.", input));
  return it->second;
}
```
I think the name of this function is similar to `SetInputsName`, but the getter doesn't return what the setter is setting, which might be confusing at first sight. This could have a more distinct or precise name. The same goes for `SetOutputsName` and `GetOutputsName`.
```cpp
// Holds some attributes only used by the onednn kernel calculation
// Since original mkldnn op kernel directly adds the operations that require
// fusion to the native kernel operations, and uses the attribute `fuse_xxx`
// to control, for onednn, there will be some attributes that seem to be
// independent of the device are also saved here.
// Here, the operation of fusion needs to be implemented separately as
// a fusion op and kernel, instead of patching it to a basic operation.
// Because DeviceContext is a global singleton, you need to ensure thread
// safety, use the thread_local variable
```
Suggested change:

```cpp
// Holds some attributes only used by the onednn kernel calculation.
// Since original onednn op kernel directly adds the operations that require
// fusion to the native kernel operations, and uses the attribute `fuse_xxx`
// to control, for onednn, there will be some attributes that seem to be
// independent of the device are also saved here.
// Here, the operation of fusion needs to be implemented separately as
// a fusion op and kernel, instead of patching it to a basic operation.
// Because DeviceContext is a global singleton, you need to ensure thread
// safety, use the thread_local variable
```
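A minimal sketch of the thread-safety idea described in that comment; the class and member names below are illustrative, not Paddle's actual implementation. Because the device context is a process-wide singleton, keeping the extra attributes in a `thread_local` map means attributes set while launching an op on one thread cannot race with, or leak into, another thread:

```cpp
#include <string>
#include <unordered_map>

class OneDnnContextSketch {
 public:
  void SetDnnAttr(const std::string& name, int value) {
    dnn_attrs_[name] = value;
  }
  int GetDnnAttr(const std::string& name) const {
    auto it = dnn_attrs_.find(name);
    return it == dnn_attrs_.end() ? 0 : it->second;
  }

 private:
  // One map per thread, not one per process, so no locking is needed.
  static thread_local std::unordered_map<std::string, int> dnn_attrs_;
};

thread_local std::unordered_map<std::string, int>
    OneDnnContextSketch::dnn_attrs_;
```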
PR types: New features
PR changes: Others
Describe
Adapting device-specific Extra Attributes for the PHI kernel
After cleaning up most of the Extra parameters in OpMaker, it is necessary to remove the parameters belonging to the Extra category from the PHI kernel parameter lists. This is a necessary step in standardizing the kernel definitions, and it also reduces the cost of understanding for other hardware vendors integrating with the Paddle kernels. For example:
The function signature of the current `conv2d` kernel is as follows (the declaration below is a sketch; the exact parameter list is an assumption based on the PHI `conv_kernel.h` of this period):
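```cpp
// Sketch of the current (pre-PR) PHI conv2d kernel declaration.
template <typename T, typename Context>
void ConvKernel(const Context& dev_ctx,
                const DenseTensor& input,
                const DenseTensor& filter,
                const std::vector<int>& strides,
                const std::vector<int>& paddings,
                const std::string& padding_algorithm,
                int groups,
                const std::vector<int>& dilations,
                const std::string& data_format,
                bool use_addto,
                int workspace_size_MB,
                bool exhaustive_search,
                DenseTensor* out);
```

And the Python API signature of `conv2d` is `paddle.nn.functional.conv2d(x, weight, bias=None, stride=1, padding=0, dilation=1, groups=1, data_format='NCHW', name=None)`.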
We can see that three parameters in the kernel signature have nothing to do with the Python API: `use_addto`, `workspace_size_MB`, and `exhaustive_search` in the sketch above.
In fact, these three parameters are dedicated to the cuDNN kernel of conv2d, but since a kernel can have only one signature, the CPU, GPU, and XPU kernels in Paddle, as well as the NPU and MLU kernels in CustomDevice, all have to carry these three parameters for conv2d. For those hardware backends the parameters are meaningless, which increases the cost of understanding heterogeneous kernel development, so we need to remove such parameters. If you add MKLDNN, conv2d gains even more such parameters, as sketched below.
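A sketch of what the conv2d declaration would look like if the oneDNN extras were also inlined; the set of oneDNN attributes shown is an assumption drawn from the extra-attribute tables in this PR, not the exact list:

```cpp
// Sketch: conv2d with oneDNN extras added to the shared signature (assumed).
template <typename T, typename Context>
void ConvKernel(const Context& dev_ctx,
                const DenseTensor& input,
                const DenseTensor& filter,
                const std::vector<int>& strides,
                const std::vector<int>& paddings,
                const std::string& padding_algorithm,
                int groups,
                const std::vector<int>& dilations,
                const std::string& data_format,
                bool use_addto,
                int workspace_size_MB,
                bool exhaustive_search,
                // oneDNN-only extras begin here
                bool use_mkldnn,
                const std::string& mkldnn_data_type,
                const std::string& fuse_activation,
                float fuse_alpha,
                float fuse_beta,
                bool fuse_residual_connection,
                bool force_fp32_output,
                DenseTensor* out);
```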
We cannot keep adding such parameters to the public kernel signature at the expense of the development experience of the other hardware kernels. Since these parameters are dedicated to a particular piece of hardware or acceleration library, they should be passed in a dedicated way.
However, this removal must keep the current MKLDNN and CUDNN kernels working. Those kernels still need the removed parameters at present, so although the parameters leave the parameter list, they must be passed into the kernel in some other way, and this passing process is subject to several limitations.
Given these limitations, we can only pass them in through the kernel's existing parameters, and at present the most appropriate carrier appears to be the device Context already dedicated to the kernel:
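A minimal sketch of this scheme, assuming the `HasDnnAttr`/`GetDnnAttr` accessors this PR adds to `OneDNNContext`; the attribute name, type, and default shown are illustrative:

```cpp
// Inside the oneDNN conv2d kernel: read an extra attribute from the device
// context instead of taking it as a kernel parameter.
const bool fuse_residual =
    dev_ctx.HasDnnAttr("fuse_residual_connection")
        ? PADDLE_GET_CONST(bool,
                           dev_ctx.GetDnnAttr("fuse_residual_connection"))
        : false;
```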
This PR adopts this scheme to adapt the existing MKLDNN and CUDNN dedicated parameters.
Specifically:

- Add a `dnn_attrs_` member to store special parameters in OneDNNContext and GPUContext
- Remove the cudnn-specific parameters from the `conv2d` kernel; obtain them through dev_ctx in the CUDNN kernel and verify the behavior through unittests
- Remove the mkldnn-specific parameters from the `conv2d` kernel signature; migrate mkldnn's `conv2d` kernel to phi, likewise obtain the mkldnn-specific parameters through dev_ctx, and verify the behavior through unittests

Other changes:
TODO: