Error when enabling MKLDNN for a CPU inference model #25537
Comments
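It looks like the error is raised in the elementwise op: the dimensions of its two inputs do not match. As the error message says, the two shapes must be exactly equal or broadcastable. You can use the fluid.layers.Print() API to print one of the outputs and check whether it is what you expect. |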
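The shape condition described above can be sketched as a small standalone check. This is a minimal illustration that assumes NumPy-style trailing-dimension broadcasting; the elementwise ops in Paddle 1.8 also take an `axis` attribute, which this sketch does not model, and the function name and example shapes are hypothetical.

```python
def is_broadcastable(shape_x, shape_y):
    """Return True if two shapes are equal or NumPy-style broadcastable.

    Assumption: trailing-dimension alignment, as in NumPy. This is not
    Paddle's own shape check, only an illustration of the rule.
    """
    # Walk both shapes from the last dimension backwards; zip stops at the
    # shorter shape, so missing leading dimensions are treated as compatible.
    for dim_x, dim_y in zip(reversed(shape_x), reversed(shape_y)):
        if dim_x != dim_y and dim_x != 1 and dim_y != 1:
            return False
    return True


# Hypothetical shapes: the same tensor described in NCHW and in NHWC order.
print(is_broadcastable((8, 64, 7, 7), (8, 64, 7, 7)))  # identical shapes
print(is_broadcastable((8, 64, 7, 7), (1, 64, 1, 1)))  # per-channel operand
print(is_broadcastable((8, 64, 7, 7), (8, 7, 7, 64)))  # layout mismatch
```

Printing the shapes of both inputs and running them through a check like this is one way to confirm whether the failing op received tensors in different layouts.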
@jczaja Could you help look into this issue? The model, data, and code have already been emailed to @lidanqing-intel |
@luotao1 Please forward model/codes to me as @lidanqing-intel is out of office |
@lidanqing-intel Could this issue be fixed in 1.8.5 (next month)? |
It is related to the fact that some tensors have data_layout set to NHWC, while the model is NCHW. Status: investigating. |
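To illustrate why a layout tag that disagrees with the model matters, here is a minimal sketch of how the same 4-D activation is described under the two layouts. The helper names and the example shape are hypothetical and are not taken from the model in this issue.

```python
def nchw_to_nhwc(shape):
    """Reorder an (N, C, H, W) shape tuple into (N, H, W, C)."""
    n, c, h, w = shape
    return (n, h, w, c)


def nhwc_to_nchw(shape):
    """Reorder an (N, H, W, C) shape tuple into (N, C, H, W)."""
    n, h, w, c = shape
    return (n, c, h, w)


# Hypothetical activation: batch 8, 64 channels, 7x7 spatial size.
nchw_shape = (8, 64, 7, 7)
nhwc_shape = nchw_to_nhwc(nchw_shape)

print(nhwc_shape)                # (8, 7, 7, 64)
print(nhwc_to_nchw(nhwc_shape))  # (8, 64, 7, 7)
```

If one tensor is treated as NHWC while its partner in an elementwise op is still NCHW, the two shape tuples differ in every non-batch position, which matches the kind of shape mismatch reported here.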
@luotao1 I'm sorry for the late response; I was away from the office. I have just resumed investigating this problem and we will do our best to have it fixed. |
@luotao1 I would like to share some findings and ask for advice. The crash happens because this model contains pool2d ops, and those ops have an attribute, data_format, which indicates whether the model works on NCHW or NHWC data. In the past it was agreed that either all data_format attributes are set to NCHW or all are set to NHWC; there should be no scenario where some operators work on NCHW data and others on NHWC data. The problem is that the pool2d ops in this model have different data_format values. For example, pool2d (id=139) has NHWC: PaddlePaddle oneDNN integration does not support the situation where some operators work on NCHW and others on NHWC. Is it intentional that two different pool2d ops use different data_format values? |
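The consistency rule described above can be sketched as a check over a list of operator descriptions. This is a minimal illustration on plain dictionaries: the `ops` list, its field names, and every id other than 139 are hypothetical. In a real Paddle program the same information would presumably be gathered from the program's operators (for example via each operator's type and its data_format attribute), which is an assumption here rather than something shown in this thread.

```python
from collections import defaultdict


def group_by_data_format(ops, op_type="pool2d"):
    """Map each data_format value to the ids of the ops of op_type using it."""
    groups = defaultdict(list)
    for op in ops:
        if op["type"] == op_type:
            # Ops without the attribute are grouped under "unspecified".
            fmt = op["attrs"].get("data_format", "unspecified")
            groups[fmt].append(op["id"])
    return dict(groups)


def has_mixed_data_format(ops, op_type="pool2d"):
    """Return True if ops of op_type disagree on data_format."""
    return len(group_by_data_format(ops, op_type)) > 1


# Hypothetical operator list; only pool2d (id=139) with NHWC comes from the issue.
ops = [
    {"id": 52, "type": "pool2d", "attrs": {"data_format": "NCHW"}},
    {"id": 60, "type": "matmul", "attrs": {}},
    {"id": 139, "type": "pool2d", "attrs": {"data_format": "NHWC"}},
]

print(group_by_data_format(ops))   # {'NCHW': [52], 'NHWC': [139]}
print(has_mixed_data_format(ops))  # True
```

Running such a check before inference would flag a model like this one before the mismatch surfaces as a shape error in a later op.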
Discussed with @phlrain: it is not reasonable for two different pool2d ops to use different data_format values. @phlrain and @wanghuancoder will look at the training logic first. Thanks to @jczaja for the analysis! |
@jczaja @lidanqing-intel We made the two pool2d ops use the same data_format and trained a new model for inference. |
@luotao1 I have a question regarding this situation. The model is to be executed using NHWC input data as the pool2d ops are having |
Yes, it is. matmul should not have a data_format attribute, and its implementation will work properly regardless of the data_format used. |
@OliverLPH Could you help test it? |
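Version and environment information:
1) PaddlePaddle version: 1.8.1
2) CPU: Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz, MKLDNN
3) GPU: none
4) System environment: CentOS release 6.3 (Final), Python 2.7.15
Reproduction information:
Reported internally at Baidu. For the reproduction code and model, please contact guguiyuan on Hi.
Problem description:
A model that uses the pool2d op raises an error when MKLDNN is enabled; models that do not use it run normally.
The error message is as follows: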