
Enhance recurrent layer performance #6512

Closed
perchbird opened this issue Dec 12, 2017 · 3 comments · Fixed by #6719

Comments

@perchbird

PaddlePaddle's RecurrentLayer computes each time step by calling cblas_sgemm from the MKL library. Internally, cblas_sgemm packs its inputs into a format suited to the MKL engine before running the actual computation. In the RNN case, every time step within a single forward pass shares the same weight, so there is no need to re-pack the weight before each compute call. Intel's optimization therefore calls the functions underlying cblas_sgemm directly: the weight is packed once before the computation begins, and each time step then reuses the same packed weight, avoiding the relatively expensive repeated pack operation.
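For reference, the MKL library exposes this split directly through cblas_sgemm_alloc, cblas_sgemm_pack, cblas_sgemm_compute, and cblas_sgemm_free. The C sketch below is only an illustration of the scheme, not Paddle code; the function name, shapes, and the simplified recurrence h_t += h_{t-1} * W are assumptions, and the activation and input projection are omitted:

```c
#include <mkl.h>

/* Minimal sketch of the proposed optimization: pack the shared
 * recurrent weight once, then reuse the packed buffer at every
 * time step instead of letting cblas_sgemm re-pack it each call. */
void recurrent_forward(const float *w, /* hidden x hidden shared weight      */
                       float *h,       /* steps x batch x hidden state; h[0]
                                          and input projections pre-filled   */
                       int steps, int batch, int hidden) {
  /* Pack the weight (the B matrix of the GEMM) exactly once. */
  float *packed_w = cblas_sgemm_alloc(CblasBMatrix, batch, hidden, hidden);
  cblas_sgemm_pack(CblasRowMajor, CblasBMatrix, CblasNoTrans,
                   batch, hidden, hidden, 1.0f, w, hidden, packed_w);

  for (int t = 1; t < steps; ++t) {
    const float *h_prev = h + (t - 1) * batch * hidden;
    float *h_t = h + t * batch * hidden;
    /* h_t += h_{t-1} * W, reusing the packed weight: no per-step pack. */
    cblas_sgemm_compute(CblasRowMajor, CblasNoTrans, CblasPacked,
                        batch, hidden, hidden,
                        h_prev, hidden, packed_w, hidden,
                        1.0f, h_t, hidden);
  }
  cblas_sgemm_free(packed_w);
}
```

Note that alpha is applied at pack time, which is why cblas_sgemm_compute takes only beta; beta = 1 here accumulates the recurrent term onto the pre-filled input projection.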

For merging, we can either modify the existing RecurrentLayer directly or create a new layer as an MKLDNNLayer. However, since this optimization does not actually use the mkldnn library, that name would be somewhat misleading. Which approach is preferable?

@luotao1
Contributor

luotao1 commented Dec 12, 2017

@hedaoyuan suggests: if the optimized code has little in common with the existing code, it can be implemented as a separate Layer, just as convolution has both ExpandConvLayer and CudnnConvLayer.

> However, since this optimization does not actually use the mkldnn library, that name would be somewhat misleading.

How about naming it MKLRecurrentLayer?

@yao-matrix

@luotao1 From our perspective, both options are fine; it mainly depends on which one better matches PaddlePaddle's design philosophy.
Also, caffe2's approach may serve as a reference: https://github.com/caffe2/caffe2/blob/master/caffe2/mkl/operators/packed_fc_op.cc
Their idea is essentially the same as Daoyuan's.

@luotao1
Contributor

luotao1 commented Dec 12, 2017

The design document for the MKLDNN integration is here; could the RNN optimization plan also be written up as a design document?
