
Refine MKLDNNLayer and MKLDNNTester #4011

Merged
merged 4 commits into PaddlePaddle:develop from the refine branch on Sep 12, 2017

Conversation

tensor-tang
Contributor

The additions for #4008:

  1. remove copyOutputInfoToOtherDevice
  2. move forward and backward into MKLDNNLayer
  3. add reshapeInput() and reshapeOutput() in MKLDNNLayer and use them in the child layers (see the sketch after this list)
  4. add an UpdateCallback for MKLDNNTester, since the weights were not updated before
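As a rough illustration of items 2 and 3, here is a hedged sketch of what the shared helpers in MKLDNNLayer might look like; the bodies below are assembled from snippets quoted later in this review and are illustrative, not the merged code.

// Sketch only: shape handling shared by MKLDNNLayer so that child layers
// (fc, conv, ...) no longer duplicate it.
void MKLDNNLayer::reshapeInput(int& batchsize, int& height, int& width) {
  const Argument& input = inputLayers_[0]->getOutput();
  batchsize = input.getBatchSize();
  // keep the previous height/width when the input carries no frame size
  if (input.getFrameHeight() > 0) height = input.getFrameHeight();
  if (input.getFrameWidth() > 0) width = input.getFrameWidth();
}

void MKLDNNLayer::reshapeOutput(size_t height, size_t width) {
  output_.setFrameHeight(height);
  output_.setFrameWidth(width);
  // the child layer's own reshape() still decides oc_/oh_/ow_
}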

if (biases_ && biases_->getWGrad()) {
biases_->getParameterPtr()->incUpdate(callback);
}
void MKLDNNFcLayer::updateWeights(const UpdateCallback& callback) {
Contributor

The implementation of updateWeights looks the same for every MKLDNNXXXLayer, so there should be no need for each subclass to re-implement it.

Contributor Author

The consideration here is that not every MKLDNNLayer necessarily has weights, e.g. pooling.
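To make that concrete, a hedged sketch (not the merged code) of why updateWeights stays in the subclasses: a layer with parameters such as MKLDNNFcLayer forwards the callback to its weight and bias, while a layer without weights simply would not need the override.

// Sketch: fc has parameters to update, so it overrides updateWeights.
void MKLDNNFcLayer::updateWeights(const UpdateCallback& callback) {
  weight_->getParameterPtr()->incUpdate(callback);
  if (biases_ && biases_->getWGrad()) {
    biases_->getParameterPtr()->incUpdate(callback);
  }
}

// A hypothetical pooling layer (name illustrative) has no weights at all,
// so it would either skip the override or keep it empty.
void MKLDNNPoolLayer::updateWeights(const UpdateCallback& /*callback*/) {}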

REGISTER_TIMER_INFO("FwActTimer", getName().c_str());
forwardActivation();
}
inVal_->setData(getInputValue(0, CPU_DEVICE)->getData());
Contributor

inVal_ is already defined in MKLDNNLayer.h, so line 227 could also be moved into the parent class.

Contributor Author

It is not easy to move this into the parent class: different subclasses do not necessarily use this approach, so each subclass is left responsible for it itself. It just happens that fc uses this variable here; conv, for example, would use a different one.
That is also why the parent class only provides an empty implementation, considering that some future layers may not need to override this function at all; that decision is left to each subclass.
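A hedged sketch of the arrangement described above, put together from the snippets quoted in this thread: the parent class exposes an empty virtual updateInputData(), and only subclasses that need it (fc, via inVal_) override it.

// MKLDNNLayer.h (sketch): default is a no-op, so layers that do not need the
// hook simply do not override it.
virtual void updateInputData() {}

// MKLDNNFcLayer.cpp (sketch): fc refreshes inVal_ because the CPU input
// buffer address may change between iterations when the input is a data layer.
void MKLDNNFcLayer::updateInputData() {
  inVal_->setData(getInputValue(0, CPU_DEVICE)->getData());
}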

{
REGISTER_TIMER_INFO("mkldnn_FwdTimer", getName().c_str());
copySeqInfoToOutputs();
CHECK(!inputLayers_.empty());
Contributor

Line 114 should run after the check on line 115. Otherwise, if inputLayers_ is empty, line 114 will core dump right away.

Contributor Author

The original reasoning was that the function on line 114 tolerates an empty inputLayers_ (it simply returns), so it will not core dump; only the code after line 115 must not see it empty, hence this order.

That said, the forward function as a whole does expect the input not to be empty, so moving the check to the front is also fine.

Contributor

I see that copySeqInfoToOutputs itself already checks whether inputLayers_ is empty. Since forward as a whole cannot accept an empty input, line 115 should be moved before line 114, and the empty-input check inside copySeqInfoToOutputs can then be dropped, keeping only the needSequenceInfo_ check.

Contributor Author

We can move the CHECK to the front, but I would not recommend removing the check inside the function, since that is the function's own guarantee.
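Summing up the agreed ordering, a hedged sketch of the forward entry with the CHECK moved to the front while copySeqInfoToOutputs keeps its own guard; the surrounding code is assumed from the snippet quoted above, not copied from the merged diff.

void MKLDNNLayer::forward(PassType passType) {
  {
    REGISTER_TIMER_INFO("mkldnn_FwdTimer", getName().c_str());
    CHECK(!inputLayers_.empty());  // fail fast: forward as a whole needs an input
    copySeqInfoToOutputs();        // keeps its own needSequenceInfo_ / empty-input guard
    // ... rest of forward unchanged
  }
}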

CHECK(!inputLayers_.empty());
size_t elemenCnt = inputLayers_[0]->getOutput().value->getElementCnt();
if (inputElemenCnt_ != elemenCnt) {
inputElemenCnt_ = elemenCnt;
Contributor

I do not quite follow this part, specifically the inputElemenCnt_ variable. Why use the element count of the value matrix to control lines 119-122?
If this is done for reshape, resetFwd and convertWeightsFromPaddle because the previous layer is not an MKLDNN layer, could some other variable be used instead?

Contributor Author

"If this is done for reshape, resetFwd and convertWeightsFromPaddle because the previous layer is not an MKLDNN layer?"

The consideration here has nothing to do with whether the previous layer is MKLDNN or not; these steps are needed whenever the previous layer's output size changes.

Before this change, they were triggered only by a change in batch size. Now the trigger is the input size, which subsumes the batch size: conv allows the image size to change, and that change does not show up in the batch size, so overall it is enough to watch for a change in the input size inputElemenCnt_.

Contributor

Please put this explanation into a code comment.

Contributor Author

OK~
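For reference, a hedged sketch of the guard being discussed together with the kind of code comment requested above; the variable names follow the quoted diff, the rest is illustrative.

// Reshape, resetFwd and convertWeightsFromPaddle are needed only when the
// total input size changes. The element count subsumes the batch size: for
// conv, an image-size change alters the element count even when the batch
// size stays the same.
size_t elemenCnt = inputLayers_[0]->getOutput().value->getElementCnt();
if (inputElemenCnt_ != elemenCnt) {
  inputElemenCnt_ = elemenCnt;
  reshape();
  resetFwd();
  convertWeightsFromPaddle();
}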

* Update input value data when input layer is "data" type.
* Since the input value data address might be changed.
*/
virtual void updateInputData() {}
Contributor

reshape, resetFwd, resetBwd and updateInputData had better take parameters:

  • That way the reader can tell what each function consumes and modifies without looking at the implementation. Right now they are all empty, which makes the code hard to follow; the meaning of each parameter can also be documented in the comments.
  • Since the subclasses will implement resetFwd, resetBwd and so on, fixing the interface in the parent class will make future subclasses more consistent.

Contributor Author

I agree with this point. The original hesitation was that quite a few parameters would be needed and they are all already defined in the parent class, so spelling them out as parameters felt redundant, e.g. reshape(int &bs, int &ih, int &iw, int &oh, int &ow).

Still, it is worth adding the parameters that are necessary and self-explanatory; I will update it shortly and then you can review again.
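A hedged sketch of what the parameterized hooks could look like after the change agreed on here; the reshape signature is the one quoted just above, the other parameter lists are placeholders, not the merged API.

// Sketch: giving the hooks explicit parameters makes the data flow visible
// from the declarations alone and keeps future subclasses consistent.
virtual void reshape(int& bs, int& ih, int& iw, int& oh, int& ow) = 0;  // shapes in/out
virtual void resetFwd() = 0;       // could take the forward pipeline and value buffers
virtual void resetBwd() = 0;       // could take the backward pipeline and grad buffers
virtual void updateInputData() {}  // stays optional with a no-op default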

CHECK(dnnLayer_);
// for comparison with Paddle reference results,
// need manually add cpu device output for test
dnnLayer_->addOutputArgument(-1);
Contributor

Use CPU_DEVICE instead of -1.

@@ -109,20 +113,21 @@ void MKLDNNTester::randomBotDatas() {

void MKLDNNTester::randomTopDiffs() {
refLayer_->getOutputGrad()->randomizeUniform();
dnnLayer_->getOutputGrad()->copyFrom(*(refLayer_->getOutputGrad()));
VLOG(lvl_) << "Random dom Backward Input, TopDiff: ";
dnnLayer_->getOutput(-1).grad->copyFrom(*(refLayer_->getOutputGrad()));
Contributor

Use CPU_DEVICE instead of -1.

Contributor Author

That would require including layer.h; it is not defined here.

Contributor

Then include it; hard-coding -1 is not good for future maintenance.

Contributor Author

OK~
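A small hedged sketch of the resulting change; the include line is an assumption (the review only says CPU_DEVICE lives in layer.h), and the substitution of the hard-coded -1 is the point.

#include "Layer.h"  // assumed include; this is where CPU_DEVICE is defined

// before: a magic number that is easy to break if the convention ever changes
dnnLayer_->addOutputArgument(-1);
// after: self-documenting and maintained in one place
dnnLayer_->addOutputArgument(CPU_DEVICE);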

VLOG(MKLDNN_ALL) << "Check Forward";
printTopDatas();
double delta = compareMatrix(dnnLayer_->getOutput(-1).value,
Contributor

Use CPU_DEVICE instead of -1.

Contributor Author

Same as above.

const VectorPtr& grad = parameters_[n][i]->getBuf(PARAMETER_GRADIENT);
if (grad) {
grad->zeroMem();
if (id == n || id == parameters_.size()) {
Contributor

I do not quite understand why line 198, i.e. clearWgtDiffs(id), was added. Was there any problem with clearing everything as before?

Contributor Author

Oh, it is not a problem with the original code.
This just adds a parameter so the clearing can be done selectively, and unifies the function interface.
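A hedged sketch of the selective clear being described, built around the quoted snippet: passing a specific id clears only that layer's parameter gradients, while passing parameters_.size() (a natural default) clears everything, so the old behaviour is still covered by the same interface.

// Sketch: id picks one parameter group; id == parameters_.size() means "all".
void MKLDNNTester::clearWgtDiffs(size_t id) {
  CHECK_LE(id, parameters_.size());
  for (size_t n = 0; n < parameters_.size(); ++n) {
    if (id == n || id == parameters_.size()) {
      for (size_t i = 0; i < parameters_[n].size(); ++i) {
        const VectorPtr& grad = parameters_[n][i]->getBuf(PARAMETER_GRADIENT);
        if (grad) {
          grad->zeroMem();
        }
      }
    }
  }
}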

@luotao1
Contributor
luotao1 commented Sep 11, 2017

Does the current implementation assume that each MKLDNNXXXLayer has only one input layer?

Contributor Author
@tensor-tang tensor-tang left a comment

Yes, for now we only consider the single-input case.

@tensor-tang
Contributor Author

Done

@tensor-tang tensor-tang force-pushed the refine branch 2 times, most recently from e980a6d to 7f7fa32 on September 12, 2017 02:15
Contributor
@luotao1 luotao1 left a comment

LGTM

@luotao1 luotao1 merged commit 171fee2 into PaddlePaddle:develop Sep 12, 2017
@tensor-tang tensor-tang deleted the refine branch September 12, 2017 02:57