
feature/trt engine op test #11182

Merged: 10 commits merged into PaddlePaddle:develop on Jun 6, 2018

Conversation


@Superjomn (Contributor) commented on Jun 5, 2018:

  • pass a BlockDesc to tensorrt_engine_op
  • execute the whole phase

NEXT STEP:

Write a tool that executes a larger model and outputs benchmark results.

@@ -14,7 +14,7 @@

 #pragma once

-#ifdef PADDLE_WITH_CUDA
+#if PADDLE_WITH_CUDA
Contributor:

This should be `#ifdef` here.

#include "paddle/fluid/inference/tensorrt/convert/op_converter.h"
#include "paddle/fluid/inference/tensorrt/convert/ut_helper.h"

USE_CPU_ONLY_OP(tensorrt_engine);
Contributor:

Why is tensorrt_engine_op CPU ONLY? Shouldn't it be GPU ONLY?

Contributor (Author):

The tensorrt engine op's kernel logic runs on the CPU; from there it triggers work on the GPU.

@@ -34,12 +35,15 @@ class OpConverter {

   // Converter logic for an op.
   virtual void operator()(const framework::proto::OpDesc& op,
-                          const framework::Scope& scope) {}
+                          const framework::Scope& scope,
+                          bool test_mode = false) {}

// Convert a single fluid operaotr and add the corresponding layer to TRT.
Contributor:

operaotr → operator

@@ -37,12 +36,18 @@ class MulOpConverter : public OpConverter {
     engine_, MatrixMultiply, *const_cast<nvinfer1::ITensor*>(input1), false,
     *const_cast<nvinfer1::ITensor*>(input2), false);

-  engine_->DeclareOutput(layer, 0, op_desc.Output("Out")[0]);
+  auto output_name = op_desc.Output("Out")[0];
+  engine_->SetITensor(output_name, layer->getOutput(0));
Contributor:

  • Why doesn't the unit test (test_mode) know the output name? Can this be removed in a later PR?
  • If it is only needed for unit tests, could it be renamed unittest_mode? Otherwise it reads as an inference-phase flag, and since TRT only has a forward pass, that is confusing.

Contributor (Author):

This avoids hard-coding the output name.

@luotao1 (Contributor) left a comment:

LGTM

@Superjomn merged commit 4f95bc9 into PaddlePaddle:develop on Jun 6, 2018
@Superjomn deleted the feature/trt_engine_op_test branch on Jun 6, 2018