Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance ops to support LoD as input for dygraph detection models. #25316

Merged
merged 10 commits into from
Sep 8, 2020

Conversation

jerrywgz
Copy link
Contributor

@jerrywgz jerrywgz commented Jul 1, 2020

PR types

Function optimization

PR changes

OPs

Describe

Enhance ops related to object detection to support LoD as input. Add RoIsNum for input and output which are dispensable and compatible with inference of previous Paddle version. These new input and output are only used in dygraph detection models which are not released yet.

These ops add new input and output and faster_rcnn_r50_fpn_1x has been verified that the latest paddle version could use the inference model from previous paddle version correctly.

预测兼容性测试:

测试方法:
在PaddleDetection中使用 export_model.py 导出模型,使用deploy/python/infer.py进行预测,对比检测结果。测试模型为faster_rcnn_r50_fpn_1x 其中包含了所有改动涉及到的op

case1:
使用paddle1.8.4版本保存模型,在1.8.4版本和develop版本分别预测
1.8.4版本预测结果:
image
develop版本预测结果:
image

case2:
使用develop版本保存模型,在1.8.4版本和develop版本分别预测
1.8.4版本预测结果:
image
develop版本预测结果:
image

测试结论:
新增输入输出是兼容老版本的

@paddle-bot-old
Copy link

paddle-bot-old bot commented Jul 1, 2020

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@jerrywgz jerrywgz changed the title enhance collect_op for dygraph, test=develop enhance collect_fpn_proposals_op for dygraph, test=develop Jul 1, 2020
@jerrywgz jerrywgz changed the title enhance collect_fpn_proposals_op for dygraph, test=develop enhance ops for dygraph in FPN models, test=develop Jul 2, 2020
AddInput(
"MultiLevelNums",
"(Tensor) Multiple RoIs number of each image from each level in shape"
"(N), N is the number of images.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments不通顺。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

const int* cur_rois_num = multi_rois_num[i]->data<int>();
for (int k = 0; k < multi_rois_num[i]->numel(); k++) {
all_rois += cur_rois_num[k];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you also can use std::accumulate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thx

AddOutput("MultiFpnRois", "(LoDTensor) Output with distribute operator")
.AsDuplicable();
AddOutput("RestoreIndex",
"(Tensor) An array of positive number which is "
"used to restore the order of FpnRois");
AddOutput("MultiRoisNum",
"(Tensor) Multiple number of RoIs from each level in shape (B),"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这是一个List of Tensor吧,而不是一个Tensor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

std::vector<framework::DDim> outs_num_dims;
for (size_t i = 0; i < num_out_rois; ++i) {
framework::DDim out_num_dim = {-1};
outs_num_dims.push_back(out_num_dim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

55行可以去掉, 56行: push_back({-1});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -28,6 +28,21 @@ namespace operators {

const int kBoxDim = 4;

inline std::vector<size_t> get_lod_from_rois_num(const Tensor* rois_num) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thx

@@ -3562,6 +3564,12 @@ def distribute_fpn_proposals(fpn_rois,
name(str, optional): For detailed information, please refer
to :ref:`api_guide_Name`. Usually name is no need to set and
None by default.
rois_num(Variable): 1-D Tensor with shape [B] and data type is int32.
B is the number os images. The number of RoIs in each image.
return_rois_num(bool): When setting True, it will return a list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

当输入包含rois_num时,是否一定会返回,如果是,可以去掉这个bool控制

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@@ -3574,6 +3582,10 @@ def distribute_fpn_proposals(fpn_rois,
the number of total rois. The data type is int32. It is
used to restore the order of fpn_rois.

multi_rois_num(List): A list of 1-D Tensor with shape [B]
and data type of int32. B is the number of images. The number of RoIs
in each image from each level.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

最后一句没有谓语

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -3720,14 +3745,19 @@ def collect_fpn_proposals(multi_rois,
post_nms_top_n(int): The number of selected RoIs
name(str, optional): For detailed information, please refer
to :ref:`api_guide_Name`. Usually name is no need to set and
None by default.
None by default.
multi_rois_num(list, optional): List of the number of RoIs in each image from each level. Element in list is 1-D Tensor with shape [B] and data type is int32, B is the number of images. Default: None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同样,第一句无法不通顺,没有谓语。 multi_rois_num 这个名字觉得不是很直观

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

修改为rois_num_per_level

qingqing01
qingqing01 previously approved these changes Aug 17, 2020
@jerrywgz jerrywgz force-pushed the enhance_fpn_ops branch 2 times, most recently from 213096f to 8df7a50 Compare August 19, 2020 11:07
@qingqing01 qingqing01 changed the title enhance ops for dygraph in FPN models, test=develop Enhance ops to support LoD as input for dygraph detection models. Aug 20, 2020
qingqing01
qingqing01 previously approved these changes Aug 20, 2020
@jerrywgz
Copy link
Contributor Author

image
image
image
image
image
image
image
image
image
image
image
image

@@ -481,7 +491,8 @@ class GenerateProposalsOpMaker : public framework::OpProtoAndCheckerMaker {
"(LoDTensor), Output proposals with shape (rois_num, 4).");
AddOutput("RpnRoiProbs",
"(LoDTensor) Scores of proposals with shape (rois_num, 1).");
AddOutput("RpnRoisLod", "(Tensor), rpn rois's lod info").AsDispensable();
AddOutput("RpnRoisNum", "(Tensor), The number of Rpn RoIs in each image")
.AsDispensable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么将输出变量RpnRoisLod删除,并新增RpnRoisNum呢?
为了保证新版本的Paddle预测库能成功加载旧版本训练的模型,当前要求op的输入输出只能做兼容性修改,不能删除已有的输入输出~
可参考:https://github.com/PaddlePaddle/Paddle/wiki/OP-Input-Output-Attribute-Compatibility-Modification

Copy link
Contributor Author

@jerrywgz jerrywgz Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

关于删除RoisLod相关输入输出问题统一回复:为多个相关api, op的接口保持统一,而且实际输入的值的含义并不是lod而是num。修改后的op主要是为动态图(不支持lod的地方用的),目前release的模型都是静态图版本。同时这些op都是检测模型专有op,其他场景也基本不会用到。

"(Tensor), "
"The lod info of rois.")
"The number of RoIs in each image.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上,会引入预测库不兼容风险

@@ -140,7 +141,8 @@ class ROIPoolOpMaker : public framework::OpProtoAndCheckerMaker {
"Where batch_id is the id of the data, "
"(x1, y1) is the top left coordinates, and "
"(x2, y2) is the bottom right coordinates.");
AddInput("RoisLod", "(Tensor), The lod info of rois.").AsDispensable();
AddInput("RoisNum", "(Tensor), The number of RoIs in each image.")
.AsDispensable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上,会引入预测库不兼容风险

for i in range(num_lvl)
]
outputs['MultiLevelRoIsNum'] = rois_num_per_level

helper.append_op(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call core.ops.distribute_fpn_proposals here for better performance in dygraph mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thx

zhiqiu
zhiqiu previously approved these changes Sep 1, 2020
Copy link
Contributor

@zhiqiu zhiqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for op_function_generator.cc

num_lvl = max_level - min_level + 1

if in_dygraph_mode():
assert rois_num is not None, "rois_num should not be None in dygraph mode."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding the reason why rois_num should not be None in error message.
Otherwise, users may be confused when seeing the error.

Not urgent, you can refine it in the next PR. Same for other APIs.

Heeenrrry
Heeenrrry previously approved these changes Sep 1, 2020
Copy link
Contributor

@Heeenrrry Heeenrrry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jerrywgz jerrywgz dismissed stale reviews from Heeenrrry and zhiqiu via ce62886 September 2, 2020 11:24
Copy link
Contributor

@Superjomn Superjomn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhiqiu zhiqiu self-requested a review September 7, 2020 02:49
Copy link
Contributor

@zhiqiu zhiqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Heeenrrry Heeenrrry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants