Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prior box operator for ssd #6150

Merged
merged 12 commits into from
Jan 23, 2018
Merged

prior box operator for ssd #6150

merged 12 commits into from
Jan 23, 2018

Conversation

wanghaox
Copy link
Contributor

@wanghaox wanghaox commented Dec 1, 2017

resolve #6015

@wanghaox wanghaox requested review from kexinzhao, chengduoZH, qingqing01 and pkuyym and removed request for kexinzhao December 1, 2017 07:32
pkuyym
pkuyym previously requested changes Dec 6, 2017
Copy link
Contributor

@pkuyym pkuyym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can make PriorBoxOp more general to support both Faster RCNN and SSD, please refer the design of TensorFlow. Thanks.


void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(X) of SequenceSliceOp should not be null.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(X) of SequenceSliceOp should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Image"),
"Input(Offset) of SequenceSliceOp should not be null.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

I think we only use the shape of image. Is it necessary to pass the full image ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done
I think to pass the full image is easy for the user to use, and maybe easy to update algo. And the

"The format of input tensor is NCHW.");

auto min_sizes = ctx->Attrs().Get<std::vector<int>>("min_sizes");
auto max_sizes = ctx->Attrs().Get<std::vector<int>>("max_sizes");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any possible to make 'max_sizes' optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add SetDefault({}) to max_sizes at line 126

ctx->Attrs().Get<std::vector<float>>("aspect_ratios");
bool flip = ctx->Attrs().Get<bool>("flip");

PADDLE_ENFORCE_GT(min_sizes.size(), 0, "must provide min_size.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refine this comment and make sure the first character is upper-case.
I think 'Size of min_size must be at least 1.' is more accurate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE_GT(step_w, 0.0, "step_w should be larger than 0.");

const int layer_height = input_dims[3];
const int layer_width = input_dims[2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shape of input is NCHW or NCWH?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

dim_vec[2] = layer_width * layer_height * num_priors * 4;
PADDLE_ENFORCE_GT(dim_vec[2], 0,
"output_dim[2] must larger than 0."
"check your data dims");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, please find the illegal input. For example, layer_width is illegal or layer_height is illegal etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, add PADDLE_ENFORCE(input_dims.size() == 4) and check layer_width or layer_height is smaller than image's.

framework::OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Input",
"(Tensor), "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please give type info like default Tensor<float>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Input",
"(Tensor), "
"the input feature data of PriorBoxOp.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lack shape information and no need to start in a newline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

@wanghaox wanghaox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update code


void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(X) of SequenceSliceOp should not be null.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(X) of SequenceSliceOp should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Image"),
"Input(Offset) of SequenceSliceOp should not be null.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done
I think to pass the full image is easy for the user to use, and maybe easy to update algo. And the

"The format of input tensor is NCHW.");

auto min_sizes = ctx->Attrs().Get<std::vector<int>>("min_sizes");
auto max_sizes = ctx->Attrs().Get<std::vector<int>>("max_sizes");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add SetDefault({}) to max_sizes at line 126

ctx->Attrs().Get<std::vector<float>>("aspect_ratios");
bool flip = ctx->Attrs().Get<bool>("flip");

PADDLE_ENFORCE_GT(min_sizes.size(), 0, "must provide min_size.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE_GT(step_w, 0.0, "step_w should be larger than 0.");

const int layer_height = input_dims[3];
const int layer_width = input_dims[2];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

dim_vec[2] = layer_width * layer_height * num_priors * 4;
PADDLE_ENFORCE_GT(dim_vec[2], 0,
"output_dim[2] must larger than 0."
"check your data dims");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, add PADDLE_ENFORCE(input_dims.size() == 4) and check layer_width or layer_height is smaller than image's.

framework::OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Input",
"(Tensor), "
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Input",
"(Tensor), "
"the input feature data of PriorBoxOp.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@wanghaox wanghaox dismissed pkuyym’s stale review December 13, 2017 08:52

change at line 34

inline void expand_aspect_ratios(const std::vector<float> input_aspect_ratior,
bool flip,
std::vector<float>& output_aspect_ratior) {
constexpr float eps = 1e-6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eps建议epsilon,在重构后的版本中看到好像都用这个词。
另外这个建议做成参数

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddOutput("Out",
"(Tensor, default Tensor<float>), the output prior boxes of "
"PriorBoxOp. The format is [2, layer_height, layer_width, "
"num_priors, 4]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to explain 2, layer_height, layer_width, num_priors, 4. Need more explanation for the layout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddAttr<bool>("clip", "(bool) ", "Whether to clip out-of-boundary boxes.")
.SetDefault(true);
AddAttr<int>("img_w", "").SetDefault(0);
AddAttr<int>("img_h", "").SetDefault(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need attrs of img_w and img_h?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddAttr<std::vector<float>>(
"aspect_ratios", "(vector<float>) ",
"List of aspect ratios of generated prior boxes.")
.SetDefault({});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need .SetDefault({}) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddComment(R"DOC(
Prior box operator
Generate prior boxes for SSD(Single Shot MultiBox Detector) algorithm.
Please get more information from the following papers:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to give more doc for this operator, how to generator box? how to calculate the number of prior boxes, and so on?

The TF doc is:

https://github.com/tensorflow/models/blob/master/research/object_detection/core/anchor_generator.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see any information in this comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

namespace ops = paddle::operators;
REGISTER_OP_CUDA_KERNEL(
prior_box, ops::PriorBoxOpKernel<paddle::platform::CUDAPlace, float>,
ops::PriorBoxOpKernel<paddle::platform::CUDAPlace, double>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not implement the GPU, do not need to register GPU kernel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

const int layer_width = input->dims()[3];
const int layer_height = input->dims()[2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int64_t, since the type in input->dims() is int64_t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

output_tensor = &output_cpu;
} else {
output_tensor = out;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now support multi-device, there is no need to copy tensor from GPU to CPU.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的clip看下能不能调用Eigen的函数,而不是4层for循环。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上,这种赋值,看看下能不能调用Eigen的函数,而不是4层for循环。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里用broadcast即可~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if len(self.variances) == 1:
output[1, h, w, i, j] = self.variances[0]
else:
output[1, h, w, i, j] = self.variances[j]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python单测需要改进,尽量向量操作,不用这么多层的for循环。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

@wanghaox wanghaox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update code

auto image_dims = ctx->GetInputDim("Image");
auto input_dims = ctx->GetInputDim("Input");
PADDLE_ENFORCE(image_dims.size() == 4,
"The format of input tensor is NCHW.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE(image_dims.size() == 4,
"The format of input tensor is NCHW.");
PADDLE_ENFORCE(input_dims.size() == 4,
"The format of input tensor is NCHW.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE_GT(variances[i], 0.0,
"variance[%d] must be greater than 0.", i);
}
} else if (variances.size() == 1) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

std::vector<int64_t> dim_vec(5);
dim_vec[0] = 2;
dim_vec[1] = layer_height;
dim_vec[2] = layer_width;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

dim_vec[3] = num_priors;
dim_vec[4] = 4;
auto output_dim = framework::make_ddim(dim_vec);
ctx->SetOutputDim("Out", output_dim);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddComment(R"DOC(
Prior box operator
Generate prior boxes for SSD(Single Shot MultiBox Detector) algorithm.
Please get more information from the following papers:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if len(self.variances) == 1:
output[1, h, w, i, j] = self.variances[0]
else:
output[1, h, w, i, j] = self.variances[j]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

inline void expand_aspect_ratios(const std::vector<float> input_aspect_ratior,
bool flip,
std::vector<float>& output_aspect_ratior) {
constexpr float eps = 1e-6;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddAttr<std::vector<float>>(
"aspect_ratios", "(vector<float>) ",
"List of aspect ratios of generated prior boxes.")
.SetDefault({});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove .SetDefault({}) too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}

if (variances.size() > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems line 68 should be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

for (int w = 0; w < layer_width; ++w) {
float center_x = (w + offset) * step_width;
float center_y = (h + offset) * step_height;
float box_width, box_height;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template <typename Place, typename T>

If T is double, here also should be double.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

center_x + box_width / 2.) / self.image_w
# ymax
out_boxes[h, w, idx, 3] = (
center_y + box_height / 2.) / self.image_h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这块代码可以简化下,

c_x = (w + self.offset) * self.step_w
c_y = (h + self.offset) * self.step_h
# ...
c_w = c_h = min_size/2.
out_boxes[h, w, idx, :] = [(c_x - c_w)/self.image_w, (c_y - c_h)/self.image_h, ..., ...]  
# ...

下面两处计算相同,可以使代码更短一些。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(X) of PriorBoxOp should not be null.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not Input(X), should be Input(Input).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(X) of PriorBoxOp should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Image"),
"Input(Offset) of PriorBoxOp should not be null.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input(Offset) --> Input(Image)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


auto image_dims = ctx->GetInputDim("Image");
auto input_dims = ctx->GetInputDim("Input");
PADDLE_ENFORCE(image_dims.size() == 4, "The format of image is NCHW.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think The layout of data is NCHW is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

bool flip = ctx->Attrs().Get<bool>("flip");

PADDLE_ENFORCE_GT(min_sizes.size(), 0,
"Size of min_size must be at least 1.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min_size --> min_sizes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

int num_priors = aspect_ratios_vec.size() * min_sizes.size();
if (max_sizes.size() > 0) {
PADDLE_ENFORCE_EQ(max_sizes.size(), min_sizes.size(),
"The length of min_size and max_size must be equal.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length --> number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


auto min_sizes = ctx->Attrs().Get<std::vector<int>>("min_sizes");
auto max_sizes = ctx->Attrs().Get<std::vector<int>>("max_sizes");
auto variances = ctx->Attrs().Get<std::vector<float>>("variances");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be variances optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here variances are needed for output "Variances"

boxes->data<T>(), clip_func);
}

Eigen::Tensor<T, 2, Eigen::RowMajor> var_et(1, variances.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more efficiency to use framework::Tensor whose memory is allocated from internal pool. @qingqing01 Please help to confirm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, should use Fluid's memory (or Tenosr) to allocate auxiliary workspace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

AddOutput("Boxes",
"(Tensor, default Tensor<float>), the output prior boxes of "
"PriorBoxOp. The layout is [layer_height, layer_width, "
"num_priors, 4]. layer_height is the height of input, "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

layer_height -> H,
layer_width -> W

same as below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

auto img_height = image->dims()[2];

auto layer_width = input->dims()[3];
auto layer_height = input->dims()[2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

layer_width -> feature_width or width.

Same as layer_height

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@wanghaox wanghaox merged commit 81be9ce into PaddlePaddle:develop Jan 23, 2018
@wanghaox wanghaox deleted the prior_box branch January 23, 2018 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port PriorBox Operator.
5 participants