[Hackthon_4th 177] Support PP-YOLOE-R with BM1684 #1809

Merged: 22 commits, Apr 21, 2023

thunder95
Contributor

PR types

Model

Description

Deploy PP-YOLOE-R on the Sophgo BM1684.

@thunder95
Contributor Author

@DefTruth Could you please review this and advise?
So far I have only verified accuracy and speed on GPU, and accuracy on sophon:
labels diff, max: 0, min: 0, mean: 0
scores diff, max: 0.000332654, min: 1.07288e-06, mean: 4.85186e-05
boxes diff, max: 0.107788, min: 0, mean: 0.00350611
However, the benchmark on the device reports 0 ms.
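
The diff lines in this comment report the maximum, minimum, and mean of element-wise differences between two sets of results. A minimal sketch of how such statistics could be computed is given here. `DiffStats` and `ComputeDiffStats` are hypothetical names, not code from this PR, and the sketch assumes the two result vectors are already aligned element by element and that the difference is taken as an absolute value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical container for the three numbers printed per diff line.
struct DiffStats {
  double max;
  double min;
  double mean;
};

// Hypothetical helper: absolute element-wise difference statistics.
// Assumes both inputs have the same, non-zero length.
DiffStats ComputeDiffStats(const std::vector<float>& a,
                           const std::vector<float>& b) {
  assert(a.size() == b.size() && !a.empty());
  std::vector<double> diff(a.size());
  std::transform(a.begin(), a.end(), b.begin(), diff.begin(),
                 [](float x, float y) {
                   return std::abs(static_cast<double>(x) -
                                   static_cast<double>(y));
                 });
  const auto min_max = std::minmax_element(diff.begin(), diff.end());
  const double sum = std::accumulate(diff.begin(), diff.end(), 0.0);
  return {*min_max.second, *min_max.first,
          sum / static_cast<double>(diff.size())};
}
```

Accumulating in `double` keeps the mean stable when many small `float` differences are summed.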

@DefTruth
Collaborator

LGTM~

@DefTruth
Collaborator

DefTruth commented Apr 13, 2023

@thunder95 Please provide a visualization of the prediction results. You can paste the image directly into the PR comments.

@DefTruth DefTruth self-requested a review April 13, 2023 02:32
@thunder95
Contributor Author

(image: infer_sophgo)

@thunder95
Contributor Author

@DefTruth The image has been posted.

@DefTruth DefTruth changed the title from "[hackthon_4th 177]将PP-YOLOE-R在算能BM1684部署" to "[Hackthon_4th 177] Support PP-YOLOE-R with BM1684" on Apr 13, 2023
res.emplace_back(std::stof(s.substr(pos1)));
}

void showDiffStats(const std::vector<float>& data, const std::string& title) {
Collaborator

Function names should start with an uppercase letter.

Contributor Author

Done.

}
double sum = accumulate(begin(data), end(data), 0.0);
double mean = sum / data.size();
max = *max_element(data.begin(), data.end());
Collaborator

Qualify all standard library calls with std::, and use C++-style casts (static_cast) for type conversions.

Contributor Author

Done.

<< ", mean: " << average << std::endl;
}

void sortBoxes(vision::DetectionResult* result, std::vector<int>* indices) {
Collaborator

Same as above.

Contributor Author

Done.
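
The body of the sorting helper is not shown in this thread, so its sort key is unknown. As a hedged illustration only, one common way to bring two detection results into a comparable order is to compute an index permutation by descending score and apply it to boxes, scores, and labels alike. `ArgsortByScoreDescending` is a hypothetical name, and sorting by score is an assumption rather than what the PR necessarily does.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns indices that order `scores` from highest
// to lowest. std::stable_sort keeps the original order for equal scores.
std::vector<int> ArgsortByScoreDescending(const std::vector<float>& scores) {
  std::vector<int> indices(scores.size());
  std::iota(indices.begin(), indices.end(), 0);
  std::stable_sort(indices.begin(), indices.end(),
                   [&scores](int lhs, int rhs) {
                     return scores[static_cast<std::size_t>(lhs)] >
                            scores[static_cast<std::size_t>(rhs)];
                   });
  return indices;
}
```

Returning indices instead of sorting in place lets the same permutation be reused for every per-box field of a result.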

auto params_file = FLAGS_model + sep + params_name;
auto config_file = FLAGS_model + sep + config_name;

auto model_ppyoloe_r = vision::detection::PPYOLOE_R(
Collaborator

C++ class names should not use underscores as separators. Please rename this to PPYOLOER.

Contributor Author

Done.

model_file, params_file, config_file, option, model_format);

vision::DetectionResult res;
if (config_info["precision_compare"] == "true") {
Collaborator

This accuracy-verification logic can be removed or left empty for now. FD will add accuracy-checking logic in a unified way later. Describing the accuracy-verification code in the PR is enough.

Contributor Author

Done.

@@ -264,6 +264,57 @@ bool PaddleDetPostprocessor::ProcessSolov2(
return true;
}

bool PaddleDetPostprocessor::ProcessPPYOLOE_R(
Collaborator

Function name: ProcessPPYOLOE_R -> ProcessPPYOLOER

Contributor Author

Done.

@@ -91,6 +96,10 @@ class FASTDEPLOY_DECL PaddleDetPostprocessor {
bool ProcessSolov2(const std::vector<FDTensor>& tensors,
std::vector<DetectionResult>* results);

// Process PPYOLOE_R
Collaborator

Following SetNMSOption, a SetNMSRotatedOption method needs to be added.

Contributor Author

Done.

@@ -125,6 +164,45 @@ cv::Mat VisDetection(const cv::Mat& im, const DetectionResult& result,
int h = im.rows;
int w = im.cols;
auto vis_im = im.clone();
for (size_t i = 0; i < result.rotated_boxes.size(); ++i) {
printf("result score: %f, %f\n", result.scores[i], score_threshold);
Collaborator

Please remove this log statement.

Contributor Author

Done.

@@ -38,6 +38,45 @@ cv::Mat VisDetection(const cv::Mat& im, const DetectionResult& result,
int h = im.rows;
int w = im.cols;
auto vis_im = im.clone();
for (size_t i = 0; i < result.rotated_boxes.size(); ++i) {
Collaborator

This logic looks identical to the block added further down. Is there any difference in what the two handle?

Contributor Author

@DefTruth The logic is the same. Following how boxes are handled, I simply copied the logic into two separate functions. The function below additionally accepts custom labels via const std::vector<std::string>& labels.

Collaborator

OK.

@@ -781,3 +781,40 @@ def __init__(self,
config_file, self._runtime_option,
model_format)
assert self.initialized, "GFL model initialize failed."


class PPYOLOE_R(PPYOLOE):
Collaborator

Class name: PPYOLOE_R -> PPYOLOER

Contributor Author

Done.

@DefTruth
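Collaborator

DefTruth commented Apr 13, 2023

Regarding the benchmark time of 0: FD does not yet support benchmarking for the sophgo backend. You can follow the approach used in the Paddle Lite backend and add the inference benchmark macros to wrap the inference block.

RUNTIME_PROFILE_LOOP_BEGIN(1)

For a more detailed explanation, you can also refer to this PR:

To build an SDK with benchmark support, pass -DENABLE_BENCHMARK=ON.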
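
The macro named in this comment wraps the inference block so that it can be repeated and timed. The sketch here does not reproduce FastDeploy's macro. It is a standalone stand-in that shows the general pattern of untimed warm-up runs followed by timed repeats, which is what the "Repeats: 200, Warmup: 50" line in the later log suggests. `AverageLatencyMs` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `infer` `warmup` times without timing, then
// `repeats` times under a steady clock, and returns the mean latency in
// milliseconds per timed run.
double AverageLatencyMs(const std::function<void()>& infer, int warmup,
                        int repeats) {
  assert(warmup >= 0 && repeats > 0);
  for (int i = 0; i < warmup; ++i) {
    infer();  // Warm-up runs are executed but excluded from the timing.
  }
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < repeats; ++i) {
    infer();
  }
  const auto end = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = end - start;
  return elapsed.count() / static_cast<double>(repeats);
}
```

A monotonic clock such as `std::chrono::steady_clock` is used so that system clock adjustments cannot distort the measurement.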

@thunder95
Contributor Author

@DefTruth The benchmark now runs successfully on the sophgo device:
[INFO] fastdeploy/vision/common/processors/transform.cc(159)::FuseNormalizeColorConvert BGR2RGB and Normalize are fused to Normalize with swap_rb=1
[WARNING] fastdeploy/fastdeploy_model.cc(48)::IsSupported In benchmark mode, we don't check to see if the backend [Backend::SOPHGOTPU] is supported for current model!
[BMRT][bmcpu_setup:349] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init
[BMRT][load_bmodel:1027] INFO:Loading bmodel from [ppyoloe_r_crn_s_3x_dota//ppyoloe_r_crn_s_3x_dota_1684x_f32.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:991] INFO:pre net num: 0, load net num: 1
[INFO] fastdeploy/runtime/runtime.cc(347)::CreateSophgoNPUBackend Runtime initialized with Backend::SOPHGO in Device::SOPHGOTPUD.
[INFO] fastdeploy/runtime/backends/sophgo/sophgo_backend.cc(202)::Infer Running profiling for Runtime without H2D and D2H, Repeats: 200, Warmup: 50
Runtime(ms): 70.2644ms.
Visualized result saved in ./vis_result.jpg

@DefTruth
Collaborator

LGTM~

@@ -55,6 +58,12 @@ class FASTDEPLOY_DECL PaddleDetPostprocessor {
/// only available for those model exported without box decoding and nms.
void ApplyNMS() { with_nms_ = false; }

/// If you do not want to modify the Yaml configuration file,
/// you can use this function to set rotated NMS parameters.
void SetNMSRotatedOption(const NMSRotatedOption& option) {
Collaborator

This interface needs to be bound in ppdet_pybind so that it can be called from Python. Likewise, NMSRotatedOption also needs a pybind binding.

Contributor Author

@DefTruth Done.

@DefTruth
Collaborator

DefTruth commented Apr 14, 2023

@thunder95 The CI checks all appear to have failed. Please check the pybind logic. It looks like a namespace is not being resolved correctly.

(image)

@thunder95 thunder95 requested a review from DefTruth April 14, 2023 10:28
@thunder95
Contributor Author

@DefTruth The changes have been made.

@@ -272,6 +323,10 @@ bool PaddleDetPostprocessor::Run(const std::vector<FDTensor>& tensors,
// The fourth output of solov2 is mask
return ProcessMask(tensors[3], results);
} else {
if (tensors[0].Shape()[2] == 8) { // PPYOLOER
Collaborator

Is this shape guaranteed to have 3 dimensions? The output tensors of models such as picodet usually have 2 dimensions, in which case index 2 would read a random value.
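
The concern raised here is that indexing position 2 of a shape with only two entries reads out of bounds. A minimal sketch of a guarded check follows. `IsRotatedBoxOutput` is a hypothetical helper, and representing the shape as `std::vector<std::int64_t>` is an assumption about what `Shape()` returns.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: true only when the shape has exactly 3 dimensions
// and the last one holds 8 values per box. The rank test runs first, so
// shape[2] is never read for a 2-D shape.
bool IsRotatedBoxOutput(const std::vector<std::int64_t>& shape) {
  return shape.size() == 3 && shape[2] == 8;
}
```

Because `&&` short-circuits, a 2-D output simply falls through to the non-rotated branch instead of triggering an out-of-range read.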

@DefTruth DefTruth merged commit 51be3fe into PaddlePaddle:develop Apr 21, 2023
@851532562

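@thunder95 Hello, has the ppyoloe_r model you used been submitted to the Sophgo model_zoo? I searched for a long time and could not find it. Could you send it to me? Email (851532562@qq.com) is fine too. Thank you very much!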

@thunder95
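Contributor Author

@851532562 Hello, I tried looking but could not find the model, and the device I had borrowed has since been returned. The conversion process is fairly simple: first export to ONNX, then convert to bmodel with the latest version of mlir. Older versions may not support some operators.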

@851532562

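Yes, following the documentation I was able to complete the model conversion, and it runs on the bm1684x. However, the results have some problems: objects that should be detected are missed, and some incorrect boxes have fairly high confidence. Have you run into this problem?
(image: 微信图片_20230728)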

@851532562

@thunder95 Could you please help take a look? Thanks!

@thunder95
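Contributor Author

@851532562 Is this a model you trained yourself, and have you compared it with the results from running in Python? When I tested the official model earlier, there were slight differences. The main cause is that the model itself loses a small amount of precision during porting. In addition, the rotated-box post-processing implementation differs: the official original implementation uses a CUDA operator (I forget its exact name), whereas I ported the numpy processing logic from deploy to fastdeploy.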
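
Rotated-box post-processing depends on an overlap measure between rotated boxes, and small implementation differences there can change which boxes survive NMS. The sketch here shows one common way to compute IoU between two convex quadrilaterals, using Sutherland-Hodgman clipping and the shoelace area formula. It is not the code from this PR, PaddleDetection, or the CUDA operator mentioned above. All names are hypothetical, and the layout of 8 values per box as four (x, y) corners is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x; double y; };
using Polygon = std::vector<Point>;

// Shoelace formula; the result is positive for counter-clockwise order.
double SignedArea(const Polygon& p) {
  double s = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i) {
    const Point& a = p[i];
    const Point& b = p[(i + 1) % p.size()];
    s += a.x * b.y - b.x * a.y;
  }
  return 0.5 * s;
}

// Cross product of (b - a) and (c - a); positive when c is left of a->b.
double Cross(const Point& a, const Point& b, const Point& c) {
  return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Sutherland-Hodgman: clips `subject` against a convex CCW `clip` polygon.
Polygon ClipConvex(Polygon subject, const Polygon& clip) {
  for (std::size_t i = 0; i < clip.size() && !subject.empty(); ++i) {
    const Point& a = clip[i];
    const Point& b = clip[(i + 1) % clip.size()];
    Polygon out;
    for (std::size_t j = 0; j < subject.size(); ++j) {
      const Point& p = subject[j];
      const Point& q = subject[(j + 1) % subject.size()];
      const double dp = Cross(a, b, p);
      const double dq = Cross(a, b, q);
      if (dp >= 0) out.push_back(p);  // p is inside or on the clip edge.
      if ((dp > 0 && dq < 0) || (dp < 0 && dq > 0)) {
        const double t = dp / (dp - dq);  // Where p->q crosses the edge.
        out.push_back({p.x + t * (q.x - p.x), p.y + t * (q.y - p.y)});
      }
    }
    subject = out;
  }
  return subject;
}

// Builds a CCW polygon from 8 values: x1, y1, x2, y2, x3, y3, x4, y4.
Polygon ToCcwPolygon(const std::array<float, 8>& box) {
  Polygon p;
  for (std::size_t i = 0; i < 4; ++i) {
    p.push_back({box[2 * i], box[2 * i + 1]});
  }
  if (SignedArea(p) < 0) std::reverse(p.begin(), p.end());
  return p;
}

// IoU of two convex quadrilaterals; returns 0 for degenerate boxes.
double RotatedIoU(const std::array<float, 8>& a,
                  const std::array<float, 8>& b) {
  const Polygon pa = ToCcwPolygon(a);
  const Polygon pb = ToCcwPolygon(b);
  const double area_a = SignedArea(pa);
  const double area_b = SignedArea(pb);
  if (area_a <= 0 || area_b <= 0) return 0.0;
  const double inter = std::abs(SignedArea(ClipConvex(pa, pb)));
  return inter / (area_a + area_b - inter);
}
```

Comparing this kind of IoU on a few box pairs from both the Python and the deployed pipeline is one way to check whether the overlap computation contributes to a mismatch.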

@851532562

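@thunder95 By "results from running in Python", do you mean running with paddledetection? I will compare them shortly. The model I am using was exported with paddledetection, not trained by me. I removed the final multiclass_nms operator when exporting, and everything else followed the fastdeploy sophgo documentation.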
