
[Hackthon_4th 177] Support PP-YOLOE-R with BM1684 #1809

Merged

DefTruth merged 22 commits into PaddlePaddle:develop from thunder95:bm1684x_yoloe_r

Apr 21, 2023

Contributor

thunder95 commented Apr 12, 2023

PR types

Model

Description

Deploy PP-YOLOE-R on the Sophgo BM1684.

thunder95 added 6 commits

April 11, 2023 00:03


first draft (9b36495)

add robx iou (9bd7acc)

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy into bm1684x_yoloe_r (dc455ab)

add benchmark for ppyoloe_r (538229c)

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy into bm1684x_yoloe_r (4abcf8c)

remove trash code (5b64737)

Contributor Author

thunder95 commented Apr 12, 2023

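@DefTruth could you please take a look and advise?
So far I have only verified accuracy and speed on GPU, and accuracy on Sophon:
labels diff, max: 0, min: 0, mean: 0
scores diff, max: 0.000332654, min: 1.07288e-06, mean: 4.85186e-05
boxes diff, max: 0.107788, min: 0, mean: 0.00350611
However, the benchmark on the device reports 0 ms.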

thunder95 mentioned this pull request

[PaddlePaddle Hackathon, 4th Edition] Task Overview PaddlePaddle/Paddle#51281

Closed

Collaborator

DefTruth commented Apr 13, 2023

LGTM~

Collaborator

DefTruth commented Apr 13, 2023

@thunder95 please provide the visualized prediction results; you can paste them directly into the PR comments.

DefTruth self-requested a review

April 13, 2023 02:32

Contributor Author

thunder95 commented Apr 13, 2023

@thunder95 please provide the visualized prediction results; you can paste them directly into the PR comments.

Contributor Author

thunder95 commented Apr 13, 2023

@thunder95 please provide the visualized prediction results; you can paste them directly into the PR comments.

@DefTruth I have posted the images.

DefTruth changed the title ~~[hackthon_4th 177] Deploy PP-YOLOE-R on Sophgo BM1684~~ [Hackthon_4th 177] Support PP-YOLOE-R with BM1684


Merge branch 'develop' into bm1684x_yoloe_r (68394f1)

DefTruth requested changes


benchmark/cpp/benchmark_ppyoloe_r.cc Outdated

+                res.emplace_back(std::stof(s.substr(pos1)));
+              }
+              void showDiffStats(const std::vector<float>& data, const std::string& title) {

Collaborator

DefTruth Apr 13, 2023

Function names should start with an uppercase letter.

Contributor Author

thunder95 Apr 13, 2023

Done.

benchmark/cpp/benchmark_ppyoloe_r.cc Outdated

+                }
+                double sum = accumulate(begin(data), end(data), 0.0);
+                double mean = sum / data.size();
+                max = *max_element(data.begin(), data.end());

Collaborator

DefTruth Apr 13, 2023

Qualify all standard-library usage with std::, and use C++-style casts (static_cast) for type conversions.

Contributor Author

thunder95 Apr 13, 2023

Done.
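
As an illustration of the style requested in this review (standard-library names qualified with `std::`, C++-style casts), here is a minimal sketch of a diff-statistics helper. `DiffStats` and `ComputeDiffStats` are hypothetical names chosen for this example; this is not the code that was merged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical summary type for this example; not part of FastDeploy.
struct DiffStats {
  float max;
  float min;
  double mean;
};

// Computes max, min and mean over a non-empty vector of differences.
DiffStats ComputeDiffStats(const std::vector<float>& data) {
  assert(!data.empty());
  // Accumulate in double (0.0 initial value) to limit rounding error.
  const double sum = std::accumulate(data.begin(), data.end(), 0.0);
  const auto minmax = std::minmax_element(data.begin(), data.end());
  DiffStats stats;
  stats.min = *minmax.first;
  stats.max = *minmax.second;
  stats.mean = sum / static_cast<double>(data.size());
  return stats;
}
```

A caller would print `stats.max`, `stats.min` and `stats.mean` in the same "max / min / mean" layout used in the diff report earlier in this thread.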

benchmark/cpp/benchmark_ppyoloe_r.cc Outdated

+                          << ", mean: " << average << std::endl;
+              }
+              void sortBoxes(vision::DetectionResult* result, std::vector<int>* indices) {

Collaborator

DefTruth Apr 13, 2023

Same as above.

Contributor Author

thunder95 Apr 13, 2023

Done.

benchmark/cpp/benchmark_ppyoloe_r.cc Outdated

+                auto params_file = FLAGS_model + sep + params_name;
+                auto config_file = FLAGS_model + sep + config_name;
+                auto model_ppyoloe_r = vision::detection::PPYOLOE_R(

Collaborator

DefTruth Apr 13, 2023

C++ class names should not use underscores as separators; rename this to PPYOLOER.

Contributor Author

thunder95 Apr 13, 2023

Done.

benchmark/cpp/benchmark_ppyoloe_r.cc Outdated

+                    model_file, params_file, config_file, option, model_format);
+                vision::DetectionResult res;
+                if (config_info["precision_compare"] == "true") {

Collaborator

DefTruth Apr 13, 2023

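This accuracy-verification logic can be removed or left empty for now; FD will add accuracy-checking logic in a unified way later. Just describe the accuracy-verification code in the PR.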

Contributor Author

thunder95 Apr 13, 2023

Done.

fastdeploy/vision/detection/ppdet/postprocessor.cc Outdated

@@ -264,6 +264,57 @@ bool PaddleDetPostprocessor::ProcessSolov2(
                 return true;
               }
+              bool PaddleDetPostprocessor::ProcessPPYOLOE_R(

Collaborator

DefTruth Apr 13, 2023

Function name: ProcessPPYOLOE_R -> ProcessPPYOLOER

Contributor Author

thunder95 Apr 13, 2023

Done.

fastdeploy/vision/detection/ppdet/postprocessor.h Outdated

@@ -91,6 +96,10 @@ class FASTDEPLOY_DECL PaddleDetPostprocessor {
                 bool ProcessSolov2(const std::vector<FDTensor>& tensors,
                                    std::vector<DetectionResult>* results);
+                // Process PPYOLOE_R

Collaborator

DefTruth Apr 13, 2023

Following the example of SetNMSOption, a SetNMSRotatedOption method needs to be added.

Contributor Author

thunder95 Apr 13, 2023

Done.

fastdeploy/vision/visualize/detection.cc Outdated

                 int h = im.rows;
                 int w = im.cols;
                 auto vis_im = im.clone();
+                for (size_t i = 0; i < result.rotated_boxes.size(); ++i) {
+                  printf("result score: %f, %f\n", result.scores[i], score_threshold);

Collaborator

DefTruth Apr 13, 2023

Remove this log statement.

Contributor Author

thunder95 Apr 13, 2023

Done.

fastdeploy/vision/visualize/detection.cc

                 int h = im.rows;
                 int w = im.cols;
                 auto vis_im = im.clone();
+                for (size_t i = 0; i < result.rotated_boxes.size(); ++i) {

Collaborator

DefTruth Apr 13, 2023

This logic looks identical to the block added further down. Is there any difference in what the two handle?

Contributor Author

thunder95 Apr 13, 2023

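@DefTruth The logic is the same. Following the existing handling of boxes, I simply copied it into two different functions. The function below additionally accepts custom labels via `const std::vector<std::string>& labels`.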

Collaborator

DefTruth Apr 14, 2023

OK.

python/fastdeploy/vision/detection/ppdet/__init__.py Outdated

@@ -781,3 +781,40 @@ def __init__(self,
                                                            config_file, self._runtime_option,
                                                            model_format)
                       assert self.initialized, "GFL model initialize failed."
+              class PPYOLOE_R(PPYOLOE):

Collaborator

DefTruth Apr 13, 2023

Class name: PPYOLOE_R -> PPYOLOER

Contributor Author

thunder95 Apr 13, 2023

Done.

Collaborator

DefTruth commented Apr 13, 2023

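The benchmark time of 0 is because FD does not yet support the benchmark feature for the Sophgo backend. You can follow the approach used in the Paddle Lite backend: add the inference benchmark macros and wrap the inference block with them.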

FastDeploy/fastdeploy/runtime/backends/lite/lite_backend.cc

Line 202 in b30f62a

RUNTIME_PROFILE_LOOP_BEGIN(1)

For a more detailed explanation, see this PR:

[Backend] support bechmark mode for runtime and backend #1201

Building an SDK with benchmark support requires -DENABLE_BENCHMARK=ON.
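
To illustrate the general idea of wrapping an inference call in a profiling loop, here is a small self-contained sketch using `std::chrono`. It is not the FastDeploy macro implementation; `MeanRuntimeMs`, `warmup` and `repeats` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `fn` `warmup` times without timing, then `repeats` times with timing,
// and returns the mean wall-clock time per timed call in milliseconds.
double MeanRuntimeMs(const std::function<void()>& fn, int warmup, int repeats) {
  assert(repeats > 0);
  for (int i = 0; i < warmup; ++i) {
    fn();  // Warm-up iterations are excluded from the measurement.
  }
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < repeats; ++i) {
    fn();
  }
  const auto end = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = end - start;
  return elapsed.count() / static_cast<double>(repeats);
}
```

A backend that never runs such a timed loop has nothing to report, which is consistent with the 0 ms figure discussed in this comment.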

Contributor Author

thunder95 commented Apr 13, 2023

@DefTruth The benchmark now runs successfully on the Sophgo device:
[INFO] fastdeploy/vision/common/processors/transform.cc(159)::FuseNormalizeColorConvert BGR2RGB and Normalize are fused to Normalize with swap_rb=1
[WARNING] fastdeploy/fastdeploy_model.cc(48)::IsSupported In benchmark mode, we don't check to see if the backend [Backend::SOPHGOTPU] is supported for current model!
[BMRT][bmcpu_setup:349] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init
[BMRT][load_bmodel:1027] INFO:Loading bmodel from [ppyoloe_r_crn_s_3x_dota//ppyoloe_r_crn_s_3x_dota_1684x_f32.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:991] INFO:pre net num: 0, load net num: 1
[INFO] fastdeploy/runtime/runtime.cc(347)::CreateSophgoNPUBackend Runtime initialized with Backend::SOPHGO in Device::SOPHGOTPUD.
[INFO] fastdeploy/runtime/backends/sophgo/sophgo_backend.cc(202)::Infer Running profiling for Runtime without H2D and D2H, Repeats: 200, Warmup: 50
Runtime(ms): 70.2644ms.
Visualized result saved in ./vis_result.jpg

thunder95 and others added 3 commits

April 13, 2023 23:49


fix bugs (631762f)

Merge branch 'bm1684x_yoloe_r' of https://github.com/thunder95/FastDeploy into bm1684x_yoloe_r (8c3a35c)

Merge branch 'develop' into bm1684x_yoloe_r (907e302)

Collaborator

DefTruth commented Apr 14, 2023

LGTM~

DefTruth requested changes


fastdeploy/vision/detection/ppdet/postprocessor.h

@@ -55,6 +58,12 @@ class FASTDEPLOY_DECL PaddleDetPostprocessor {
                 /// only available for those model exported without box decoding and nms.
                 void ApplyNMS() { with_nms_ = false; }
+                /// If you do not want to modify the Yaml configuration file,
+                /// you can use this function to set rotated NMS parameters.
+                void SetNMSRotatedOption(const NMSRotatedOption& option) {

Collaborator

DefTruth Apr 14, 2023

This interface needs to be bound in ppdet_pybind so that it can be called from Python. Likewise, NMSRotatedOption also needs a pybind binding.

Contributor Author

thunder95 Apr 14, 2023

@DefTruth Done.

thunder95 added 2 commits

April 14, 2023 13:37


add pybind nms rotated option (fdb72b6)

Merge branch 'bm1684x_yoloe_r' of https://github.com/thunder95/FastDeploy into bm1684x_yoloe_r (354e75c)

Collaborator

DefTruth commented Apr 14, 2023

@thunder95 All the CI jobs seem to have failed. Please check the pybind logic; it looks like a namespace is not being resolved correctly.

thunder95 and others added 4 commits

April 14, 2023 15:29


add missing head file (a22a338)

Merge branch 'develop' into bm1684x_yoloe_r (5d535a6)

fix bug (2b25e4c)

Merge branch 'bm1684x_yoloe_r' of https://github.com/thunder95/FastDeploy into bm1684x_yoloe_r (890176b)

fix bug2

thunder95 requested a review from DefTruth

April 14, 2023 10:28

Contributor Author

thunder95 commented Apr 14, 2023

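@thunder95 All the CI jobs seem to have failed. Please check the pybind logic; it looks like a namespace is not being resolved correctly.

@DefTruth The fix is done.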

DefTruth added 2 commits

April 19, 2023 16:19


Merge branch 'develop' into bm1684x_yoloe_r (a3760d4)

Merge branch 'develop' into bm1684x_yoloe_r

DefTruth approved these changes


DefTruth requested changes


fastdeploy/vision/detection/ppdet/postprocessor.cc Outdated

                   // The fourth output of solov2 is mask
                   return ProcessMask(tensors[3], results);
                 } else {
+                  if (tensors[0].Shape()[2] == 8) {  // PPYOLOER

Collaborator

DefTruth Apr 20, 2023

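Is this shape guaranteed to be 3-dimensional? The output tensors of models such as PicoDet are usually 2-dimensional, so index 2 would read an arbitrary value.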
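
One way to guard against the problem raised here is to check the tensor rank before indexing into the shape. The sketch below assumes the shape is available as a `std::vector<int64_t>`; `LooksLikeRotatedBoxOutput` is a hypothetical helper for illustration, and the fix actually committed in this PR may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true only for a rank-3 shape whose last dimension is 8.
// The rank is tested first, so shape[2] is never read out of bounds.
bool LooksLikeRotatedBoxOutput(const std::vector<int64_t>& shape) {
  return shape.size() == 3 && shape[2] == 8;
}
```

With this check, a 2-dimensional output shape falls through to the regular detection path instead of triggering an out-of-range read.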

thunder95 and others added 2 commits

April 20, 2023 11:30


fix shape bug (3766d7f)

Merge branch 'develop' into bm1684x_yoloe_r (81961c8)

DefTruth approved these changes



Merge branch 'develop' into bm1684x_yoloe_r (e28bd10)

DefTruth merged commit 51be3fe into PaddlePaddle:develop

851532562 commented Jul 27, 2023

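@thunder95 Hello, has the ppyoloe_r model you used been submitted to the Sophgo model_zoo? I searched for a long time and could not find it. Could you send it to me? Email (851532562@qq.com) works too. Thank you very much!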

Contributor Author

thunder95 commented Jul 27, 2023

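@851532562 Hello, I looked but could not find the model, and I have since returned the borrowed device. The conversion itself is fairly simple: export to onnx first, then convert to bmodel with the latest version of mlir. Older versions may hit some unsupported operators.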

851532562 commented Jul 28, 2023

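@851532562 Hello, I looked but could not find the model, and I have since returned the borrowed device. The conversion itself is fairly simple: export to onnx first, then convert to bmodel with the latest version of mlir. Older versions may hit some unsupported operators.

Yes, by following the documentation I can complete the model conversion, and it runs on the bm1684x, but the results have some problems: objects that should be detected are missed, and some incorrect boxes get fairly high confidence. Have you run into this?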

851532562 commented Jul 28, 2023

@thunder95 Could you please take a look? Thanks!

Contributor Author

thunder95 commented Jul 28, 2023

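@851532562 Is this a model you trained yourself? Have you compared it against the results from running in python? When I tested the official model there were slight differences. The main cause is that the model itself loses a little precision when it is ported. In addition, the rotated-box post-processing is implemented differently: the original official implementation uses a cuda operator (I forget its exact name), whereas I ported the numpy-based logic in deploy to fastdeploy.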
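
For readers unfamiliar with rotated-box post-processing, one basic building block is the area of a box stored as four corner points, which rotated IoU computations typically rely on. The sketch below assumes each box is eight floats `(x1, y1, ..., x4, y4)` with the vertices listed in order around the box; `QuadArea` is a hypothetical helper and is not taken from FastDeploy or PaddleDetection.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Area of a quadrilateral given as (x1, y1, ..., x4, y4), vertices in order,
// computed with the shoelace formula. Works for either winding direction.
float QuadArea(const std::array<float, 8>& box) {
  float twice_area = 0.0f;
  for (std::size_t i = 0; i < 4; ++i) {
    const std::size_t j = (i + 1) % 4;  // Next vertex, wrapping around.
    twice_area += box[2 * i] * box[2 * j + 1] - box[2 * j] * box[2 * i + 1];
  }
  return std::fabs(twice_area) * 0.5f;
}
```

Small differences in how such geometric steps are implemented (for example float versus double accumulation) are one plausible source of the slight numeric deviations mentioned in this comment.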

851532562 commented Jul 28, 2023

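@thunder95 By the python results, do you mean running with paddledetection? I will compare them shortly. The model I am using was exported with paddledetection, not trained by me. The final multiclass_nms operator was removed at export time; everything else followed the fastdeploy sophgo documentation.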
