Inputs format misunderstood. (prepare_input) (AIV-743) #193
Open · 3 tasks done
nicklasb opened this issue Jan 12, 2025 · 8 comments

Comments

@nicklasb

nicklasb commented Jan 12, 2025

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate.
  • Provided a clear description of your suggestion.
  • Included any relevant context or examples.

Issue or Suggestion Description

I am getting an error when quantizing an ONNX model (a working one, at least I can run inference with it successfully in PaddleDetection).
It is a PaddleDetection model that has been exported to ONNX using paddle2onnx:
PicoDet:
backbone: LCNet
neck: LCPAN
head: PicoHeadV2

..config, basically the pedestrian_detect model (if I understood that lineage properly), but trained on other data.
The error occurs when I run a variant of quantize_torch_model.py that basically only loads different images; the rest is the same.

[ESP-PPQ ASCII-art banner]


load imagenet calibration dataset from directory: C:/somepath/PaddleDetection/dataset/sea/images/val
[00:07:20] PPQ Layerwise Equalization Pass Running ... 2 equalization pair(s) was found, ready to run optimization.
Layerwise Equalization:   0%|          | 0/4 [00:00<?, ?it/s][Conv.24(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.28(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.24(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.28(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.24(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.28(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.24(Type: Conv, Num of Input: 3, Num of Output: 1)]
[Conv.28(Type: Conv, Num of Input: 3, Num of Output: 1)]
Layerwise Equalization: 100%|██████████| 4/4 [00:00<00:00, 53.86it/s]
Finished.
Traceback (most recent call last):
 File "C:\somepath\esp-dl\tools\quantization\quantize_custom_onnx_model.py", line 144, in <module>
   quant_ppq_graph = espdl_quantize_onnx(
                     ^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\core\defs.py", line 54, in _wrapper
   return func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\api\espdl_interface.py", line 223, in espdl_quantize_onnx
   ppq_graph = quantize_onnx_model(
               ^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\core\defs.py", line 54, in _wrapper
   return func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\api\interface.py", line 263, in quantize_onnx_model
   quantizer.quantize(
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\core\defs.py", line 54, in _wrapper
   return func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\quantization\quantizer\base.py", line 52, in quantize
   executor.tracing_operation_meta(inputs=inputs)
 File "somepath\esp-dl\.venv\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
   return func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\core\defs.py", line 54, in _wrapper
   return func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\executor\torch.py", line 616, in tracing_operation_meta
   self.__forward(
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\executor\torch.py", line 474, in __forward
   inputs = self.prepare_input(inputs=inputs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "somepath\esp-dl\.venv\Lib\site-packages\ppq\executor\base.py", line 140, in prepare_input
   assert len(inputs_dictionary) == len(inputs), \
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Inputs format misunderstood. Given inputs has 1 elements, while graph needs 2

inputs:
[tensor([[[[-2.0644,  0.4211,  1.5156,  ...,  0.7785,  1.0445, -0.3174],
         [ 1.0306,  1.6350, -0.0849,  ..., -0.2133, -0.3099, -0.8846],
         [ 1.4076,  1.2802, -0.5059,  ..., -1.2710,  0.1182, -0.9397],
         ...,
         [-0.7491, -1.5121,  1.2965,  ...,  0.5195,  0.7199, -1.5996],
         [-1.9851, -1.9259,  0.6562,  ..., -2.1439,  1.0245,  0.7217],
         [-1.8503, -2.2407,  0.0850,  ...,  0.4948,  2.0727, -2.2183]],

        [[ 1.4071, -1.5879, -0.7559,  ..., -0.9110,  0.4718, -0.3414],
         [ 0.3396, -0.7012,  0.7558,  ..., -0.4906, -0.4403, -0.2816],
         [-1.4492,  1.1551,  0.9532,  ..., -0.9253, -2.3705,  0.6899],
         ...,
         [ 1.6252,  0.1358,  0.2999,  ...,  0.3089,  0.7925,  1.4319],
         [-0.2310, -0.5723, -0.6862,  ..., -1.0324,  0.8473, -0.4547],
         [-0.1850,  0.2352,  0.4126,  ...,  0.7649, -1.9019, -0.2216]],

        [[ 0.2727,  1.5507,  0.2306,  ..., -0.2771, -0.8666, -0.0491],
         [-0.8079,  1.2834, -0.1030,  ...,  0.2184, -0.5272,  0.8095],
         [ 0.2811,  0.6141,  0.6307,  ..., -0.0618, -0.3692,  0.3364],
         ...,
         [-0.5358, -2.4136,  0.8725,  ..., -0.8501, -1.2791, -0.3178],
         [ 0.4313, -1.3117,  0.6791,  ...,  0.1524,  0.6022,  1.5905],
         [ 0.2910,  0.2658, -0.4634,  ..., -0.4612, -0.0362,  0.8774]]]])]
inputs_dictionary:
{'image': image, 'scale_factor': scale_factor}

I printed the contents of inputs and inputs_dictionary because it basically looks like the completely wrong data is ending up in the wrong place.

@github-actions github-actions bot changed the title Inputs format misunderstood. (prepare_input) Inputs format misunderstood. (prepare_input) (AIV-743) Jan 12, 2025
@100312dog
Contributor

@nicklasb Can you share the onnx model file?

@BlueSkyB
Collaborator

@nicklasb
Judging from your log output, it seems that the number of input shapes you passed in does not match the number of graph inputs in the ONNX model, which is causing the issue. You can visualize the graph inputs of the ONNX model through Netron. If the model has multiple inputs, when calling the espdl_quantize_onnx interface, input_shape should be a list containing multiple lists, with each list corresponding to the shape of one input. It is important to note that for the shape of the feature map, the batch dimension should be set to 1.
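For illustration, a minimal sketch of such a call for the two-input PicoDet export ("image" and "scale_factor") could look like the following. The shapes, file names, and calibration settings are assumptions for a 320x320 model, and the keyword arguments mirror the ones used in the esp-dl quantization scripts; verify the real input names and shapes in Netron, and check the argument names against your local quantize_torch_model.py.

```python
# Sketch only: quantizing a two-input PicoDet ONNX export with espdl_quantize_onnx.
# For a multi-input model, input_shape must be a list of per-input shape lists,
# each with the batch dimension set to 1. All shapes and paths below are assumptions.
from ppq.api import espdl_quantize_onnx

input_shape = [
    [1, 3, 320, 320],  # "image": batch fixed to 1
    [1, 2],            # "scale_factor"
]

quant_ppq_graph = espdl_quantize_onnx(
    onnx_import_file="picodet_s_320_coco.onnx",
    espdl_export_file="picodet_s_320_coco.espdl",
    calib_dataloader=calib_dataloader,  # the DataLoader your quantize script already builds
    calib_steps=32,
    input_shape=input_shape,
    target="esp32p4",                   # or "esp32s3"
    num_of_bits=8,
    device="cpu",
)
```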

@nicklasb
Author

So I think I have fixed the input shape; however, I now encounter another issue: ReduceMin is not implemented in ESP-DL.
Is this something that might be implemented soon?

@nicklasb
Author

Or maybe there is an easier way: how did ESP-DL implement the pedestrian_detect model?
Basically, all I want to do is customize that.

@100312dog
Contributor

@nicklasb The model we use looks like the one in onnx.zip, without the last Sqrt op. When you export the model to ONNX, you should remove the unnecessary parts, such as NMS.

@nicklasb
Author

nicklasb commented Jan 17, 2025

@100312dog OK. I'm unsure what you mean; I suppose it was some Paddle project that generated the model (the name looks like it)? And I'm not sure what you mean by "the unnecessary part". How do I remove it?

@100312dog
Contributor

100312dog commented Jan 20, 2025

@nicklasb

python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
              -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams export.post_process=False \
              --output_dir=output_inference

If you are using the official PaddleDetection project, use this command to export the model.
Use Netron to visualize the ONNX graph.
If you don't add the export.post_process option, the model looks like this:

Image
Image
There are two problems with this model.
The first problem is that the batch size is dynamic, but the batch size is expected to be 1 in ESP-DL.
The second problem is that the operators after the Transpose and Sqrt, including those ops themselves, can be processed outside the model in C code. That is much more efficient, and it also avoids unsupported operators such as NMS (non-maximum suppression), which most inference frameworks do not support.
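If the exported graph still shows a dynamic batch dimension, one way to pin it to 1 is the onnx Python API, as in this minimal sketch (the file names are placeholders, and it assumes the batch dimension is the first one on every graph input):

```python
# Sketch: pin a dynamic batch dimension to 1 using the onnx Python API.
import onnx

model = onnx.load("picodet_s_320_coco.onnx")
for graph_input in model.graph.input:
    dim0 = graph_input.type.tensor_type.shape.dim[0]
    if dim0.dim_param or dim0.dim_value <= 0:  # symbolic or unset batch dim
        dim0.ClearField("dim_param")
        dim0.dim_value = 1

onnx.checker.check_model(model)
onnx.save(model, "picodet_s_320_coco_bs1.onnx")
```

Running onnxsim afterwards (as in the commands below) lets the fixed shape propagate through the rest of the graph.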

After exporting the model using the above command, run these commands to convert it to ONNX.

paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
            --model_filename model.pdmodel  \
            --params_filename model.pdiparams \
            --opset_version 11 \
            --save_file picodet_s_320_coco.onnx

onnxsim picodet_s_320_coco.onnx picodet_s_320_coco_sim.onnx

Then the model looks like this:

Image
If you want to use the post-processing code
https://github.com/espressif/esp-dl/blob/master/esp-dl/vision/detect/dl_detect_pico_postprocessor.cpp
remove the boxed ops in the picture. Use the onnx.utils.extract_model API to extract the subgraph without those ops (see the sketch below). Then, at
https://github.com/espressif/esp-dl/blob/master/esp-dl/vision/detect/dl_detect_pico_postprocessor.cpp#L68
replace the output names with the ones from your model.
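A minimal sketch of that extraction step might look like this; the output tensor names are hypothetical placeholders, so read the real names of the tensors feeding the removed Transpose/Sqrt/box ops in Netron, and keep whatever input name your exported model actually uses:

```python
# Sketch: cut the graph before the post-processing ops with onnx.utils.extract_model.
# "image" and the output names below are placeholders; take the real tensor
# names from Netron and keep them in the order the postprocessor expects.
import onnx

onnx.utils.extract_model(
    input_path="picodet_s_320_coco_sim.onnx",
    output_path="picodet_s_320_coco_cut.onnx",
    input_names=["image"],
    output_names=["cls_score_out", "bbox_pred_out"],
)
```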

@nicklasb
Author

@100312dog This is great information, thank you! This should solve it for me!
