PostProcessCuda is very slow using my model #71
Have you found a solution to this? I'm having a similar issue using my own model with one class, except it just gets stuck on inference (after the find pillar_num line). I've also noticed one of the cores on the Xavier is being maxed out while this is happening.
Unfortunately I have no solution yet :(. If you find any lead, please let me know!
Hello,
@GuillaumeAnoufa I am experiencing the same issue. I suspect it is related to this line, as changing the values will still seemingly build the model correctly without errors: CUDA-PointPillars/tool/simplifier_onnx.py (line 29 in 092affc)
I am also using a single-class detector, but with a different pointcloud range and voxel size. I am going to train the model with 3 classes to verify whether this is an issue with the number of classes or with the pointcloud range.
@byte-deve Hi, do you know what each of these numbers ('496' and '432') is a product of?
Hi, I realised these are the dimensions of the feature grid.
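For anyone following along: 496 and 432 are the Y and X dimensions of the BEV feature grid, i.e. the pointcloud range extent divided by the voxel size. A quick sanity check, assuming the default KITTI pointpillar config values:

```python
def grid_dims(pc_range, voxel_size):
    """BEV feature-grid size: (range extent) / (voxel size) per axis."""
    x_min, y_min, _, x_max, y_max, _ = pc_range
    vx, vy, _ = voxel_size
    return round((x_max - x_min) / vx), round((y_max - y_min) / vy)

# Default KITTI values: POINT_CLOUD_RANGE and VOXEL_SIZE from the pointpillar config
nx, ny = grid_dims([0, -39.68, -3, 69.12, 39.68, 1], [0.16, 0.16, 4])
print(nx, ny)  # 432 496 — the two numbers hard-coded in simplifier_onnx.py
```

If you change the range or voxel size in your config, the hard-coded 496/432 in the exporter no longer match and the scatter step writes to the wrong grid.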
@GuillaumeAnoufa I now have this working with a fully custom model; if you still need support you can @ me :)
@rjwb1 hey, I'm having the same issues with setting up a custom model, would really appreciate some guidance :) My config (abridged):

    ################## MODEL CONFIG #####################
    MODEL:
        ...
    OPTIMIZATION:
        ...
    ################# DATASET CONFIG ###################
    DATA_SPLIT: { ... }
    INFO_PATH: { ... }
    TRAINING_CATEGORIES: { ... }
    FOV_POINTS_ONLY: False
    DATA_AUGMENTOR:
        ...
    POINT_FEATURE_ENCODING: { ... }
    DATA_PROCESSOR:
        ...
@mazm0002 hi there, does the model train successfully and work in PyTorch? What stage of the process are you having trouble with?
@rjwb1 Yea, so I can train successfully and get the outputs I expect. Then I use the ONNX exporter tool to convert the model and run it with the demo, feeding it custom test data (which works fine in PyTorch). The TensorRT engine generates fine, but when it actually runs detections they take a long time to process, and there are way too many bounding boxes, most of them incorrect. I think the issue is probably in the ONNX conversion; could you let me know what you had to change in the tool to get it working for 1 class and a custom data/model config? Thanks a lot for the help!
@mazm0002 Hi, I too experienced this, and it was due to some hard-coded parameters inside the exporter. I also used this useful tool to inspect my generated ONNX file and ensure it was similar to the default one: https://github.com/lutzroeder/netron. Can you show me what values you have here, or is it default? CUDA-PointPillars/tool/simplifier_onnx.py (lines 29 to 45 in 092affc)
I think maybe with your model it should look like this?
The size of the scatter plugin array should be equal to the dimensions of the voxel grid.
I will open a PR to parameterise these values properly :)
As you are using additional pointcloud attributes (5 instead of 4), this may require further parameter changes.
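Until such a PR lands, here is a hedged sketch of what "parameterising properly" could look like: derive every number the exporter currently hard-codes from the training config instead. The function and key names below are illustrative only, not the repo's actual API:

```python
def exporter_params(pc_range, voxel_size, num_point_features, num_classes):
    """Collect the values simplifier_onnx.py currently hard-codes.

    Illustrative sketch: in the real exporter these numbers are embedded
    directly into the rewritten ONNX graph rather than gathered in a dict.
    """
    nx = round((pc_range[3] - pc_range[0]) / voxel_size[0])  # X grid cells
    ny = round((pc_range[4] - pc_range[1]) / voxel_size[1])  # Y grid cells
    return {
        "scatter_shape": (ny, nx),                 # dense BEV grid for the scatter plugin
        "num_point_features": num_point_features,  # 4 by default, 5 with an extra attribute
        "num_classes": num_classes,
    }

# Single-class model with 5 point features, default range and voxel size
params = exporter_params([0, -39.68, -3, 69.12, 39.68, 1], [0.16, 0.16, 4], 5, 1)
print(params["scatter_shape"])  # (496, 432)
```

The point is simply that all of the magic numbers in the exporter are functions of the config, so a mismatched range, voxel size, feature count, or class count silently corrupts the graph.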
Hello @rjwb1, thanks for your input! My config only has a few changes from the default config. My exported model shape seems accurate, but I am still experiencing these very long post-processing times.
Hmmm, I am also using a single class... would you mind sending a copy of your cfg file? I will see if I can reproduce this.
Sure: pointpillar2.txt
@GuillaumeAnoufa looks almost identical to mine. Strange... I should mention I also have my score threshold set to 0.4 and my NMS threshold to 0.1 in my Params.h. Could this reduce post-processing latency?
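For context on why the score threshold matters here: NMS compares surviving boxes pairwise, so its cost grows roughly quadratically with the number of candidates that clear the threshold. A back-of-the-envelope illustration in pure Python (not the repo's CUDA kernel):

```python
def nms_pair_count(n):
    """Worst-case pairwise IoU comparisons NMS performs on n candidate boxes."""
    return n * (n - 1) // 2

# 4158 boxes, as in the timing log posted later in this thread,
# vs. a few hundred survivors after a stricter score threshold
print(nms_pair_count(4158))  # 8642403
print(nms_pair_count(300))   # 44850
```

So a misconfigured model that emits thousands of spurious boxes can easily turn post-processing into the dominant cost, independent of inference time.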
@rjwb1 It doesn't seem to change anything. I tried exporting the default "pointpillar_7728.pth" model with the default config, just reducing the number of classes from 3 to 1, and experience the same issue on the default data (load file: ../data/data_velo/000001.bin). Changing the number of classes in the config file results in an abnormally high number of predicted bounding boxes.
@rjwb1 If you try exporting the default model with this config file (the default one but with a single class): pointpillar_1class.txt, and infer on the default velodyne data, do you experience slow post-processing?
I forgot to copy the generated param.h and recompile after changing the model...
@GuillaumeAnoufa no worries, glad you found the solution 👍🏼
Hi, thanks for your work. I changed the files according to your PR #77, moved params.h, and also recompiled.
Could you tell me how you solved your problem?
I can export my custom model to ONNX, but the result seems incorrect. Can you give me some advice?
@rjwb1
Did you solve your problem? I also get incorrect results when I change the voxel size.
System:
Ubuntu 20.04
Latest version of OpenPCDet
GPU has cuda devices: 1
----device id: 0 info----
GPU : NVIDIA GeForce RTX 2080 with Max-Q Design
Capability: 7.5
Global memory: 7982MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)
Hello,
I exported my pointpillar weights trained on custom data. The only parameter change compared to the example model is that it uses 1 class instead of 3.
I had to change a few things in tool/simplifier_onnx.py for the exporter to work with anything other than 3 classes:
Code changes to work with 1 class
I changed the signature of simplify_postprocess(onnx_model) to simplify_postprocess(onnx_model, num_classes) and changed 3 other lines.
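For reference, the class-dependent values in those lines follow from the anchor layout. Assuming (as in the default model) 2 rotation anchors per class and a 7-value box encoding, the head's per-location output channels can be computed like this (a sketch for illustration, not the repo's code):

```python
def head_channels(num_classes, anchors_per_class=2, box_code_size=7):
    """Per-location output channels of the PointPillars detection head."""
    num_anchors = num_classes * anchors_per_class
    return {
        "cls": num_anchors * num_classes,    # class scores
        "box": num_anchors * box_code_size,  # box regression targets
        "dir": num_anchors * 2,              # direction-classifier bins
    }

print(head_channels(3))  # {'cls': 18, 'box': 42, 'dir': 12}  (default 3-class model)
print(head_channels(1))  # {'cls': 2, 'box': 14, 'dir': 4}
```

If the exported graph keeps the 3-class shapes while the runtime Params.h expects 1 class (or vice versa), the score tensor is misread and nearly every anchor clears the threshold, which matches the symptoms below.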
The exporter works, but when testing the demo with this model:
---- RUN TIME ----
load file: ../data/data_velo/000001.bin
find points num: 18630
find pillar_num: 6815
TIME: generateVoxels: 0.038048 ms.
TIME: generateFeatures: 0.053024 ms.
TIME: doinfer: 30.2525 ms.
TIME: doPostprocessCuda: 57528.1 ms.
TIME: pointpillar: 57558.6 ms.
Bndbox objs: 4158
Saved prediction in: ../eval/kitti/object/pred_velo/000001.txt
This model works perfectly fine in pytorch.
As you can see, the post-processing step takes a long time and outputs thousands of bounding boxes.
Issue #43 references a similar problem, seemingly solved by an update, but I am currently using the most up-to-date version of this repo.
Do you have an idea what could cause this issue?
I can upload my .pth file or my onnx file if you want to try and reproduce this.
Best regards,